+#1
(defun mysl (s)
(declare (simple-string s))
(declare (optimize (speed 3) (safety 0) (debug 0)))
* On X86 I is represented as a tagged integer.
-* EQL uses "CMP reg,reg" instead of "CMP reg,im". This causes
- allocation of an extra register and an extra move.
-
* Unnecessary move:
3: SLOT S!11[EDX] {SB-C::VECTOR-LENGTH 1 7} => t23[EAX]
4: MOVE t23[EAX] => t24[EBX]
--------------------------------------------------------------------------------
+#2
(defun quux (v)
(declare (optimize (speed 3) (safety 0) (space 2) (debug 0)))
(declare (type (simple-array double-float 1) v))
and emits costy MOVE ... => FR1.
--------------------------------------------------------------------------------
+#3
(defun bar (n)
(declare (optimize (speed 3) (safety 0) (space 2))
(type fixnum n))
* IR1 does not optimize away (MAKE-LIST N).
--------------------------------------------------------------------------------
+#4
(defun bar (v1 v2)
(declare (optimize (speed 3) (safety 0) (space 2))
(type (simple-array base-char 1) v1 v2))
* And why two moves?
--------------------------------------------------------------------------------
-(loop repeat 1.5)
-
-uses generic arithmetic
---------------------------------------------------------------------------------
+#6
09:49:05 <jtra> I have found a case in those where suboptimal code is
generate with nested loops, it might be moderately easy to fix that
09:49:28 <jtra> see
memory location for iteration variable
;;; -*- mode: lisp -*-
-;;; $Id$
;;; http://www.bagley.org/~doug/shootout/
;;; from Friedrich Dominicus
(incf x)))))))
(format t "~A~%" x)))
--------------------------------------------------------------------------------
-(defun foo (x)
- (declare (optimize speed (debug 0)))
- (if (< x 0) x (foo (1- x))))
-
-SBCL generates a full call of FOO (but CMUCL does not).
---------------------------------------------------------------------------------
+#8
(defun foo (d)
(declare (optimize (speed 3) (safety 0) (debug 0)))
(declare (type (double-float 0d0 1d0) d))
(loop for i fixnum from 1 to 5
- for x1 double-float = (sin d) ;;; !!!
- do (loop for j fixnum from 1 to 4
- sum x1 double-float)))
+ for x1 double-float = (sin d) ;;; !!!
+ do (loop for j fixnum from 1 to 4
+ sum x1 double-float)))
Without the marked declaration Python will use boxed representation for X1.
DOUBLE-FLOAT. Unhopefully, IR1 does not optimize away effectless
SETs/bindings, and IR2 does not perform type inference.
--------------------------------------------------------------------------------
+#9 "Multi-path constant folding"
(defun foo (x)
(if (= (cond ((irgh x) 0)
((buh x) 1)
((buh x) :no)
(t :no)))
--------------------------------------------------------------------------------
+#11
+(inverted variant of #9)
+
+(lambda (x)
+ (let ((y (sap-alien x c-string)))
+ (list (alien-sap y)
+ (alien-sap y))))
+
+It could be optimized to
+
+(lambda (x) (list x x))
+
+(if Y were used only once, the current compiler would optimize it)
+--------------------------------------------------------------------------------
+#12
+(typep (truly-the (simple-array * (*)) x) 'simple-vector)
+
+tests lowtag.
+--------------------------------------------------------------------------------
+#13
+FAST-+/FIXNUM and similar should accept unboxed arguments in interests
+of representation selection. Problem: inter-TN dependencies.
+--------------------------------------------------------------------------------
+#14
+The derived type of (/ (THE (DOUBLE-FLOAT (0D0)) X) (THE (DOUBLE-FLOAT
+1D0) Y)) is (DOUBLE-FLOAT 0.0d0). While it might be reasonable, it is
+better to derive (OR (MEMBER 0.0d0) (DOUBLE-FLOAT (0.0d0))).
+--------------------------------------------------------------------------------
+#15
+On the alpha, the system is reluctant to refer directly to a constant bignum,
+preferring to load a large constant through a slow sequence of instructions,
+then cons up a bignum for it:
+
+(LAMBDA (A)
+ (DECLARE (OPTIMIZE (SAFETY 1) (SPEED 3) (DEBUG 1))
+ (TYPE (INTEGER -10000 10000) A)
+ (IGNORABLE A))
+ (CASE A
+ ((89 125 16) (ASH A (MIN 18 -706)))
+ (T (DPB -3 (BYTE 30 30) -1))))
+--------------------------------------------------------------------------------