  (declare (simple-string s))
  (declare (optimize (speed 3) (safety 0) (debug 0)))
    (dotimes (i (length s))
      (when (eql (aref s i) #\1)
* On X86, I is represented as a tagged integer.
* EQL uses "CMP reg,reg" instead of "CMP reg,im". This causes
  allocation of an extra register and an extra move.
  3: SLOT S!11[EDX] {SB-C::VECTOR-LENGTH 1 7} => t23[EAX]
  4: MOVE t23[EAX] => t24[EBX]
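
For readers who want to reproduce the observations above, here is a minimal
self-contained sketch of a function with the same declarations and loop
shape. The name COUNT-ONES, the counter variable, and the returned value are
assumptions made for illustration; only the DECLARE, DOTIMES and EQL forms
come from the fragment above.

```lisp
;; Hypothetical reconstruction for experimentation, not the original test case.
;; Counts the occurrences of the character #\1 in a simple string.
(defun count-ones (s)
  (declare (simple-string s))
  (declare (optimize (speed 3) (safety 0) (debug 0)))
  (let ((count 0))
    (declare (fixnum count))
    ;; The third element of the DOTIMES spec is the loop's result form.
    (dotimes (i (length s) count)
      (when (eql (aref s i) #\1)
        (incf count)))))
```

Calling (disassemble 'count-ones) prints the generated code; the exact
output depends on the implementation, version and platform, so it may not
match the trace quoted above.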
--------------------------------------------------------------------------------
  (declare (optimize (speed 3) (safety 0) (space 2) (debug 0)))
  (declare (type (simple-array double-float 1) v))
    (declare (type double-float s))
    (dotimes (i (length v))
      (setq s (+ s (aref v i))))
* Python does not combine + with AREF, so generates extra move and
* On X86 Python thinks that all FP registers are directly accessible
  and emits costly MOVE ... => FR1.
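
A minimal sketch of a complete function around the summation loop above, for
anyone who wants to inspect the floating-point code themselves. The name
SUM-VECTOR, the initial value 0d0 and the returned value are assumptions;
the declarations and the loop body are taken from the fragment.

```lisp
;; Hypothetical reconstruction for experimentation, not the original test case.
;; Sums the elements of a specialized double-float vector.
(defun sum-vector (v)
  (declare (optimize (speed 3) (safety 0) (space 2) (debug 0)))
  (declare (type (simple-array double-float 1) v))
  (let ((s 0d0))
    (declare (type double-float s))
    ;; S is returned as the result form of DOTIMES.
    (dotimes (i (length v) s)
      (setq s (+ s (aref v i))))))
```

Because of (safety 0), the argument has to be a real
(simple-array double-float 1), for example one created with
(make-array 3 :element-type 'double-float :initial-element 0d0).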
--------------------------------------------------------------------------------
  (declare (optimize (speed 3) (safety 0) (space 2))
  (let ((v (make-list n)))
    (setq v (make-array n))
* IR1 does not optimize away (MAKE-LIST N).
* IR1 thinks that the type of V in (LENGTH V) is (OR LIST SIMPLE-VECTOR), not
--------------------------------------------------------------------------------
  (declare (optimize (speed 3) (safety 0) (space 2))
           (type (simple-array base-char 1) v1 v2))
  (dotimes (i (length v1))
    (setf (aref v2 i) (aref v1 i))))
VOP DATA-VECTOR-SET/SIMPLE-STRING V2!14[EDI] t32[EAX] t30[S2]>t33[CL]
  MOV #<TN t33[CL]>, #<TN t30[S2]>
  MOV BYTE PTR [EDI+EAX+1], #<TN t33[CL]>
  MOV #<TN t35[AL]>, #<TN t33[CL]>
  MOV #<TN t34[S2]>, #<TN t35[AL]>
* The value of DATA-VECTOR-SET is not used, so there is no need for the
--------------------------------------------------------------------------------
uses generic arithmetic
--------------------------------------------------------------------------------
09:49:05 <jtra> I have found a case in those where suboptimal code is
  generated with nested loops, it might be moderately easy to fix that
http://www.bagley.org/~doug/shootout/bench/nestedloop/nestedloop.cmucl
09:50:30 <jtra> if you add declarations to dotimes, generated code is
  almost optimal, but most inner loops run out of registers and use
  memory location for iteration variable
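
As an illustration of what "add declarations to dotimes" can look like, here
is a reduced sketch with three nesting levels. The function name
NESTED-LOOP-COUNT, the variable names and the nesting depth are assumptions
chosen for brevity; this is not the benchmark source linked above.

```lisp
;; Hypothetical, reduced example: counts the iterations of three nested loops.
(defun nested-loop-count (n)
  (declare (fixnum n)
           (optimize (speed 3) (debug 0) (safety 0)))
  (let ((x 0))
    (declare (fixnum x))
    ;; DOTIMES accepts declarations at the start of its body, so each
    ;; iteration variable can be declared FIXNUM individually.
    (dotimes (a n x)
      (declare (fixnum a))
      (dotimes (b n)
        (declare (fixnum b))
        (dotimes (c n)
          (declare (fixnum c))
          (incf x))))))
```

With enough nesting levels the innermost iteration variables compete for
registers, which is the situation described in the log above; whether a
given variable ends up in memory depends on the compiler version and target.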
;;; -*- mode: lisp -*-
;;; http://www.bagley.org/~doug/shootout/
;;; from Friedrich Dominicus
  (let ((n (parse-integer (or (car (last extensions:*command-line-strings*)) "1")))
             (optimize (speed 3) (debug 0) (safety 0)))
--------------------------------------------------------------------------------