1 @node Efficiency, Beyond The ANSI Standard, The Debugger, Top
2 @comment node-name, next, previous, up
5 FIXME: The material in the CMUCL manual about getting good
6 performance from the compiler should be reviewed, reformatted in
7 Texinfo, lightly edited for SBCL, and substituted into this
8 manual. In the meantime, the original CMUCL manual is still 95+%
9 correct for the SBCL version of the Python compiler. See the
13 @item Advanced Compiler Use and Efficiency Hints
14 @item Advanced Compiler Introduction
15 @item More About Types in Python
17 @item Source Optimization
20 @item Block Compilation
21 @item Inline Expansion
22 @item Object Representation
24 @item General Efficiency Hints
25 @item Efficiency Notes
28 Besides this information from the CMUCL manual, there are a few other
29 points to keep in mind.
34 The CMUCL manual doesn't seem to state it explicitly, but Python has a
35 mental block about type inference when assignment is involved. Python
36 is very aggressive and clever about inferring the types of values
37 bound with @code{let}, @code{let*}, inline function call, and so
38 forth. However, it's much more passive and dumb about inferring the
39 types of values assigned with @code{setq}, @code{setf}, and
40 friends. It would be nice to fix this, but in the meantime don't
41 expect that just because it's very smart about types in most respects
42 it will be smart about types involved in assignments. (This doesn't
43 affect its ability to benefit from explicit type declarations
44 involving the assigned variables, only its ability to get by without
45 explicit type declarations.)
47 @c <!-- FIXME: Python dislikes assignments, but not in type
48 @c inference. The real problems are loop induction, closed over
49 @c variables and aliases. -->
52 Since the time the CMUCL manual was written, CMUCL (and thus SBCL) has
53 gotten a generational garbage collector. This means that there are
54 some efficiency implications of various patterns of memory usage which
55 aren't discussed in the CMUCL manual. (Some new material should be
59 SBCL has some important known efficiency problems. Perhaps the most
65 There is only limited support for the ANSI @code{dynamic-extent}
66 declaration. @xref{Dynamic-extent allocation}
69 The garbage collector is not particularly efficient, at least on
70 platforms without the generational collector (as of SBCL 0.8.9, all
74 Various aspects of the PCL implementation of CLOS are more inefficient
81 Finally, note that Common Lisp defines many constructs which, in
82 the infamous phrase, ``could be compiled efficiently by a
83 sufficiently smart compiler''. The phrase is infamous because
84 making a compiler which actually is sufficiently smart to find all
85 these optimizations systematically is well beyond the state of the art
86 of current compiler technology. Instead, they're optimized on a
87 case-by-case basis by hand-written code, or not optimized at all if
88 the appropriate case hasn't been hand-coded. Some cases where no such
89 hand-coding has been done as of SBCL version 0.6.3 include
94 @code{(reduce #'f x)} where the type of @code{x} is known at compile
98 various bit vector operations, e.g. @code{(position 0
102 specialized sequence idioms, e.g. @code{(remove item list :count 1)}
105 cases where local compilation policy does not require excessive type
106 checking, e.g. @code{(locally (declare (safety 1)) (assoc item
107 list))} (which currently performs safe @code{endp} checking internal
112 If your system's performance is suffering because of some construct
113 which could in principle be compiled efficiently, but which the SBCL
114 compiler can't in practice compile efficiently, consider writing a
115 patch to the compiler and submitting it for inclusion in the main
116 sources. Such code is often reasonably straightforward to write;
117 search the sources for the string ``@code{deftransform}'' to find many
118 examples (some straightforward, some less so).
121 * Dynamic-extent allocation::
122 * Modular arithmetic::
125 @node Dynamic-extent allocation, Modular arithmetic, Efficiency, Efficiency
126 @comment node-name, next, previous, up
127 @section Dynamic-extent allocation
128 @cindex Dynamic-extent declaration
130 SBCL has limited support for performing allocation on the stack when a
131 variable is declared @code{dynamic-extent}. The @code{dynamic-extent}
132 declarations are not verified, but are simply trusted; if the
133 constraints in the Common Lisp standard are violated, the best that
134 can happen is for the program to have garbage in variables and return
135 values; more commonly, the system will crash.
137 As a consequence of this, the condition for performing stack
138 allocation is stringent: either of the @code{speed} or @code{space}
139 optimization qualities must be higher than the maximum of
140 @code{safety} and @code{debug} at the point of the allocation. For
145 (declare (optimize speed (safety 1) (debug 1)))
146 (defun foo (&rest rest)
147 (declare (dynamic-extent rest))
151 Here the @code{&rest} list will be allocated on the stack. Note that
152 it would not be in the following situation:
155 (defun foo (&rest rest)
156 (declare (optimize speed (safety 1) (debug 1)))
157 (declare (dynamic-extent rest))
161 because the @code{optimize} declaration affects the binding, not the
164 There are many cases when dynamic-extent declarations could be useful.
165 At present, SBCL implements
170 Stack allocation of @code{&rest} lists, where these are declared
171 @code{dynamic-extent}.
180 Stack allocation of closures, where these are declared
181 @code{dynamic-extent};
184 Stack allocation of @code{list}, @code{list*} and @code{cons}
185 (including following chains during initialization, and also for
186 binding mutation), where the allocation is declared
187 @code{dynamic-extent};
190 Automatic detection of the common idiom of applying a function to some
191 defaults and a @code{&rest} list, even when this is not declared
192 @code{dynamic-extent};
195 Automatic detection of the common idiom of calling quantifiers with a
196 closure, even when the closure is not declared @code{dynamic-extent}.
200 @node Modular arithmetic, , Dynamic-extent allocation, Efficiency
201 @comment node-name, next, previous, up
202 @section Modular arithmetic
203 @cindex Modular arithmetic
204 @cindex Arithmetic, modular
205 @cindex Arithmetic, hardware
207 Some numeric functions have a property: @var{N} lower bits of the
208 result depend only on @var{N} lower bits of (all or some)
209 arguments. If the compiler sees an expression of form @code{(logand
210 @var{exp} @var{mask})}, where @var{exp} is a tree of such ``good''
211 functions and @var{mask} is known to be of type @code{(unsigned-byte
212 @var{w})}, where @var{w} is a ``good'' width, all intermediate results
213 will be cut to @var{w} bits (but it is not done for variables and
214 constants!). This often results in an ability to use simple machine
215 instructions for the functions.
221 (declare (type (unsigned-byte 32) x y))
222 (ldb (byte 32 0) (logxor x (lognot y))))
225 The result of @code{(lognot y)} will be negative and of type
226 @code{(signed-byte 33)}, so a naive implementation on a 32-bit
227 platform is unable to use 32-bit arithmetic here. But modular
228 arithmetic optimizer is able to do it: because the result is cut down
229 to 32 bits, the compiler will replace @code{logxor} and @code{lognot}
230 with versions cutting results to 32 bits, and because terminals
231 (here---expressions @code{x} and @code{y}) are also of type
232 @code{(unsigned-byte 32)}, 32-bit machine arithmetic can be used.
235 As of SBCL 0.8.5 ``good'' functions are @code{+}, @code{-};
236 @code{logand}, @code{logior}, @code{logxor}, @code{lognot} and their
237 combinations; and @code{ash} with the positive second
238 argument. ``Good'' widths are 32 on HPPA, MIPS, PPC, Sparc and X86 and
239 64 on Alpha. While it is possible to support smaller widths as well,
240 currently it is not implemented.