2 @comment node-name, next, previous, up
8 * Dynamic-extent allocation::
10 * Global and Always-Bound variables::
11 * Miscellaneous Efficiency Issues::
15 @comment node-name, next, previous, up
19 @subsection Structure object slot access
21 Structure slot accessors are efficient only if the compiler is able to
22 open code them: compiling a call to a structure slot accessor before
23 the structure is defined, declaring one @code{notinline}, or passing
24 it as a functional argument to another function causes severe
25 perfomance degradation.
27 @subsection Standard object slot access
29 The most efficient way to access a slot of a @code{standard-object} is
30 by using @code{slot-value} with a constant slot name argument inside a
31 @code{defmethod} body, where the variable holding the instance is a
32 specializer parameter of the method and is never assigned to. The cost
33 is roughly 1.6 times that of an open coded structure slot accessor.
35 Second most efficient way is to use a CLOS slot accessor, or
36 @code{slot-value} with a constant slot name argument, but in
37 circumstances other than specified above. This may be up to 3 times as
38 slow as the method described above.
43 (defclass foo () ((bar)))
45 ;; Fast: specializer and never assigned to
46 (defmethod quux ((foo foo) new)
47 (let ((old (slot-value foo 'bar)))
48 (setf (slot-value foo 'bar) new)
51 ;; Slow: not a specializer
52 (defmethod quux ((foo foo) new)
54 (old (slot-value temp 'bar)))
55 (setf (slot-value temp 'bar) new)
58 ;; Slow: assignment to FOO
59 (defmethod quux ((foo foo) new)
60 (let ((old (slot-value foo 'bar)))
61 (setf (slot-value foo 'bar) new)
66 Note that when profiling code such as this, the first few calls to the
67 generic function are not representative, as the dispatch mechanism is
68 lazily set up during those calls.
70 @node Dynamic-extent allocation
71 @comment node-name, next, previous, up
72 @section Dynamic-extent allocation
73 @cindex Dynamic-extent declaration
75 SBCL has limited support for performing allocation on the stack when a
76 variable is declared @code{dynamic-extent}. The @code{dynamic-extent}
77 declarations are not verified, but are simply trusted as long as
78 @code{sb-ext:*stack-allocate-dynamic-extent*} is true.
80 If dynamic extent constraints specified in the Common Lisp standard
81 are violated, the best that can happen is for the program to have
82 garbage in variables and return values; more commonly, the system will
85 @include var-sb-ext-star-stack-allocate-dynamic-extent-star.texinfo
87 There are many cases when @code{dynamic-extent} declarations could be
88 useful. At present, SBCL implements stack allocation for
93 @code{&rest} lists, when these are declared @code{dynamic-extent}.
96 @code{cons}, @code{list}, @code{list*}, and @code{vector} when the
97 result is bound to a variable declared @code{dynamic-extent}.
100 simple forms of @code{make-array}, whose result is bound to a variable
101 declared @code{dynamic-extent}: stack allocation is possible only if
102 the resulting array is known to be both simple and one-dimensional,
103 and has a constant @code{:element-type}.
105 @strong{Note}: stack space is limited, so allocation of a large vector
106 may cause stack overflow. For this reason potentially large vectors,
107 which might circumvent stack overflow detection, are stack allocated
108 only in zero @code{safety} policies.
111 closures defined with @code{flet} or @code{labels}, with a bound
112 @code{dynamic-extent} declaration. Closed-over variables, which are
113 assigned to (either inside or outside the closure) are still allocated
114 on the heap. Blocks and tags are also allocated on the heap, unless
115 all non-local control transfers to them are compiled with zero
119 user-defined structures when the structure constructor defined using
120 @code{defstruct} has been declared @code{inline} and the result of the
121 call to the constructor is bound to a variable declared
122 @code{dynamic-extent}.
124 @strong{Note:} structures with ``raw'' slots can currently be
125 stack-allocated only on x86 and x86-64.
128 all of the above when they appear as initial parts of another
129 stack-allocated object.
136 ;;; Declaiming a structure constructor inline before definition makes
137 ;;; stack allocation possible.
138 (declaim (inline make-thing))
139 (defstruct thing obj next)
141 ;;; Stack allocation of various objects bound to DYNAMIC-EXTENT
143 (let* ((list (list 1 2 3))
144 (nested (cons (list 1 2) (list* 3 4 (list 5))))
145 (vector (make-array 3 :element-type 'single-float))
146 (thing (make-thing :obj list
147 :next (make-thing :obj (make-array 3)))))
148 (declare (dynamic-extent list nested vector thing))
151 ;;; Stack allocation of arguments to a local function is equivalent
152 ;;; to stack allocation of local variable values.
154 (declare (dynamic-extent x))
158 (f (cons (cons 1 2) (cons 3 4)))
161 ;;; Stack allocation of &REST lists
162 (defun foo (&rest args)
163 (declare (dynamic-extent args))
172 Stack allocation of assigned-to closed-over variables, where these are
173 declared @code{dynamic-extent};
176 Automatic detection of the common idiom of applying a function to some
177 defaults and a @code{&rest} list, even when this is not declared
178 @code{dynamic-extent};
181 Automatic detection of the common idiom of calling quantifiers with a
182 closure, even when the closure is not declared @code{dynamic-extent}.
186 @node Modular arithmetic
187 @comment node-name, next, previous, up
188 @section Modular arithmetic
189 @cindex Modular arithmetic
190 @cindex Arithmetic, modular
191 @cindex Arithmetic, hardware
193 Some numeric functions have a property: @var{N} lower bits of the
194 result depend only on @var{N} lower bits of (all or some)
195 arguments. If the compiler sees an expression of form @code{(logand
196 @var{exp} @var{mask})}, where @var{exp} is a tree of such ``good''
197 functions and @var{mask} is known to be of type @code{(unsigned-byte
198 @var{w})}, where @var{w} is a ``good'' width, all intermediate results
199 will be cut to @var{w} bits (but it is not done for variables and
200 constants!). This often results in an ability to use simple machine
201 instructions for the functions.
207 (declare (type (unsigned-byte 32) x y))
208 (ldb (byte 32 0) (logxor x (lognot y))))
211 The result of @code{(lognot y)} will be negative and of type
212 @code{(signed-byte 33)}, so a naive implementation on a 32-bit
213 platform is unable to use 32-bit arithmetic here. But modular
214 arithmetic optimizer is able to do it: because the result is cut down
215 to 32 bits, the compiler will replace @code{logxor} and @code{lognot}
216 with versions cutting results to 32 bits, and because terminals
217 (here---expressions @code{x} and @code{y}) are also of type
218 @code{(unsigned-byte 32)}, 32-bit machine arithmetic can be used.
220 As of SBCL 0.8.5 ``good'' functions are @code{+}, @code{-};
221 @code{logand}, @code{logior}, @code{logxor}, @code{lognot} and their
222 combinations; and @code{ash} with the positive second
223 argument. ``Good'' widths are 32 on HPPA, MIPS, PPC, Sparc and x86 and
224 64 on Alpha. While it is possible to support smaller widths as well,
225 currently this is not implemented.
227 @node Global and Always-Bound variables
228 @comment node-name, next, previous, up
229 @section Global and Always-Bound variables
231 @include macro-sb-ext-defglobal.texinfo
233 @deftp {Declaration} sb-ext:global
235 Syntax: @code{(sb-ext:global symbol*)}
237 Only valid as a global proclamation.
239 Specifies that the named symbols cannot be proclaimed or locally
240 declared @code{special}. Proclaiming an already special or constant
241 variable name as @code{global} signal an error. Allows more efficient
242 value lookup in threaded environments in addition to expressing
243 programmer intention.
246 @deftp {Declaration} sb-ext:always-bound
248 Syntax: @code{(sb-ext:always-bound symbol*)}
250 Only valid as a global proclamation.
252 Specifies that the named symbols is always bound. Inhibits @code{makunbound}
253 of the named symbols. Proclaiming an unbound symbol as @code{always-bound} signals
254 an error. Allows compiler to elide boundness checks from value lookups.
257 @node Miscellaneous Efficiency Issues
258 @comment node-name, next, previous, up
259 @section Miscellaneous Efficiency Issues
261 FIXME: The material in the CMUCL manual about getting good
262 performance from the compiler should be reviewed, reformatted in
263 Texinfo, lightly edited for SBCL, and substituted into this
264 manual. In the meantime, the original CMUCL manual is still 95+%
265 correct for the SBCL version of the Python compiler. See the
269 @item Advanced Compiler Use and Efficiency Hints
270 @item Advanced Compiler Introduction
271 @item More About Types in Python
273 @item Source Optimization
276 @item Block Compilation
277 @item Inline Expansion
278 @item Object Representation
280 @item General Efficiency Hints
281 @item Efficiency Notes
284 Besides this information from the CMUCL manual, there are a few other
285 points to keep in mind.
290 The CMUCL manual doesn't seem to state it explicitly, but Python has a
291 mental block about type inference when assignment is involved. Python
292 is very aggressive and clever about inferring the types of values
293 bound with @code{let}, @code{let*}, inline function call, and so
294 forth. However, it's much more passive and dumb about inferring the
295 types of values assigned with @code{setq}, @code{setf}, and
296 friends. It would be nice to fix this, but in the meantime don't
297 expect that just because it's very smart about types in most respects
298 it will be smart about types involved in assignments. (This doesn't
299 affect its ability to benefit from explicit type declarations
300 involving the assigned variables, only its ability to get by without
301 explicit type declarations.)
303 @c <!-- FIXME: Python dislikes assignments, but not in type
304 @c inference. The real problems are loop induction, closed over
305 @c variables and aliases. -->
308 Since the time the CMUCL manual was written, CMUCL (and thus SBCL) has
309 gotten a generational garbage collector. This means that there are
310 some efficiency implications of various patterns of memory usage which
311 aren't discussed in the CMUCL manual. (Some new material should be
315 SBCL has some important known efficiency problems. Perhaps the most
321 The garbage collector is not particularly efficient, at least on
322 platforms without the generational collector (as of SBCL 0.8.9, all
326 Various aspects of the PCL implementation of CLOS are more inefficient
333 Finally, note that Common Lisp defines many constructs which, in
334 the infamous phrase, ``could be compiled efficiently by a
335 sufficiently smart compiler''. The phrase is infamous because
336 making a compiler which actually is sufficiently smart to find all
337 these optimizations systematically is well beyond the state of the art
338 of current compiler technology. Instead, they're optimized on a
339 case-by-case basis by hand-written code, or not optimized at all if
340 the appropriate case hasn't been hand-coded. Some cases where no such
341 hand-coding has been done as of SBCL version 0.6.3 include
346 @code{(reduce #'f x)} where the type of @code{x} is known at compile
350 various bit vector operations, e.g. @code{(position 0
354 specialized sequence idioms, e.g. @code{(remove item list :count 1)}
357 cases where local compilation policy does not require excessive type
358 checking, e.g. @code{(locally (declare (safety 1)) (assoc item
359 list))} (which currently performs safe @code{endp} checking internal
364 If your system's performance is suffering because of some construct
365 which could in principle be compiled efficiently, but which the SBCL
366 compiler can't in practice compile efficiently, consider writing a
367 patch to the compiler and submitting it for inclusion in the main
368 sources. Such code is often reasonably straightforward to write;
369 search the sources for the string ``@code{deftransform}'' to find many
370 examples (some straightforward, some less so).