2 @comment node-name, next, previous, up
7 * Dynamic-extent allocation::
9 * Miscellaneous Efficiency Issues::
12 @node Dynamic-extent allocation
13 @comment node-name, next, previous, up
14 @section Dynamic-extent allocation
15 @cindex Dynamic-extent declaration
17 SBCL has limited support for performing allocation on the stack when a
18 variable is declared @code{dynamic-extent}. The @code{dynamic-extent}
19 declarations are not verified, but are simply trusted as long as
20 @code{sb-ext:*stack-allocate-dynamic-extent*} is true.
22 If dynamic extent constraints specified in the Common Lisp standard
23 are violated, the best that can happen is for the program to have
24 garbage in variables and return values; more commonly, the system will
27 @include var-sb-ext-star-stack-allocate-dynamic-extent-star.texinfo
29 There are many cases when @code{dynamic-extent} declarations could be
30 useful. At present, SBCL implements stack allocation for
35 @code{&rest} lists, when these are declared @code{dynamic-extent}.
38 @code{cons}, @code{list} and @code{list*}, when the result is bound to
39 a variable declared @code{dynamic-extent}.
42 simple forms of @code{make-array}, whose result is bound to a variable
43 declared @code{dynamic-extent}: stack allocation is possible only if
44 the resulting array is one-dimensional, and the call has no keyword
45 arguments with the exception of @code{:element-type}.
47 @strong{Note}: stack space is limited, so allocation of a large vector
48 may cause stack overflow. For this reason potentially large vectors,
49 which might circumvent stack overflow detection, are stack allocated
50 only in zero @code{safety} policies.
53 closures defined with @code{flet} or @code{labels}, with a bound
54 @code{dynamic-extent} declaration. Closed-over variables, which are
55 assigned to (either inside or outside the closure) are still allocated
56 on the heap. Blocks and tags are also allocated on the heap, unless
57 all non-local control transfers to them are compiled with zero
61 user-defined structures when the structure constructor defined using
62 @code{defstruct} has been declared @code{inline} and the result of the
63 call to the constructor is bound to a variable declared
64 @code{dynamic-extent}.
66 @strong{Note:} structures with ``raw'' slots can currently be
67 stack-allocated only on x86 and x86-64.
70 all of the above when they appear as initial parts if another
71 stack-allocated object.
78 ;;; Declaiming a structure constructor inline before definition makes
79 ;;; stack allocation possible.
80 (declaim (inline make-thing))
81 (defstruct thing obj next)
83 ;;; Stack allocation of various objects bound to DYNAMIC-EXTENT
85 (let* ((list (list 1 2 3))
86 (nested (cons (list 1 2) (list* 3 4 (list 5))))
87 (vector (make-array 3 :element-type 'single-float))
88 (thing (make-thing :obj list
89 :next (make-thing :obj (make-array 3)))))
90 (declare (dynamic-extent list nested vector thing))
93 ;;; Stack allocation of arguments to a local function is equivalent
94 ;;; to stack allocation of local variable values.
96 (declare (dynamic-extent x))
100 (f (cons (cons 1 2) (cons 3 4)))
103 ;;; Stack allocation of &REST lists
104 (defun foo (&rest args)
105 (declare (dynamic-extent args))
114 Stack allocation of assigned-to closed-over variables, where these are
115 declared @code{dynamic-extent};
118 Automatic detection of the common idiom of applying a function to some
119 defaults and a @code{&rest} list, even when this is not declared
120 @code{dynamic-extent};
123 Automatic detection of the common idiom of calling quantifiers with a
124 closure, even when the closure is not declared @code{dynamic-extent}.
128 @node Modular arithmetic
129 @comment node-name, next, previous, up
130 @section Modular arithmetic
131 @cindex Modular arithmetic
132 @cindex Arithmetic, modular
133 @cindex Arithmetic, hardware
135 Some numeric functions have a property: @var{N} lower bits of the
136 result depend only on @var{N} lower bits of (all or some)
137 arguments. If the compiler sees an expression of form @code{(logand
138 @var{exp} @var{mask})}, where @var{exp} is a tree of such ``good''
139 functions and @var{mask} is known to be of type @code{(unsigned-byte
140 @var{w})}, where @var{w} is a ``good'' width, all intermediate results
141 will be cut to @var{w} bits (but it is not done for variables and
142 constants!). This often results in an ability to use simple machine
143 instructions for the functions.
149 (declare (type (unsigned-byte 32) x y))
150 (ldb (byte 32 0) (logxor x (lognot y))))
153 The result of @code{(lognot y)} will be negative and of type
154 @code{(signed-byte 33)}, so a naive implementation on a 32-bit
155 platform is unable to use 32-bit arithmetic here. But modular
156 arithmetic optimizer is able to do it: because the result is cut down
157 to 32 bits, the compiler will replace @code{logxor} and @code{lognot}
158 with versions cutting results to 32 bits, and because terminals
159 (here---expressions @code{x} and @code{y}) are also of type
160 @code{(unsigned-byte 32)}, 32-bit machine arithmetic can be used.
162 As of SBCL 0.8.5 ``good'' functions are @code{+}, @code{-};
163 @code{logand}, @code{logior}, @code{logxor}, @code{lognot} and their
164 combinations; and @code{ash} with the positive second
165 argument. ``Good'' widths are 32 on HPPA, MIPS, PPC, Sparc and x86 and
166 64 on Alpha. While it is possible to support smaller widths as well,
167 currently this is not implemented.
169 @node Miscellaneous Efficiency Issues
170 @comment node-name, next, previous, up
171 @section Miscellaneous Efficiency Issues
173 FIXME: The material in the CMUCL manual about getting good
174 performance from the compiler should be reviewed, reformatted in
175 Texinfo, lightly edited for SBCL, and substituted into this
176 manual. In the meantime, the original CMUCL manual is still 95+%
177 correct for the SBCL version of the Python compiler. See the
181 @item Advanced Compiler Use and Efficiency Hints
182 @item Advanced Compiler Introduction
183 @item More About Types in Python
185 @item Source Optimization
188 @item Block Compilation
189 @item Inline Expansion
190 @item Object Representation
192 @item General Efficiency Hints
193 @item Efficiency Notes
196 Besides this information from the CMUCL manual, there are a few other
197 points to keep in mind.
202 The CMUCL manual doesn't seem to state it explicitly, but Python has a
203 mental block about type inference when assignment is involved. Python
204 is very aggressive and clever about inferring the types of values
205 bound with @code{let}, @code{let*}, inline function call, and so
206 forth. However, it's much more passive and dumb about inferring the
207 types of values assigned with @code{setq}, @code{setf}, and
208 friends. It would be nice to fix this, but in the meantime don't
209 expect that just because it's very smart about types in most respects
210 it will be smart about types involved in assignments. (This doesn't
211 affect its ability to benefit from explicit type declarations
212 involving the assigned variables, only its ability to get by without
213 explicit type declarations.)
215 @c <!-- FIXME: Python dislikes assignments, but not in type
216 @c inference. The real problems are loop induction, closed over
217 @c variables and aliases. -->
220 Since the time the CMUCL manual was written, CMUCL (and thus SBCL) has
221 gotten a generational garbage collector. This means that there are
222 some efficiency implications of various patterns of memory usage which
223 aren't discussed in the CMUCL manual. (Some new material should be
227 SBCL has some important known efficiency problems. Perhaps the most
233 The garbage collector is not particularly efficient, at least on
234 platforms without the generational collector (as of SBCL 0.8.9, all
238 Various aspects of the PCL implementation of CLOS are more inefficient
245 Finally, note that Common Lisp defines many constructs which, in
246 the infamous phrase, ``could be compiled efficiently by a
247 sufficiently smart compiler''. The phrase is infamous because
248 making a compiler which actually is sufficiently smart to find all
249 these optimizations systematically is well beyond the state of the art
250 of current compiler technology. Instead, they're optimized on a
251 case-by-case basis by hand-written code, or not optimized at all if
252 the appropriate case hasn't been hand-coded. Some cases where no such
253 hand-coding has been done as of SBCL version 0.6.3 include
258 @code{(reduce #'f x)} where the type of @code{x} is known at compile
262 various bit vector operations, e.g. @code{(position 0
266 specialized sequence idioms, e.g. @code{(remove item list :count 1)}
269 cases where local compilation policy does not require excessive type
270 checking, e.g. @code{(locally (declare (safety 1)) (assoc item
271 list))} (which currently performs safe @code{endp} checking internal
276 If your system's performance is suffering because of some construct
277 which could in principle be compiled efficiently, but which the SBCL
278 compiler can't in practice compile efficiently, consider writing a
279 patch to the compiler and submitting it for inclusion in the main
280 sources. Such code is often reasonably straightforward to write;
281 search the sources for the string ``@code{deftransform}'' to find many
282 examples (some straightforward, some less so).