1 \chapter{Introduction} % -*- Dictionary: design -*-
11 \section{Scope and Purpose}
13 This document describes the Virtual Machine that serves as the basis for the
14 portable implementation of \ccl. The Virtual Machine (hereafter referred to as
15 the VM) provides a layer of abstraction that hides low-level details of
16 hardware and implementation strategy, while still revealing enough of the
17 implementation so that most of the system can be written at the VM level or above.
22 {\#\#\# Shouldn't specify VOPs. Instead, should specify which \clisp functions
23 are primitive and which subprimitives exist. Isn't really anyone's business
24 which VOPs actually exist. Each primitive function or subprimitive is
25 implemented either as a VOP or as expansion into Lisp code, at the particular
26 implementation's discretion.
28 From this point of view, the document expresses the contract that the
29 Lisp-level code outside of the compiler must satisfy. All functions must ultimately
30 be defined in terms of primitive functions and sub-primitives.
32 The responsibility of the compiler is to implement these primitive operations,
33 and also to implement special forms, variables and function calling.
35 VOPs emitted by the hard-wired translators for non-function nodes are a
36 somewhat different story. Each implementation will presumably implement all
37 these VOPs in order to avoid having to rewrite IR2 translation. We also need
38 to spend quite a bit of time discussing the semantics of these operations,
39 since they don't just correspond to some \clisp function with type constraints.
52 special binding/unbinding
57 Basic conditionals: EQ, ...
58 Interface to generation of other conditional VOPs.
60 Some VOPs don't need to be implemented at all:
61 VOPs to delimit the lifetimes of big stack TNs such as catch blocks
62 Others? Move VOPs might be defined in terms of an implementation supplied
63 move routine, since we probably also need this info outside of VOP generators
64 so that implicit moves can be generated.
67 Type testing/checking (somehow)
71 What this document talks about:
73 Interface between compiler front-end and back end. (VOPs)
74 Primitive \clisp operations directly supported by the VM.
75 Support for complex language features such as function call.
77 Sub-primitives that allow system code to do things not possible in \clisp.
79 Descriptions of how the current \ccl system uses VM facilities, especially
80 non-standard ones accessed through sub-primitives.
82 Notes about known portability problems.
84 Guidelines for writing portable \ccl system code. To some degree these
85 guidelines are implied by statements that certain things are true of \ccl
88 Descriptions of data structures that are not directly used by the VM, such as
89 debug information and Core files.
91 Descriptions of data structures that are directly used by the VM, such as
97 People who want to port \ccl.
98 People who want to understand the compiler.
99 People who want to understand how \ccl works.
100 People who need to write portable \ccl system code.
101 People such as debugger writers who need to access \ccl's internal data structures.
106 Tell you things that are obviously implementation dependent, such as type
107 systems or memory management disciplines. See the various implementation-specific VM documents.
110 Tell you only what you need to know. Programmers shouldn't exploit properties
111 of the VM documented here unless there is no way to do the same thing in portable \clisp.
114 Tell you how the compiler works. In order to understand some of the subtleties
115 of VOP descriptions, you will have to understand the IR2 representation and how
116 it fits into the rest of the compiler.
118 Tell you anything about \clisp semantics. When some part of the VM has a
119 direct relationship to \clisp semantics, the relationship will be directly
120 stated using \clisp terminology, since a restatement of the semantics is likely
121 to be inaccurate or misleading. Exceptions will be made only when some
122 implication of the \clisp semantics is non-obvious.
124 Tell you everything about how \ccl works. This document only offers
125 information that is likely to be needed by programmers doing a port or writing
126 system code; portable, self-contained parts of the system are totally ignored.
127 This document deliberately avoids replicating information that is easily
128 available in the system sources, since such replicated information is always
129 incorrect somewhere. In some cases, a forwarding pointer to the appropriate
130 source will be given.
133 Things the VM won't do:
135 The VM specification does not totally solve the problem of porting \ccl, since
136 it is inevitable that it will not map cleanly to all possible combinations of
137 hardware and operating systems. The VM should not be regarded as being cast in
138 concrete, since changes in many characteristics would only affect a tiny
139 fraction of the system sources.
141 One current major problem with porting is that large pieces of functionality
142 are entirely within the VM, and would need to be reimplemented for each port.
143 A major goal for future work on the system is moving code out of the VM, both
144 by supporting a "fast-call" convention that allows reasonable use of Lisp in
145 the out of line implementation of VOPs, and by having a "bugout" mechanism that
146 allows the VM to call Lisp functions to implement the hard cases in some VOPs.
148 The VM is designed to support conventional, untagged, general register
149 architectures. Suitably lobotomized, it could be mapped to less flexible
150 hardware such as "Lisp machines", but the compiler would have serious
151 difficulties supporting stack architectures.
153 The VM does not support concurrent lightweight processes. Locking primitives
154 and deep-binding of specials would be needed.
156 The VM does not deal with operating systems interface issues at all. A minimal
157 port would require implementing at least file and terminal I/O streams. \ccl
158 implements system interfaces using Aliens and other facilities built on top of the VM.
167 Specific virtual operations implemented by the VM (VOPs). VOPs are primarily
168 the concern of the compiler, since it translates Lisp code into VOPs and then
169 translates VOPs into the implementation.
171 Sub-primitives that are used by Lisp code needing to perform operations
172 below the Lisp level. The compiler implements some sub-primitives directly
173 using VOPs, while others are translated into Lisp code. Sub-primitives provide
174 a layer of insulation between the Lisp system code and the VM, since the Lisp
175 code may assume the existence of operations that are not implemented directly
176 by the VM. Only sub-primitives with fairly portable semantics are documented
177 here. Others are in implementation-specific VM documentation.
180 \comment<Not all sub-primitives are VOPs, and most VOPs are not sub-primitives.>
184 \subsection{VOP base name rules}
186 The names of VOPs that implement functions are based on the function name.
187 Other VOPs may use any base that doesn't conflict with a function name. There
188 are some rules used to obtain the base name for related operations.
190 To get the name of a setting operation, replace the string "{\tt ref}" in the name
191 with "{\tt set}". If "{\tt ref}" doesn't appear in the name, add the prefix "{\tt set-}" to the
192 base name. For example, {\tt svref} becomes {\tt svset}, and {\tt symbol-value}
193 becomes {\tt set-symbol-value}.
195 To get the name of a conditional VOP from the name of a predicate, add the
196 prefix "{\tt if-}" to the predicate name. For example, {\tt eq} becomes {\tt if-eq}.
197 {\tt eq} by itself would be a VOP that returned a true or false value.
199 Some operations check for some error condition, magically signalling the error
200 through an implicit control transfer. These operations are prefixed with
201 "{\tt check-}", as in {\tt check-fixnum} and {\tt check-bound}.
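The setter-naming rule can be sketched as a small Lisp function; {\tt vop-set-name} is a hypothetical helper for illustration, not part of the system:

```lisp
;;; Sketch of the setter-naming rule described above: replace "REF"
;;; with "SET" in the base name, or add a "SET-" prefix if "REF"
;;; doesn't appear.  VOP-SET-NAME is hypothetical, for illustration.
(defun vop-set-name (base)
  (let* ((name (string base))
         (pos (search "REF" name)))
    (if pos
        (intern (concatenate 'string (subseq name 0 pos) "SET"
                             (subseq name (+ pos 3))))
        (intern (concatenate 'string "SET-" name)))))

;; (vop-set-name 'svref)        => SVSET
;; (vop-set-name 'symbol-value) => SET-SYMBOL-VALUE
```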
205 \subsection{VOP name prefixes and suffixes}
207 Prefixes and suffixes are added to the base to get the names of variant
208 versions of the VOP. The fully general VOP name looks like this:
210 {"{\tt small-}" | "{\tt fast-}"} {\it name} {"{\tt -c}" {\it info}} {"{\tt /}" {\it type}} {"{\tt =>}" {\it result-type}}
212 The "{\tt small-}" and "{\tt fast-}" prefixes indicate that the VOP does minimal
213 safety checking and is optimized for space or speed, respectively. The absence
214 of a prefix indicates the safest (or only) version. Usually if the "{\tt small-}"
215 VOP exists, it will be a synonym for either the fast version or the safe
216 version, depending on which is smaller.
218 The "{\tt -c}" suffix indicates that some info that is passed as a normal
219 argument to the base version of the VOP is passed as Codegen-Info in this
220 version. A typical use would be for VOPs where it is important to use a
221 different version when one of the arguments is a compile time constant.
222 {\it info} is some (possibly null) string that indicates which "{\tt -c}" variant is used.
225 The "{\tt /}{\it type}" suffix asserts that all operands that could be of {\it type} are.
226 For example, {\tt +/fixnum} adds two fixnums returning a fixnum, while
227 {\tt length/simple-vector} finds the length of a simple vector, but the result isn't a simple vector.
230 The "{\tt =>}{\it result-type}" suffix supplies a result type assertion on the result.
233 A not totally silly example of all these modifiers simultaneously is
234 {\tt fast-+-c/fixnum=>integer}. This operation adds two fixnums, one of
235 which is a constant passed as codegen info, resulting in an integer. The
236 implementation is optimized for speed at the expense of space and safety.
241 \chapter{Data Types and Storage Resources}
244 \section{Lisp Objects}
247 A Lisp object is a fixed-size data structure that is organized in a way mandated
248 by the VM implementation. The fixed format allows the VM to determine the type
249 of the object. \comment<Virtual type? VM type? Implementation type?
250 ...provides the VM enough information about the type of the object for the VM
251 to implement the VM-level semantics... ...supports the "dynamic types"...>
253 Lisp objects are stored in locations known as cells.
256 Has major types: immediate and non-immediate.
257 Non-immediate objects may have a subtype.
259 symbol (nil may be weird)
268 environment (always has subtype)
273 stack closure (control stack pointer)
275 Non-immediate objects are allocated in "type spaces". The type space of an
276 object is characterized by a small integer known as the type code. Any two
277 objects of one of the above boxed types will always have the same type code.
278 {But not really... Some types might be allocated in different type spaces at
279 different times. (?)}
281 The type code doesn't totally describe the object. In general, subtype
282 information may be involved.
295 We consider control transfer to be the fundamental result of comparison, rather
296 than anything such as a condition code. Although most compilers with whizzy
297 register allocation seem to explicitly allocate and manipulate the condition
298 codes, it seems that any benefit is small in our case. This is partly because
299 our VOPs are at a somewhat higher level, making it difficult to tell which VOPs
300 do and don't trash the CC. Explicitly incorporating condition codes in our
301 VM also introduces another architecture dependency.
303 At the IR2 level, we have a class of IF-XXX VOPs which transfer control to one
304 of two places on the basis of some test on the operands. When generating code
305 for a predicate, we peek at the destination IF node to find where to transfer control.
308 The exact representation of type tests in IR2 will be fairly implementation
309 dependent, since it will depend on the specific type system for the given
310 implementation. For example, if an implementation can test some types with a
311 simple tag check, but other types require reading a field from the object in
312 addition, then the two different kinds of checks should be distinct at the VOP
313 level, since this will allow the VOP cost and storage information to be more
314 accurate. Generation of type tests should be factored out of code which would
315 otherwise be more portable. Probably the IR2 translator for TYPEP and the type
316 check generation code are the only places that should know about how type tests
317 are represented in IR2.
321 If-Type Tests whether Object has the type code that is passed in the
322 codegen info. If-Type-Range tests for a range of type codes.
324 {small, fast} if-vector-type (object)
325 Test that Object is either of the specified type code, or is a 1d array
326 header with data having the specified type code.
328 if-vector-subtype (object)
329 Test the subtype field of a vector-like object. It is assumed that the
330 object has already been determined to be vector-like.
335 The rationale behind having these as separate VOPs is that they are likely
336 to be immediate types, and thus may have bizarre type schemes.
340 We have distinct operations for these predicates since one or the other
341 isn't a simple tag test, but we don't know which one.
343 if-rationalp (object)
349 The rationale behind having these operations is that they may take a lot of
350 code, so it is reasonable to put them out of line.
354 \section{Type Sub-primitives}
356 change-type (object) => result
357 Change the type of an object according to codegen info. The meaning of
358 this is highly type-system dependent, but it doesn't matter, since the
359 compiler will never emit this VOP directly. The only way that it can show
360 up is through %Primitive.
366 Boxed and unboxed locations:
367 Non-immediate objects may not be stored in unboxed locations.
368 Things that are not Lisp objects may not be stored in boxed locations.
370 Control stack is boxed.
371 Optional number stack is unboxed.
372 Heap environment is boxed.
373 Fixed number of registers, some boxed and some unboxed.
375 PCs may be stored on the control stack or in boxed registers, subject to the
376 constraint that a corresponding environment is also stored. Locations
377 containing PCs don't need to be zeroed when they are no longer used; nothing
378 bad will happen if an old PC is unaccompanied by an environment.
381 \item[Trap]Illegal object trap. This value is used in symbols to signify an
382 undefined value or definition.
388 Character is an immediate type. Characters are manipulated primarily by
389 converting into an integer and accessing these fields:
391 \item[{\tt %character-code-byte}]The character code. This is effectively required to
392 start at bit 0, since \cl equates {\tt char-int} to {\tt char-code} when there is
393 no bits or font. All current \ccl systems use ASCII for the character codes,
394 and define {\tt \#$\backslash$newline} to be a linefeed, but system code should not count on this.
397 \item[{\tt %character-control-byte}]The character bits. Character bits are used by
398 Hemlock to describe modifiers in keyboard events, but there is no assumption of
399 any general portable significance of character bits.
401 \item[{\tt %character-font-byte}]The character font. This is not used by \ccl, and is
402 not particularly useful.
405 Characters should be converted to and from integers by using the \clisp
406 {\tt char-int} and {\tt int-char} functions, which the compiler translates into
409 char-int (char) => int
410 int-char (int) => char
412 In the common case where Char is known to be a {\tt string-char}, these
413 operations are equivalent to {\tt char-code} and {\tt code-char}. In addition to
414 providing a portable interface to character conversion, the VOP representation
415 of this type conversion allows the compiler to avoid unnecessary boxing and
416 unboxing of character objects.
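For example, portable system code performs the conversion like this (a sketch; {\tt int-char} is the pre-ANSI \clisp function this document assumes):

```lisp
;;; Portable character/integer conversion; the compiler translates
;;; these calls into the CHAR-INT and INT-CHAR VOPs.
(let* ((int (char-int #\A))     ; character -> integer encoding
       (char (int-char int)))   ; integer -> character object
  ;; With no bits or font, CHAR-INT agrees with CHAR-CODE, so for a
  ;; string-char the round trip is exact.
  (char= char #\A))
```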
418 Existing code explicitly converts fixnums to characters by using the
419 Make-Immediate-Type sub-primitive with %Character-Type. Currently conversion
420 of characters to fixnums is rather confused. Originally, characters were a
421 subtype of the Misc type code, and the result of the Make-Fixnum sub-primitive
422 had to be masked with {\tt %character-int-mask}; some code still does this, while
425 Character comparisons could be implemented by doing numeric comparisons on the
426 result of {\tt char-int}, or by using {\tt eq} in the case of {\tt char=}, but this
427 can result in unnecessary type conversions. Instead, the compiler uses these VOPs:
439 Symbols are currently fairly boring in \ccl, containing only the obvious slots:
441 {\tt %symbol-value-slot}\\The current dynamic value of this symbol. If the
442 symbol is currently unbound, then the value of this slot is the unbound marker.
444 {\tt %symbol-function-slot}\\The global function definition of this
445 symbol. If the symbol is not fbound, then this slot holds the unbound marker.
448 {\tt %symbol-plist-slot} \*
449 {\tt %symbol-name-slot} \*
450 {\tt %symbol-package-slot}
451 }\\The property list, print name and package for this symbol.
456 \section{Sub-primitives}
458 The {\tt alloc-symbol} sub-primitive allocates a new symbol object. {\it name} is
459 the simple-string that is to be the name of the symbol.
461 alloc-symbol (name) => symbol
464 The {\tt set-symbol-package} sub-primitive is used by system code that must set a symbol's package slot.
467 set-symbol-package (symbol new-value)
472 \section{Accessor VOPs}
474 These VOPs read the global symbol value and definition cells. {\tt constant-ref}
475 may only be used on symbols that have been defined to be constants. Since a
476 constant cannot change in value and cannot be dynamically bound, the compiler
477 may be able to compile uses of {\tt constant-ref} more efficiently. Unsafe
478 versions of these VOPs may not check for the slot being unbound, which the
479 corresponding \clisp functions are required to do.
481 {small, fast} symbol-value (symbol) => value
482 {small, fast} constant-ref (symbol) => value
483 {small, fast} symbol-function (symbol) => value
486 These VOPs set the global symbol value and definition cells. {\tt makunbound}
487 and {\tt fmakunbound} are implemented by setting the value to the unbound marker.
489 {small, fast} set-symbol-value (symbol new-value)
490 {small, fast} set-symbol-function (symbol new-value)
493 The \clisp accessors for other symbol slots are translated into uses of the
494 {\tt slot-ref} and {\tt slot-set} VOPs.
498 \section{Special Binding}
500 These VOPs implement dynamic binding of special variables using shallow
501 binding. {\tt bind} binds {\it symbol} to the specified {\it value}, while
502 {\tt unbind} undoes the most recent {\it count} special bindings on the binding stack.
510 \section{Property Lists}
512 The {\tt get} VOP implements the corresponding \clisp function, while {\tt put}
513 implements its setf-inverse.
515 get (symbol indicator default) => value
516 put (symbol indicator value)
525 list<n> (elt0 ... elt<n-1>) => list
526 list (elt0 ... elt<n-1> more-elts) => list
527 For some small N, we have fixed-arg versions of List. For larger lists, we
528 pass in additional elements in a stack TN (possibly required to be on stack
529 top). List* is similar.
532 These VOPs implement the corresponding \clisp functions:
534 {small, fast} car (list) => value
535 {small, fast} cdr (list) => value
538 These VOPs set the car or cdr of a cons:
540 {small, fast} set-car (cons new-value)
541 {small, fast} set-cdr (cons new-value)
544 These VOPs implement the \clisp {\tt assoc} and {\tt member} functions with test
545 functions of {\tt eql} and {\tt eq}:
547 assoc (item alist) => cons-or-nil
548 assq (item alist) => cons-or-nil
549 member (item list) => cons-or-nil
550 memq (item list) => cons-or-nil
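In portable terms (a sketch, using the hypothetical names {\tt my-assq} and {\tt my-memq}), these VOPs behave like the \clisp functions with an explicit {\tt :test}:

```lisp
;;; ASSQ and MEMQ are the EQ-test variants of ASSOC and MEMBER.
;;; MY-ASSQ and MY-MEMQ are hypothetical names for illustration.
(defun my-assq (item alist)
  (assoc item alist :test #'eq))

(defun my-memq (item list)
  (member item list :test #'eq))

;; (my-assq 'b '((a . 1) (b . 2))) => (B . 2)
;; (my-memq 'c '(a b c d))         => (C D)
```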
554 {\tt getf} implements the corresponding \clisp function, while {\tt putf} is used
555 to implement its setf-inverse. {\tt putf} returns the new value for the list so
556 that it may be stored back into the place.
558 getf (list indicator default) => value
559 putf (list indicator new-value) => list
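The store-back requirement can be sketched as follows; the {\tt %primitive} call here is purely illustrative:

```lisp
;;; Sketch of expanding (setf (getf plist ind) val) in terms of PUTF:
;;; PUTF may return a new list head, so the result must be stored back
;;; into the place.  The %PRIMITIVE call is illustrative only.
(defmacro setf-getf-sketch (place indicator value)
  `(setq ,place (%primitive putf ,place ,indicator ,value)))
```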
565 \index{Fixnum format}
566 Fixnum\\An N-bit two's complement integer.
568 \index{Short float format}
569 Short-Float\\An immediate float format.
571 \index{Bignum format}
573 Bignum\\Bignums are infinite-precision integers, represented somehow.
575 \index{Flonum format}
576 \index{Floating point formats}
577 Floats\\Floats are stored as consecutive words of bits.
580 Ratio\\Ratios are stored as two consecutive words of Lisp objects, which should both be integers.
583 \index{Complex number format}
584 Complex\\Complex numbers are stored as two consecutive words of Lisp objects,
585 which should both be numbers.
589 \section{Number VOPs}
592 {small, fast} integer-length/fixnum
601 {small, fast} decode-float/xxx-float
603 {small, fast} scale-float/xxx-float
606 {small, fast} if-=/fixnum
607 {small, fast} if-=/xxx-float
608 Do numeric comparison of X and Y. The codegen-info contains the
609 continuations to transfer to in the true and false cases. Same for <, >.
612 {small, fast} +/fixnum
613 {small, fast} +/fixnum=>integer
614 {small, fast} +/xxx-float
615 Same for -, *. Fixnum multiplication by a constant power of 2 (or near
616 power of 2) can be done by a transform.
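For instance, a transform could rewrite a multiplication by a constant power of 2 as a shift; this sketch shows the source-level effect:

```lisp
;;; A fixnum multiplication by a constant power of two needs no
;;; multiply VOP: (* x 8) can be transformed into a shift.
(defun times-8 (x)
  (ash x 3))

;; (times-8 5) => 40, the same as (* 5 8)
```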
619 {small, fast} //xxx-float
622 {small, fast} negate/fixnum
623 {small, fast} negate/fixnum=>integer
624 {small, fast} negate/xxx-float
627 truncate (x y) => q r
628 {small, fast} truncate/fixnum
631 {small, fast} logand/fixnum
632 Ditto for logior, logxor.
635 {small, fast} lognot/fixnum
638 {small, fast} ash/fixnum
639 {small, fast} ash-c/fixnum
645 These will only be used as a last resort. There should be transforms that
646 turn fixnum operations with constant byte-specifiers into standard logical operations.
650 \section{Number Sub-primitives}
664 \cl arrays can be represented in a few different ways in \rtccl --
665 different representations have different performance advantages. Simple
666 general vectors, simple vectors of integers, and simple strings are basic \rtccl
667 data types, and access to these structures is quicker than access to
668 non-simple (or ``complex'') arrays. However, all multi-dimensional arrays in
669 \rtccl are complex arrays, so references to these are always through an array header.
673 Once a vector has been allocated, it is possible to reduce its length by using
674 the Shrink-Vector sub-primitive, but never to increase its length, even back to
675 the original size, since the space freed by the reduction may have been reclaimed.
684 An array header is identical in form to a G-Vector. At present, the following
685 subtype codes are defined:
686 \begin{itemize, spread 0, spacing 1}
688 1 Array is displaced to another array (which may be simple).
690 The entries in the header-vector are interpreted as follows:
692 \index{Array header format}
694 0 Data Vector \\This is a pointer to the I-Vector, G-Vector, or string that
695 contains the actual data of the array. In a multi-dimensional array, the
696 supplied indices are converted into a single 1-D index which is used to access
697 the data vector in the usual way. If the array is displaced, then this is
698 the array displaced to, which may be an array header. In general, array
699 access must loop until it finds an actual data vector.
701 1 Number of Elements \\This is a fixnum indicating the number of elements for
702 which there is space in the data vector.
704 2 Fill Pointer \\This is a fixnum indicating how many elements of the data
705 vector are actually considered to be in use. Normally this is initialized to
706 the same value as the Number of Elements field, but in some array applications
707 it will be given a smaller value. Any access beyond the fill pointer is
710 3 Displacement \\This fixnum value is added to the final data-vector index
711 after the index arithmetic is done but before the access occurs. Used for
712 mapping a portion of one array into another. For most arrays, this is 0.
714 4 Range of First Index \\This is the number of index values along the first
715 dimension, or one greater than the largest legal value of this index (since the
716 arrays are always zero-based). A fixnum in the range 0 to $2^{24}-1$. If any
717 of the indices has a range of 0, the array is legal but will contain no data
718 and accesses to it will always be out of range. In a 0-dimension array, this
719 entry will not be present.
721 5 - N Ranges of Subsequent Dimensions
724 The number of dimensions of an array can be determined by looking at the length
725 of the array header. The rank will be this number minus 6. The maximum array
726 rank is 65535 - 6, or 65529.
728 The ranges of all indices are checked on every access, during the conversion to
729 a single data-vector index. In this conversion, each index is added to the
730 accumulating total, then the total is multiplied by the range of the following
731 dimension, the next index is added in, and so on. In other words, if the data
732 vector is scanned linearly, the last array index is the one that varies most
733 rapidly, then the index before it, and so on.
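The conversion described above can be sketched in Lisp; {\tt array-data-index} is a hypothetical illustration, not actual system code:

```lisp
;;; Convert multi-dimensional indices into a single data-vector index,
;;; range-checking each index as described above.  ARRAY-DATA-INDEX is
;;; a hypothetical illustration, not actual system code.
(defun array-data-index (indices ranges)
  (let ((total 0))
    (mapc #'(lambda (index range)
              (unless (and (<= 0 index) (< index range))
                (error "Index ~D is not in the range 0 to ~D." index (1- range)))
              (setq total (+ (* total range) index)))
          indices ranges)
    total))

;; For a 2x3 array, element [1][2] is at data-vector index 5:
;; (array-data-index '(1 2) '(2 3)) => 5
```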
743 Initialized and uninitialized versions?
746 length (sequence) => size
747 {small, fast} length/vector
748 {small, fast} length/simple-vector
749 {small, fast} length/simple-string
750 {small, fast} length/simple-bit-vector
752 aref1 (vector index) => value
753 {small, fast} aref1/simple-vector
754 {small, fast} aref1/simple-string
755 {small, fast} aref1/simple-bit-vector
756 {small, fast} aref1/simple-array-XXX-float
758 aset1 (vector index new-value)
759 {small, fast} aset1/simple-vector
760 {small, fast} aset1/simple-string
761 {small, fast} aset1/simple-bit-vector
762 {small, fast} aset1/simple-array-XXX-float
764 {small, fast} aref1/simple-array-unsigned-byte (vector index) => value
765 {small, fast} aset1/simple-array-unsigned-byte (vector index new-value)
766 Byte size is codegen info.
768 aref<N> (array index0 ... index<n-1>) => value
769 aset<N> (array index0 ... index<n-1> new-value)
770 For some small value of N. Of course, higher dimensional arrays can also
771 be specialized in seven different ways.... Multi-dimensional simple array
772 reference with known dimensions can be open-coded using a transform (useful
777 \section{Array Sub-primitives}
783 set-vector-access-code
789 header-length (header) => size
790 header-ref (header index) => value
791 header-set (header index new-value)
795 {reverse-}find-character
796 {reverse-}find-character-with-attribute
797 {reverse-}string-compare
799 sxhash-simple-substring
804 {small, fast} structure-ref (s) => value
805 {small, fast} structure-set (s new-value)
806 Read and write structure slots. Defstruct slot description is in codegen info.
812 \chapter{Runtime Environment}
813 \index{Runtime Environment}
817 \section{Register Allocation}
818 \index{Register allocation}
820 The main idea is to globally allocate only those registers with global
823 We permanently dedicate the CONT register to point to the current control stack
824 environment. This is the "frame pointer" in standard terminology. It isn't
825 possible to get pack to allocate this register on an as-needed basis due to the
826 classic phase-ordering problem. We need to know if TNs are allocated on the
827 stack before we can tell how badly we need a frame pointer register.
828 This is of little significance with the control stack environment, since we
829 almost always need one, and if there are any stack TNs, we must allocate the
830 frame pointer in a register, since there is nowhere else to put it. The
831 problem is more severe with a number stack environment pointer. We can't
832 dedicate a register to it, since we usually don't have any TNs on the number
833 stack. The only easy solution is to always allocate the number stack
834 environment pointer on the control stack. This really isn't too bad, when you
835 compare the cost of doing an extra memory reference to get at the number stack
836 to the cost of number-consing.
838 We also dedicate the ENV register to the current constant pool. It would be
839 possible to explicitly allocate the constant pointer as needed if we explicitly
840 represented non-immediate constant access by a VOP, but this would be extra
841 work, and there are major advantages to representing all constants using TNs.
842 Another potential efficiency advantage is that, since the same constant pool is
843 shared by all the code in a component, we need only initialize ENV on entry to
844 the component. When we make local calls, we don't have to do anything to make
845 the constants available to the callee.
847 Since the constant pool will also contain the code vector and the debug info,
848 having it in a known place may make life easier for GC and the debugger. We
849 may not be able to count on it too much, though, since ENV holds other things
850 while calls are in progress, and might be pretty random if we jumped into
856 CONT: the current control stack context.
857 PC is assumed to be accessible to the debugger when an error happens.
858 Current-Catch: pointer to the current catch frame. Format of frame is assumed.
859 Current-Unwind-Protect: current unwind protect frame. Similar to catch.
861 If shallow-bind, binding stack and binding stack pointer.
862 If deep-bind, current special binding. Format of binding frame assumed.
864 Everything depends on the current environment, which is CONT.
875 \section{Other Dynamic State}
877 There are some dynamic state variables that are stored in known memory
878 locations, rather than having a dedicated register:
880 binding stack pointer\\The current pointer to the top of the binding stack.
882 current catch\\The pointer to the current catch block.
884 current unwind-protect\\The pointer to the current unwind-protect block.
890 \section{Control-Stack Format}
891 \label{Control-Stack-Format}
892 \index{Control-stack format}
895 The control stack contains only Lisp objects. Every object pointed to by an
896 entry on this stack is kept alive.
898 The \rtccl control stack does not have a rigid frame structure. The compiler
899 is allowed a large amount of freedom in the use of the stack so that it can choose
900 the best calling sequences. Mostly the compiler is the only system that cares
901 how the stack is laid out, so this isn't a big problem. See chapter
902 \ref{debug-info} for a description of the structures which allow the debugger
907 \section{Values Passing Conventions}
910 The first {\it nregs} arguments are passed in registers, where nregs is an
911 implementation dependent constant. Any additional arguments are passed in the
912 block of storage between CONT and CS on the control stack. The first nregs locations in
913 this block of storage are unused so that register more-args can be stored on
914 the stack without having to BLT the stack values up.
916 Unknown return values are passed in a similar way, but the stack values
917 block is between OLD-CONT and CS. There isn't anything underneath the values: on
918 return OLD-CONT is always what CS was when the function was called. The
919 function returned to must copy the values into the desired location in its
920 frame and deallocate excess stuff on the top of the stack.
922 More args are represented by a pointer to the block of values and a count. The
923 function that originally created the more arg must allocate and deallocate this
924 stuff somehow. In the case of a local call to a more arg entry, we can just
925 allocate it as a TN. The external entry point for a more arg entry is more
The caller allocates the environment for the called function, stores the
arguments into it, and jumps to the function. The caller makes the called
environment current, passing in the return OLD-CONT and PC as explicit
arguments.

When returning values, the returner stores the return values directly into the
frame being returned to. This works even though the caller doesn't know what
function it is returning to, since the same return locations are allocated in
In a tail-recursive call, we can destructively modify the current frame and
jump right to the callee, rather than allocating a new frame. We can do this
because TNBind globally allocates frame locations; all frames are the same size
and have the same TNs in the same place.
\section{Binding-Stack Format}
\index{Binding stack format}
\comment<In a symbol chapter?>

The special binding stack is used to hold the previous values of special
variables that have been bound. It grows and shrinks with the depth of the
binding environment, as reflected in the control stack. This stack contains
symbol-value pairs, with only boxed Lisp objects present.
Each entry of the binding stack consists of two boxed (32-bit) words. Pushed
first is a pointer to the symbol being bound. Pushed second is the symbol's
old value (any boxed item), which is to be restored when the binding stack is
Function calling is a way of life.

Every function is a closure. A pointer to the current closure is passed in
ENV, unless it isn't (in local call it may be elsewhere).

The description of the representation of functions and the function calling
conventions is a large part of the VM description, since:

  Function calling is one of the most complicated facilities provided by the

  Everything that happens, happens in a function, so all parts of the system
  tend to get dragged in.

Aspects of function call:

  Environment: CONT, ENV
  Argument/value passing
  Argument/value count dispatching
\section{Function Object Format}

The old notion of a "function object" is now broken down into four different
parts:

Function entry\\A function entry is a structure that holds the information
that we need to call a function. This is the user-visible function object.

Environment\\The environment is the stuff that a function needs when it runs.
This includes constants computed at load time and variables closed over at run
time. Environment information may be allocated in the function entry structure
after the required linkage information.

Entry information\\This is information about a specific function entry that is
occasionally referenced at run time, but need not be immediately accessible.
Entry information will be allocated either in the function entry
or in the environment that it points to.

Debug information\\This is information about a function that isn't normally
needed at run time. Debug information can be found by poking around in
environment objects.

See chapter \ref{control-conventions} for a description of how function objects
\section{Environment Object Sub-primitives}

\subsection{Debug Information Location}

If present, debug information is stored immediately following any fixed
information in the environment object. It may be necessary to chain up
multiple levels of environments to find the debug information. The debug
information can be recognized because it is represented by a defstruct
structure. See chapter \ref{debug-info} for a description of the debug
\section{Function Calls}
\index{function call}

\ccl supports three major calling conventions. The convention used
depends on the amount of information available at compile time:

Local\\Local call is used when the call and the called function are
compiled at the same time. Using the term "convention" to describe this
call mechanism is somewhat of a misnomer, since the compiler can do

Named\\Named call is used when the call is to a global function whose name
is known at compile time.

Anonymous\\Anonymous call is used when the function called is unknown until
Environment manipulation code is always emitted at the location of the Bind or
Return node for a lambda.

Implicit args to functions in IR2:
  old-cont: CONT to restore on return
  return-pc: PC to return to
  env: pointer to current closure (if heap)
  closure<n>: closed values for current closure (if stack)

Other info needed for IR2 conversion of functions:
  base pointers for all heap closures consed by this function
  passing locs for each explicit arg
  return strategy (known or unknown) and return locs

All arguments, including implicit ones, must have both a passing TN and a
permanent TN. Passing locs for Let calls can be the actual TN that holds the
variable in the case of local variables. Set closure variables must still have
a separate passing TN.

If we know the value counts for the argument continuations, then we compile
local mv-calls by moving the TNs for the values continuations into the argument
passing locations. Other mv-calls must be compiled using various hairy
stack-hacking VOPs and unknown-argument-count call VOPs.
For now, we create the callee's frame just before the call, instead of
creating it before the evaluation of the first argument. If we created the
environment early, then we would be able to move the argument values directly
into the frame, instead of having to store them somewhere else for a while.
The problem with early creation is that lifetime analysis gets confused when
there is more than one instance of the same TN present simultaneously, as
happens with nested calls to the same function.

It turns out that there isn't a problem with a simple self-call, because the TN
in the called frame is really the "same" TN as the one in the current frame,
due to the restricted way in which we use the passing TNs.
We emit code for external entry points during IR2 conversion. The external
entry point is the place where we start running in a full call from a
function-entry. It does argument count checking and dispatching, moves the
arguments into the passing locations for the lambda being called, and
calls the lambda, moving the results into the standard locations if they
aren't there already.
In IR2, the environment manipulation semantics of function call are decoupled
from the control semantics. When allocating closure variables for a Let, it is
possible to do the environment manipulation with only the normal sequential
control flow. In the case of a Let call with the same environment, we neither
manipulate the environment nor transfer control; we merely initialize the
variables with Move VOPs.

If a local function returns a known number of values that is less than the
number expected by the caller, then additional code must be inserted at the
return site to set the unused values to NIL.
The full function call mechanism must effectively be a subset of the local call
mechanism, since the two mechanisms must mesh at entry points and full function
calls. A full call turns into some kind of full call VOP. There are different
VOPs for calling named functions and closures. We also have tail-recursive
full call VOPs. Arguments are set up using Move VOPs, just as for local call.
The only difference is that the passing locations and conventions are
restricted to the standard ones.

The gory details of argument count checking and dispatching are buried in the
Function-Entry VOP, which takes a functional and a list of continuations, one
pointing to each external entry.
\subsection{Local Call}

Named and anonymous call are called full calls, to distinguish them from
local call. When making full calls, the compiler must make many worst-case
assumptions that aren't necessary in a local call. The advantage of local
call is that the compiler can choose to use only those parts of the full
call sequence that are actually necessary.

In local call, we always know the function being called, so we never have
to do argument count checking, and we can always use an immediate branch for
the control transfer. If the function doesn't return to more than one
place, then we can just use a simple branch, or even drop through.

The argument passing TNs may be allocated anywhere. The caller allocates the
stack frame for the called function, moving any non-register arguments into the
passing locations in the callee's frame.

If we are calling a local function that doesn't always return the same
number of values, then we must use the same values-returning mechanism that
is used in full call, but we don't have to use the standard registers.

A tail-recursive local call doesn't require any call VOP. We just use Move
VOPs to put the arguments into the passing locations and then jump to the
start of the code for the function. We don't have to do any stack hackery,
since we use the same stack frame format for all the functions compiled at the
same time. In many cases tail-recursive local calls can be entirely optimized
away, since they involve only some moves and a branch. We preference the
argument values to the passing locations of the called function, making it
likely that no move will be necessary. Often the control transfer can be done
by simply dropping through.
We have to do some funny stuff with local calls in order to get the lifetimes
for the passing locations right, since lifetime analysis skips directly from
the return point to the call point, ignoring the uses of the passing locations
in the called function. Similarly, we pretend that a block ending in a return
call-local (arg*) "fun" => value
multiple-call-local (arg*) "fun" => start end val0 ... val<n>
    Call-Local is used for calls to local functions that are forced to use the
    unknown-values passing convention. Value is the first return-value
    register; we don't really do anything to it, but we specify it as a result
    to represent the assignment done by the called function.

    Multiple-Call-Local is similar, but specifies all the values used by the
    unknown-values convention. Default-Values may be used to receive a
    specific number of values.

known-call-local (arg*) "fun" => value*
    This VOP is used for local calls to functions where we can determine at
    compile time that the number of values returned is always the same. In
    this case, we don't need to indicate the number of values, and we can pass
    them in separate TNs. The Values are the actual return locations. We
    don't really do anything to the return values; we just specify them as
    results to represent the assignment done by the called function.

known-return (return-pc value*) "fun"
    This VOP is used for returning from local calls using the known-values
    return convention. The specified return Values are moved into the passing
    locations in the caller's frame.
If we know that the function we are calling is non-recursive, then we can
compile it much like a tail-recursive call. We must have a call VOP to compute
the return PC, but we don't need to allocate a frame or save registers. We
just set up the arguments in the frame and do the call.

We require simple functions to use the known-values convention. It would be
possible to support unknown values, but it would potentially require BLT'ing
return values out of the frame and onto the top of the stack. Supporting
unknown values would also require a bunch more VOPs, since we need different
call and return VOPs for simple call.

Known-values return causes no problems, since the callee knows how many values
are wanted. We store the values directly into the current frame, since it is
also the caller's frame.

known-call-simple () "fun" => return-pc
known-return-simple (return-pc) "fun"
    Similar to the non-simple VOPs, but these don't allocate or deallocate
    frames, and they assume that argument and value passing is done with
    explicit Move VOPs.
\subsection{Full Call}

Both named and anonymous call are optimized for calls where the number of
arguments is known at compile time. Unknown-argument calls are a
pathological case of anonymous call; this case will be ignored in the main
discussion. The difference between named and anonymous calls is in the
argument count dispatching mechanism.

Named call allows an arbitrary number of entry points, with start PCs at
arbitrary locations in the code vector. The link-table mechanism described
below allows named calls to jump directly to the actual entry point without any
run-time argument count or type checking.

Anonymous call has a fixed number of entry points, with start PCs at fixed
locations in the code vector. This allows calls to be made without knowing
what function is being called, but it has more run-time overhead. The object
called must be checked to be a valid function-entry object, the entry PC must
be computed from the function entry, and argument count checking must be done
if there are more than three required or optional arguments.

Argument passing in full call is conceptually similar to local call, but the
caller can't allocate the entire frame for the callee, since it doesn't know
how much stack is needed. Instead we allocate the frame in two parts. The
caller allocates only the beginning of the frame, which contains the stack
arguments in fixed locations. We leave the first <n> locations unused so that
the called function can move register more-args onto the stack without having
to BLT down any stack arguments.
The place in the code where a full call jumps in is called an external entry
point. The external entry point allocates the rest of the stack frame and then
does a local call to the actual entry-point function, fetching the arguments
from the standard passing locations. Usually we can do a tail-recursive local
There are two main cases where the call from the external entry point cannot be

-- It is desirable to use the known-values convention for calling the
   entry-point function if the entry point is used in other local calls
   (perhaps because of recursion). In this case, the called function stores
   the return values back into the frame allocated by the external entry point
   and then returns back to it. The external entry point must then return
   these values using the standard unknown-values convention.
-- In a more-arg entry point we don't know how many stack arguments there are
   at the beginning of the frame, so we can't really use the frame allocated
   by the external entry point at all. Instead we do a local call to the
   more-arg entry point, passing in a pointer to the first extra value. When
   the function returns, we deallocate the crap on the stack and then return
   the values. It is still o.k. to use the known-values return convention
   from the more-arg entry, since the extra arg values are no longer needed by
   the time the returning function stores the return values back into the
   external entry point frame.
In full call we must always use the unknown-values convention for return. The
first <n> values are passed in the standard argument registers. The Old-Cont
register holds the Start of the values block, and SP points to the End.
{small, fast} call (function arg0 ... arg<n>) "nargs" => value
{small, fast} call-named (arg0 ... arg<n>) "nargs" "name" => value
    Call calls Function with the specified register arguments,
    returning the first value as the result. "nargs" is the total number of
    arguments passed. Only the register arguments actually passed should be
    specified as operands.

    Call-Named is similar, but calls a global function specified at compile

{small, fast} tail-call (function pc arg0 ... arg<n>) "nargs"
{small, fast} tail-call-named (pc arg0 ... arg<n>) "nargs" "name"
    Similar to the standard call VOPs, but these pass PC as the return PC,
    rather than returning to the call site. These VOPs have no results since they

{small, fast} multiple-call (function arg0 ... arg<n>) "nargs"
    => start end val0 ... val<n>
{small, fast} multiple-call-named (arg0 ... arg<n>) "nargs" "name"
    => start end val0 ... val<n>
    These VOPs are similar to the standard call VOPs, but allow any number of
    values to be received by returning all the value passing registers as
    results. A specific number of values may be received by using

call-unknown (function count arg0 ... arg<n>) => start end val0 ... val<n>
tail-call-unknown (function pc count arg0 ... arg<n>)
    Call a function with an unknown number of arguments. Used for apply and
    hairy multiple-value-call.

Function-Entry () "function" => env return-pc old-cont arg*
    This marks the place where we jump into a component for an external
    entry point. It represents whatever magic is necessary to do argument
    count checking and dispatching. The external entry points for each
    argument count will be successors of the entry-vector block (might be in
    the same block if there is only one?)

    Function-Entry also represents argument passing by specifying the actual
    external passing locations as results, thus marking the beginning of their
    lifetimes. All passing locations actually used by any entry point are
    specified as Args, including stack arguments.
{\#\#\# Do we really need this? If we do, then we probably also need similar
entry markers for local functions. The lifetimes don't really need to be
explicitly bounded, since an entry point is effectively "the end of the
\section{Returning from a Function Call}

return (return-pc value)
multiple-return (return-pc start end val0 ... val<n>)
    Return Value from the current function, jumping back to the location
    specified by Return-PC. {Perhaps allow to return any fixed, known number

    Multiple-Return is similar, but allows an arbitrary number of values to be
    returned. End - Start is the total number of values returned. Start
    points to the beginning of the block of return values, but the first <n>
    values val0 ... val<n> are actually returned in registers.

default-values (start end val0 ... val<n>) => val0 ... val<j>
    This VOP is used when we want to receive exactly J values. If fewer than J
    values were supplied, then the missing values are defaulted to NIL. As a
    side effect, this VOP pops off any returned stack values.
\section{Saving and Restoring Registers}

We use a caller-saves convention. The caller explicitly emits saving and
restoring code. Tail-recursive calls don't need any register saving, since we
never come back.
\chapter{Non-local exits}

\subsection{Unwind Blocks}
\index{Catch frames}

There is one aspect of the control stack format that is fixed and that
concerns us at this level: the format of the "frames" which mark the
destination of non-local exits, such as for BLOCK and CATCH. These frames are
collectively known as unwind blocks. The basic unwind block is used for
lexical exits such as BLOCK, and for UNWIND-PROTECT. Its format is the
following:

  0  Pointer to current unwind-protect.
  1  Control stack context to restore on entry.

The unwind block for CATCH is identical except for additional cells
containing the catch tag and the previous catch:

  0  Pointer to current unwind-protect.
  1  Control stack context to restore on entry.

The conventions used to manipulate unwind blocks are described in chapter
\ref{Control-Conventions}.
\section{Non-Local Exits}
\index{Unwind-Protect}
\index{Non-Local Exits}

In the normal flow of control, each function that is called executes until it
reaches a return point; under these conditions no special effort is needed to
restore the environment, as long as each function undoes any change that it
makes to the dynamic state before it returns. When we make a non-local
transfer, we skip a potentially arbitrary collection of these cleanup actions.
Since we cannot in general know what changes have been made to the dynamic
environment below us on the stack, we must restore a snapshot of the dynamic
environment at the re-entry point.
We represent the closed continuation by the pointer to the unwind block for the
re-entry point. At the exit point, we just pass this stack pointer to the
Unwind VOP, which deals with processing any unwind-protects. When Unwind is
done, it grabs the re-entry PC out of the location at the stack pointer and

Catch and Unwind-Protect work in pretty much the same way. We make a stack TN
to hold the catch frame or whatever, allocate TNs in it to represent the
slots, and then initialize them. The frame can be explicitly linked in by TN
manipulations, since the active catch and whatnot are represented by TNs.
Since allocation of the frame is decoupled from linking and unlinking, some of
this stuff could be moved out of loops. We will need a VOP for loading the PC
for an arbitrary continuation so that we can set up the re-entry PC. This can
be done using the Call VOP. Using a call instruction is probably a good way to
get a PC on most architectures anyway.

These TNs are allocated by Pack like any others; we use special alloc and
dealloc VOPs to delimit the aggregate lifetimes.
In the non-local case, the Block, Catch and Unwind-Protect special forms
are implemented using unwind blocks. The unwind blocks are built by move
operations emitted inline by the compiler. The compiler adds and removes
catches and unwind-protects by explicit moves to the locations that hold the
current catch and unwind-protect blocks. The entry PC is loaded using the Call

The Unwind miscop is the basis of non-local exits. It takes the address of an
unwind block and processes unwind-protects until the current unwind-protect is
the one recorded in the unwind block, then jumps in at the entry in the unwind
block. The entry for the unwind block is responsible for restoring any state
other than the current unwind-protect.
Unwind is used directly to implement non-local Return-From. The address of the
unwind block is stored in a closure variable.

Catch just does a scan up the chain of catch blocks, starting at the current
catch. When it finds the right one, it calls Unwind on it.

Unwind-protects are represented by unwind blocks linked into the current
unwind-protect chain. The cleanup code is entered just like any other
unwind entry. As before, the entry is responsible for establishing the
correct dynamic environment for the cleanup code. The target unwind block is
passed in some non-argument register. When the cleanup code is done, it
just calls Unwind with the block passed in. The cleanup code must be careful
not to trash the argument registers or CS, since there may be multiple values
With Catch/Throw, we always use the unknown-values return convention,
since we don't know how many values the catch wants. With Block/Return-From,
we can do whatever we want, since the returner and receiver know each other.

If a Block or Catch receives stack values, it must call a VOP that BLT's the
values down the stack, squeezing out any intermediate crud.
Unwind does a non-local exit, unwinding to the place indicated by Context.
Context is a pointer to a block of storage allocated on the control stack,
containing the entry PC, current environment and current unwind-protect.
We scan up the stack, processing unwind-protects until we reach the entry
point. The values being returned are passed in the standard locations.
Throw is similar, but does a dynamic lookup on the Tag to determine what
context to unwind to.