+++ /dev/null
-% -*- Dictionary: design; Package: C -*-
-
-It may be worth having a byte-code representation for interpreted code. This way,
-an entire system could be compiled into byte-code for debugging (the
-"check-out" compiler?).
-
-Given our current inclination for using a stack machine to interpret IR1, it
-would be straightforward to layer a byte-code interpreter on top of this.
-
-
-Interpreter:
-
-Instead of having no interpreter, or a more-or-less conventional interpreter,
-or a byte-code interpreter, how about directly executing IR1?
-
-We run through the IR1 passes, possibly skipping optional ones, until we get
-through environment analysis. Then we run a post-pass that annotates IR1 with
-information about where values are kept, i.e. the stack slot.
-
-We can lazily convert functions by having FUNCTION make an interpreted function
-object that holds the code (really a closure over the interpreter). The first
-time that we try to call the function, we do the conversion and processing.
-Also, we can easily keep track of which interpreted functions we have expanded
-macros in, so that macro redefinition automatically invalidates the old
-expansion, causing lazy reconversion.
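-
-[A rough sketch of the lazy conversion idea, not the actual implementation.
-LAZY-FUNCTION, CONVERT-FORM, CALL-LAZY and INVALIDATE are hypothetical names,
-and CONVERT-FORM merely coerces the lambda expression to a function as a
-stand-in for running the IR1 passes and the annotation post-pass.]
-
- (defstruct lazy-function
-   form               ; the original lambda expression
-   (converted nil))   ; cached result of conversion, or NIL if stale
-
- (defun convert-form (form)
-   ;; Stand-in for IR1 conversion and annotation.
-   (coerce form 'function))
-
- (defun call-lazy (fn &rest args)
-   ;; Convert on first call (or first call after invalidation).
-   (unless (lazy-function-converted fn)
-     (setf (lazy-function-converted fn)
-           (convert-form (lazy-function-form fn))))
-   (apply (lazy-function-converted fn) args))
-
- (defun invalidate (fn)
-   ;; Called when a macro expanded in FORM is redefined.
-   (setf (lazy-function-converted fn) nil))
-
-With these definitions,
- (call-lazy (make-lazy-function :form '(lambda (x) (* x 2))) 21)
-converts the form on that first call and returns 42.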
-
-Probably the interpreter will want to represent MVs by a recognizable structure
-that is always heap-allocated. This way, we can punt the stack issues involved
-in trying to spread MVs. So a continuation value can always be kept in a
-single cell.
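-
-[One way this might look, as a sketch only. MV-HOLDER, CAPTURE-VALUES and
-DELIVER are hypothetical names. A single value is stored directly; anything
-else is wrapped in a heap-allocated structure, so the result always fits in
-one cell.]
-
- (defstruct (mv-holder (:constructor make-mv-holder (vals)))
-   vals)              ; list of the values being carried
-
- (defun capture-values (thunk)
-   ;; Call THUNK and collapse whatever it returns into one object.
-   (let ((vals (multiple-value-list (funcall thunk))))
-     (if (= (length vals) 1)
-         (first vals)
-         (make-mv-holder vals))))
-
- (defun deliver (cell)
-   ;; Spread the values again only at the point of use.
-   (if (mv-holder-p cell)
-       (values-list (mv-holder-vals cell))
-       cell))
-
-For example, (deliver (capture-values (lambda () (floor 7 2)))) returns the
-two values 3 and 1.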
-
-The compiler can have some special frobs for making the interpreter efficient,
-such as a call operation that extracts arguments from the stack
-slots designated by a continuation list. Perhaps
- (values-mapcar fun . lists)
-<==>
- (values-list (mapcar fun . lists))
-This would be used with MV-CALL.
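-
-[Taking the equivalence above literally, a portable definition could be as
-simple as the following. This is only a sketch of the semantics; a real
-version would presumably avoid consing the intermediate list. As with MAPCAR,
-at least one list must be supplied.]
-
- (defun values-mapcar (fun &rest lists)
-   ;; Return the results of mapping FUN over LISTS as multiple values.
-   (values-list (apply #'mapcar fun lists)))
-
-For example, (multiple-value-call #'+ (values-mapcar #'1+ '(1 2 3)))
-returns 9.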
-
-
-This scheme seems to provide nearly all of the advantages of both the compiler
-and conventional interpretation. The only significant disadvantage with
-respect to a conventional interpreter is that there is the one-time overhead of
-conversion, but doing this lazily should make this quite acceptable.
-
-With respect to a conventional interpreter, we have major advantages:
- + Full syntax checking: safety comparable to compiled code.
- + Semantics similar to compiled code due to code sharing. Similar diagnostic
- messages, etc. Reduction of error-prone code duplication.
- + Potential for full type checking according to declarations (would require
- running IR1 optimize?)
- + Simplifies debugger interface, since interpreted code can look more like
- compiled code: source paths, edit definition, etc.
-
-For all non-run-time symbol annotations (anything other than SYMBOL-FUNCTION
-and SYMBOL-VALUE), we use the compiler's global database. MACRO-FUNCTION will
-use INFO, rather than vice-versa.
-
-When doing the IR1 phases for the interpreter, we probably want to suppress
-optimizations that change user-visible function calls:
- -- Don't do local call conversion of any named functions (even lexical ones).
- This is so that a call will appear on the stack that looks like the call in
- the original source. The keyword and optional argument transformations
- done by local call mangle things quite a bit. Also, note that skipping
- local-call conversion prevents unreferenced arguments from being deleted,
- which is another non-obvious transformation.
- -- Don't run source-transforms, IR1 transforms and IR1 optimizers. This way,
- TRACE and BACKTRACE will show calls with the original arguments, rather
- than the "optimized" form, etc. Also, for the interpreter it will
- actually be faster to call the original function (which is compiled) than
- to "inline expand" it. Also, this allows implementation-dependent
- transforms to expand into %PRIMITIVE uses.
-
-There are some problems with stepping, due to our non-syntactic IR1
-representation. The source path information is the key that makes this
-conceivable. We can skip over the stepping of a subform by quietly evaluating
-nodes whose source path lies within the form being skipped.
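-
-[A sketch of the "lies within" test, under the assumption (not stated above)
-that a source path is a list of subform indices ordered outermost first.
-PATH-WITHIN-P is a hypothetical name. With that representation, a node lies
-within the skipped form exactly when the skipped form's path is a prefix of
-the node's path.]
-
- (defun path-within-p (path skipped)
-   ;; True if SKIPPED is a prefix of (or equal to) PATH.
-   (let ((m (mismatch skipped path)))
-     (or (null m)                    ; the two paths are identical
-         (= m (length skipped)))))   ; SKIPPED ran out first: a prefix
-
-So (path-within-p '(2 1 0) '(2 1)) is true, while
-(path-within-p '(2) '(2 1)) and (path-within-p '(3 1) '(2 1)) are false.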
-
-One problem is determining what value has been returned by a form. With a
-function call, it is theoretically possible to precisely determine this, since
-if we complete evaluation of the arguments, then we arrive at the Combination
-node whose value is synonymous with the value of the form. We can even detect
-this case, since the Node-Source will be EQ to the form. And we can also
-detect when we unwind out of the evaluation, since we will leave the form
-without having ever reached this node.
-
-But with macros and special-forms, there is no node whose value is the value of
-the form, and no node whose source is the macro call or special form. We can
-still detect when we leave the form, but we can't be sure whether this was a
-normal evaluation result or an explicit RETURN-FROM.
-
-But does this really matter? It seems that we can print the value returned (if
-any), then just print the next form to step. In the rare case where we did
-unwind, the user should be able to figure it out.
-
-[We can look at this as a side-effect of CPS: there isn't any difference
-between a "normal" return and a non-local one.]
-
-[Note that in any control transfer (normal or otherwise), the stepper may need
-to unwind out of an arbitrary number of levels of stepping. This is because a
-form in a TR position may yield its value to a node arbitrarily far out.]
-
-Another problem is with deciding what form is being stepped. When we start
-evaluating a node, we dive into code that is nested somewhere down inside that
-form. So we actually have to do a loop of asking questions before we do any
-evaluation. But what do we ask about?
-
-If we ask about the outermost enclosing form that is a subform of the last
-form that the user said to execute, then we might offer a form that isn't
-really evaluated, such as a LET binding list.
-
-But once again, is this really a problem? It is certainly different from a
-conventional stepper, but a pretty good argument could be made that it is
-superior. Haven't you ever wanted to skip the evaluation of all the
-LET bindings, but not the body? Wouldn't it be useful to be able to skip the
-DO step forms?
-
-All of this assumes that nobody ever wants to step through the guts of a
-macroexpansion. This seems reasonable, since steppers are for weenies, and
-weenies don't define macros (hence don't debug them). But there are probably
-some weenies who don't know that they shouldn't be writing macros.
-
-We could handle this by finding the "source paths" in the expansion of each
-macro by sticking some special frob in the source path marking the place where
-the expansion happened. When we hit code again that is in the source, then we
-revert to the normal source path. Something along these lines might be a good
-idea anyway (for compiler error messages, for example).
-
-The source path hack isn't guaranteed to work quite so well in generated code,
-though, since macros return stuff that isn't freshly consed. But we could
-probably arrange to win as long as any given expansion doesn't return two EQ
-forms.
-
-It might be nice to have a command that skipped stepping of the form, but
-printed the results of each outermost enclosed evaluated subform, i.e. if you
-used this on the DO step-list, it would print the result of each new-value
-form. I think this is implementable. I guess what you would do is print each
-value delivered to a DEST whose source form is the current or an enclosing
-form. Along with the value, you would print the source form for the node that
-is computing the value.
-
-The stepper can also have a "back" command that "unskips" or "unsteps". This
-would allow the evaluation of forms that are pure (modulo lexical variable
-setting) to be undone. This is useful, since in stepping it is common that you
-skip a form that you shouldn't have, or get confused and want to restart at
-some earlier point.
-
-What we would do is save, on the heap, the current node and the values of all
-local variables before doing each step or skip action. We can then back up
-the state of all lexical variables and the "program counter". To make this
-work right with set closure variables, we would copy the cell's value, rather
-than the value cell itself.
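-
-[A sketch of the save and restore steps, assuming closure variables live in
-a one-slot structure. CELL, SNAPSHOT and RESTORE are hypothetical names. The
-snapshot pairs each cell with a copy of its current value, so the cell
-objects shared with closures keep their identity when we back up.]
-
- (defstruct cell value)
-
- (defun snapshot (node cells)
-   ;; Remember NODE and the current value of each cell in CELLS.
-   (cons node
-         (mapcar (lambda (c) (cons c (cell-value c))) cells)))
-
- (defun restore (snap)
-   ;; Put the saved values back; return the node to resume at.
-   (loop for (c . v) in (cdr snap)
-         do (setf (cell-value c) v))
-   (car snap))
-
-Note that this undoes only assignments to the variables themselves; side
-effects on objects the variables point to are not backed up.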
-
-[To be fair, note that this could easily be done with our current interpreter:
-the stepper could copy the environment alists.]
-
-We can't back up the "program counter" when a control transfer leaves the
-current function, since this state is implicitly represented in the
-interpreter's state, and is discarded when we exit. We probably want to ask
-for confirmation before leaving the function to give users a chance to "unskip"
-the forms in a TR position.
-
-Another question is whether the conventional stepper is really a good thing to
-imitate... How about an editor-based mouse-driven interface? Instead of
-"skipping" and "stepping", you would just designate the next form that you
-wanted to stop at. Instead of displaying return values, you replace the source
-text with the printed representation of the value.
-
-It would show the "program counter" by highlighting the *innermost* form that
-we are about to evaluate, i.e. the source form for the node that we are stopped
-at. It would probably also be useful to display the start of the form that was
-used to designate the next stopping point, although I guess this could be
-implied by the mouse position.
-
-
-Such an interface would be a little harder to implement than a dumb stepper,
-but it would be much easier to use. [It would be impossible for an evalhook
-stepper to do this.]
-
-
-%PRIMITIVE usage:
-
-Note: %PRIMITIVE can only be used in compiled code. It is a trapdoor into the
-compiler, not a general syntax for accessing "sub-primitives". Its main use
-is in implementation-dependent compiler transforms. It saves us the effort of
-defining a "phony function" (that is not really defined), and also allows
-direct communication with the code generator through codegen-info arguments.
-
-Some primitives may be exported from the VM so that %PRIMITIVE can be used to
-make it explicit that an escape routine or interpreter stub is assuming an
-operation is implemented by the compiler.