From: William Harold Newman
Date: Thu, 9 Aug 2001 13:41:40 +0000 (+0000)
Subject: 0.pre7.7:
X-Git-Url: http://repo.macrolet.net/gitweb/?a=commitdiff_plain;h=d3ad760954643743cf83f06283ef5afcc1ed388b;p=sbcl.git

0.pre7.7:
	deleted old CMU CL documentation (so now when I screw up my
	working copy and have to "cvs checkout" again, it goes faster:-)
---

diff --git a/doc/README b/doc/README
index 7024f2f..bfb2f08 100644
--- a/doc/README
+++ b/doc/README
@@ -1,8 +1,12 @@
-SBCL is -- ahem! -- not particularly well documented at this point.
+SBCL is -- ahem! -- not completely documented at this point.
 What can I say? Help with documentation might not be refused.:-)
 
-The old CMUCL documentation, in the cmucl/ subdirectory, is still
-somewhat useful. The old user's manual is very useful. Most of the
-CMUCL extensions to Common Lisp have gone away, but the general
-information about how to use the Python compiler is still very
-relevant.
+There is a user manual in DocBook format, in user-manual.sgml.
+It's based on the CMU CL user manual, and some of its chapters
+aren't done, just notes that "this is similar to chapter such-and-such
+of the CMU CL user manual".
+
+The old CMU CL documentation can still be useful both for
+missing chapters of the user manual and for documentation of
+the internals of the system. It's available from SourceForge
+by anonymous ftp.
diff --git a/doc/cmucl/cmu-user/cmu-user.dict b/doc/cmucl/cmu-user/cmu-user.dict deleted file mode 100644 index ce86160..0000000 --- a/doc/cmucl/cmu-user/cmu-user.dict +++ /dev/null @@ -1,460 +0,0 @@ -'BAR -VARREF -'TEST -UPCASE -ENDLISP -SUBSEQ -ENDDEFUN -FUNARGS -GENSYM -VARS -UNINTERNED -VAR -VSOURCE -CLISP -COND -MYSTUFF -TRADEOFFS -PATHNAME -LLISP -CMUCL -REF -YETMOREKEYS -CLEANUP -ARGS -DEFUN -ZOQ -FOO -'S -CLTL -MACROEXPANDS -MACROEXPANSION -PROXY -ERRORFUL -EQ -ECASE -PYTHON -DEFMACRO -PROMISCUOUS -FLAMAGE -DEBUGGABILITY -FEATUREFULNESS -DEBUGGABLE -ENDDEFVAR -MACROEXPANDED -DEFVAR -ENDDEFMAC -KWD -MGROUP -MSTAR -DEFMAC -OFFS -NOTINLINE -TRADEOFF -FUNCALL -SOMEVAL -SOMEFUN -CM -DEFTYPE -CONSING -FIXNUMS -BIGNUMS -FROB -'FOO -RECOMPILES -FTYPE -TYPECASE -TYPEP -UNTYPED -UNIONED -GLOBALS -MODICUM -MACREF -SLEAZING -ES -STEELE -ETYPECASE -'EQL -'IDENTITY -'FUN -LOCALFUN -ISQRT -ODDP -MYFUN -POS -ZOW -YOW -'YOW -CADR -ZEROP -RES -EXPT -PARED -PUSHING -'ING -RPLACD -IOTA -NTHCDR -NTH -CADDDR -RPLACA -CADDR -FIENDS -SQRT -'SQRT -LISPY -BLANKSPACE -MYCHAPTER -UNENCAPSULATED -ENCAPSULATIONS -UNENCAPSULATE -UNTRACED -UNTRACE -EVALED -SPEC -PUSHES -TRUENAME -MYMAC -UNINFORMATIVE -FOOBAR -BAZ -BACKQUOTE -MALFORMED -MOREKEYS -FUNREF -QUIRKS -UNDILUTED -DISASSEMBLY -NAN -DENORMALIZED -ENDDEFCONST -DEFCONST -HASHTABLES -EFF -OBFUSCATING -SNOC -GRUE -GORP -FLO -NUM -VEC -MULTBY -SOMEOTHERFUN -'CHAR -NOTP -TESTP -FUNVAR -RAZ -ZUG -XFF -IO -GC'ING -EXT -MEGABYTE -SYS -UX -ED -MATCHMAKER -DIRED -PCL -CLOS -CONFORMANCE -ENDDEFCON -DEFCON -DECLAIM -DEFSTRUCT -ENUM -EXTERN -LOWERCASING -DEREFERENCED -MOPT -STRUCT -DEFTP -ENDDEFTP -MALLOC -CSH -PXLREF -ATYPE -CONSTRUCTUED -ANAME -PXREF -ENV -ONECOLUMN -TP -VR -FN -PRINTINDEX -UNNUMBERED -TWOCOLUMN -TLF -UNCOMPILED -DEACTIVATE -CALLABLE -UNREFERENCED -SUPPLIEDP -INTERNING -UNHANDLED -BACKTRACING -TEX -OOB -OBJ -PRIN -OBJS -GP -LINKERS -CC -AR -CFUN -INTS -SIZEOF -PRINTF -CFOO -SUBFORM -SVREF -STASH -FOOS -LC -LD -'N -'X -ERRNO 
-UPPERCASING -EXPR -ADDR -'STR -STR -DEREF -PTR -SWINDOW -IWINDOW -'SLIDER -DRAWABLE -'KEY -'EXT -TIMEOUTS -'MY -ID -PIXMAPS -'EQ -FUNCALLED -XWINDOW -'IH -SIGSTOP -GETPID -SIGTSTP -SCP -SIGINT -IH -CNT -GENERALRETURN -DEFMACX -'NUKEGARBAGE -GR -HASSLE -PREPENDS -TIMEOUT -FD -MSG -SYSCALL -UNHELPFUL -PREPENDED -VM -PAGEREF -INT -PORTSID -PORTSNAME -SERVPORT -KERN -DATATYPES -TTY -STDERR -STDOUT -STDIN -CMD -AUX -PS -UNACCOUNTED -RUNTIMES -PROFILER -UNPROFILE -REPROFILED -UNPROFILED -CF -ELT -VOPS -MAPCAR -OPTIONALS -CONSES -CONTORTIONS -ALISTS -ALIST -ASSOC -EXP -MYEXP -DEFCONSTANT -INCF -MEMQ -COERCIONS -EQL -LOGAND -AREF -CONSP -TYPEN -LOGIOR -EQUIV -SUPERTYPE -DEFMETHOD -SUBFORMS -CERROR -PSETQ -TAGBODY -DOTIMES -PLOQ -ROQ -SPECS -MPLUS -STEPPER -FDEFINITION -FUNCALLABLE -ST -BR -DB -LB -LL -HFILL -PP -VPRINT -TH -ARGLISTS -SETQ -NAMESPACE -SUBFUNCTION -BACKTRACE -'B -FLET -ARG -'A -CPSUBINDEX -PROGN -CONTRIB -WEEKDAYS -GREENWICH -TIMEZONE -DEST -WEEKDAY -JAN -CINDEX -NAMESTRING -PATHNAMES -FASL -SIGSEGV -PLIST -'ABLE -SETF -PID -EXECVE -DEV -SUBPROCESS -PTY -'TH -UNSUPPLIED -DEFVARX -GCS -CONSED -GC'ED -GC -TRASHING -XLIB -CL -HI -COMMONLOOPS -CTRL -XLREF -DEFUNX -DEFCONSTX -SUBSUBSECTION -VINDEXED -TINDEXED -RESEARCHCREDIT -EM -WHOLEY -SKEF -KAUFMANN -TODD -KOLOJEJCHICK -BUSDIECKER -'' -NOINDENT -MOORE -TIM -LOTT -LEINEN -HALLGREN -GLEICHAUF -DUNNING -TED -BADER -MYLISP -NOINIT -FINDEXED -INIT -EVAL -SUBDIRECTORIES -COPYRIGHTED -FTP -LANG -COMP -MEG -MEGABYTES -UNCOMPRESS -CD -OS -USERNAME -SLISP -RT -LIB -SETENV -SAMP -SETPATH -LOGIN -MISC -USR -MODMISC -TXT -DOC -EXECUTABLES -PERQ -UNTAGGED -BENCHMARKING -WINDOWING -INTRO -DOCS -EDU -AFS -VSPACE -IFINFO -DIR -SETFILENAME -TABLEOFCONTENTS -PAGENUMBERING -CLEARPAGE -MAKETITLE -ARPASUPPORT -CITATIONINFO -TRNUMBER -IFTEX -SUNOS -SPARC -DECSTATIONS -THEABSTRACT -DEF -KY -CP -NEWINDEX -ALWAYSREFILL -PAGESTYLE -CMULISP -TITLEPAGE -ELISP -LATEXINFO -DOCUMENTSTYLE diff --git a/doc/cmucl/cmu-user/cmu-user.tex 
b/doc/cmucl/cmu-user/cmu-user.tex deleted file mode 100644 index 931a3f1..0000000 --- a/doc/cmucl/cmu-user/cmu-user.tex +++ /dev/null @@ -1,13321 +0,0 @@ -%% CMU Common Lisp User's Manual. -%% -%% Aug 97 Raymond Toy -%% This is a modified version of the original CMUCL User's Manual. -%% The key changes are modification of this file to use standard -%% LaTeX2e. This means latexinfo isn't going to work anymore. -%% However, Latex2html support has been added. -%% -%% Jan 1998 Paul Werkowski -%% A few of the packages below are not part of the standard LaTeX2e -%% distribution, and must be obtained from a repository. At this time -%% I was able to fetch from -%% ftp.cdrom.com:pub/tex/ctan/macros/latex/contrib/supported/ -%% camel/index.ins -%% camel/index.dtx -%% calc/calc.ins -%% calc/calc.dtx -%% changebar/changebar.ins -%% changebar/changebar.dtx -%% One runs latex on the .ins file to produce .tex and/or .sty -%% files that must be put in a path searched by latex. -%% -\documentclass{report} -\usepackage{changebar} -\usepackage{xspace} -\usepackage{alltt} -\usepackage{index} -\usepackage{verbatim} -\usepackage{ifthen} -\usepackage{calc} -%\usepackage{html2e} -\usepackage{html,color} -\usepackage{varioref} - -%% Define the indices. We need one for Types, Variables, Functions, -%% and a general concept index. -\makeindex -\newindex{types}{tdx}{tnd}{Type Index} -\newindex{vars}{vdx}{vnd}{Variable Index} -\newindex{funs}{fdx}{fnd}{Function Index} -\newindex{concept}{cdx}{cnd}{Concept Index} - -\newcommand{\tindexed}[1]{\index[types]{#1}\textsf{#1}} -\newcommand{\findexed}[1]{\index[funs]{#1}\textsf{#1}} -\newcommand{\vindexed}[1]{\index[vars]{#1}\textsf{*#1*}} -\newcommand{\cindex}[1]{\index[concept]{#1}} -\newcommand{\cpsubindex}[2]{\index[concept]{#1!#2}} - -%% This code taken from the LaTeX companion. It's meant as a -%% replacement for the description environment. 
We want one that -%% prints description items in a fixed size box and puts the -%% description itself on the same line or the next depending on the -%% size of the item. -\newcommand{\entrylabel}[1]{\mbox{#1}\hfil} -\newenvironment{entry}{% - \begin{list}{}% - {\renewcommand{\makelabel}{\entrylabel}% - \setlength{\labelwidth}{45pt}% - \setlength{\leftmargin}{\labelwidth+\labelsep}}}% - {\end{list}} - -\newlength{\Mylen} -\newcommand{\Lentrylabel}[1]{% - \settowidth{\Mylen}{#1}% - \ifthenelse{\lengthtest{\Mylen > \labelwidth}}% - {\parbox[b]{\labelwidth}% term > labelwidth - {\makebox[0pt][l]{#1}\\}}% - {#1}% - \hfil\relax} -\newenvironment{Lentry}{% - \renewcommand{\entrylabel}{\Lentrylabel} - \begin{entry}}% - {\end{entry}} - -\newcommand{\fcntype}[1]{\textit{#1}} -\newcommand{\argtype}[1]{\textit{#1}} -\newcommand{\fcnname}[1]{\textsf{#1}} - -\newlength{\formnamelen} % length of a name of a form -\newlength{\pboxargslen} % length of parbox for arguments -\newlength{\typelen} % length of the type label for the form - -\newcommand{\args}[1]{#1} -\newcommand{\keys}[1]{\textsf{\&key} \= #1} -\newcommand{\morekeys}[1]{\\ \> #1} -\newcommand{\yetmorekeys}[1]{\\ \> #1} - -\newcommand{\defunvspace}{\ifhmode\unskip \par\fi\addvspace{18pt plus 12pt minus 6pt}} - - -%% \layout[pkg]{name}{param list}{type} -%% -%% This lays out a entry like so: -%% -%% pkg:name arg1 arg2 [Function] -%% -%% where [Function] is flush right. 
-%% -\newcommand{\layout}[4][\mbox{}]{% - \par\noindent - \fcnname{#1#2\hspace{1em}}% - \settowidth{\formnamelen}{\fcnname{#1#2\hspace{1em}}}% - \settowidth{\typelen}{[\argtype{#4}]}% - \setlength{\pboxargslen}{\linewidth}% - \addtolength{\pboxargslen}{-1\formnamelen}% - \addtolength{\pboxargslen}{-1\typelen}% - \begin{minipage}[t]{\pboxargslen} - \begin{tabbing} - #3 - \end{tabbing} - \end{minipage} - \hfill[\fcntype{#4}]% - \par\addvspace{2pt plus 2pt minus 2pt}} - -\newcommand{\vrindexbold}[1]{\index[vars]{#1|textbf}} -\newcommand{\fnindexbold}[1]{\index[funs]{#1|textbf}} - -%% Define a new type -%% -%% \begin{deftp}{typeclass}{typename}{args} -%% some description -%% \end{deftp} -\newenvironment{deftp}[3]{% - \par\bigskip\index[types]{#2|textbf}% - \layout{#2}{\var{#3}}{#1} - }{} - -%% Define a function -%% -%% \begin{defun}{pkg}{name}{params} -%% \defunx[pkg]{name}{params} -%% description of function -%% \end{defun} -\newenvironment{defun}[3]{% - \par\defunvspace\fnindexbold{#2}\label{FN:#2}% - \layout[#1]{#2}{#3}{Function} - }{} -\newcommand{\defunx}[3][\mbox{}]{% - \par\fnindexbold{#2}\label{FN:#2}% - \layout[#1]{#2}{#3}{Function}} - -%% Define a macro -%% -%% \begin{defmac}{pkg}{name}{params} -%% \defmacx[pkg]{name}{params} -%% description of macro -%% \end{defmac} -\newenvironment{defmac}[3]{% - \par\defunvspace\fnindexbold{#2}\label{FN:#2}% - \layout[#1]{#2}{#3}{Macro}}{} -\newcommand{\defmacx}[3][\mbox{}]{% - \par\fnindexbold{#2}\label{FN:#2}% - \layout[#1]{#2}{#3}{Function}} - -%% Define a variable -%% -%% \begin{defvar}{pkg}{name} -%% \defvarx[pkg]{name} -%% description of defvar -%% \end{defvar} -\newenvironment{defvar}[2]{% - \par\defunvspace\vrindexbold{#2}\label{VR:#2} - \layout[#1]{*#2*}{}{Variable}}{} -\newcommand{\defvarx}[2][\mbox{}]{% - \par\vrindexbold{#2}\label{VR:#2} - \layout[#1]{*#2*}{}{Variable}} - -%% Define a constant -%% -%% \begin{defconst}{pkg}{name} -%% \ddefconstx[pkg]{name} -%% description of defconst -%% \end{defconst} 
-\newcommand{\defconstx}[2][\mbox{}]{% - \layout[#1]{#2}{}{Constant}} -\newenvironment{defconst}[2]{% - \defunvspace\defconstx[#1]{#2}} - -\newenvironment{example}{\begin{quote}\begin{alltt}}{\end{alltt}\end{quote}} -\newenvironment{lisp}{\begin{example}}{\end{example}} -\newenvironment{display}{\begin{quote}\begin{alltt}}{\end{alltt}\end{quote}} - -\newcommand{\hide}[1]{} -\newcommand{\trnumber}[1]{#1} -\newcommand{\citationinfo}[1]{#1} -\newcommand{\var}[1]{{\textsf{\textsl{#1}}\xspace}} -\newcommand{\code}[1]{\textnormal{{\sffamily #1}}} -\newcommand{\file}[1]{`\texttt{#1}'} -\newcommand{\samp}[1]{`\texttt{#1}'} -\newcommand{\kwd}[1]{\code{:#1}} -\newcommand{\F}[1]{\code{#1}} -\newcommand{\w}[1]{\hbox{#1}} -\renewcommand{\b}[1]{\textrm{\textbf{#1}}} -\renewcommand{\i}[1]{\textit{#1}} -\newcommand{\ctrl}[1]{$\uparrow$\textsf{#1}} -\newcommand{\result}{$\Rightarrow$} -\newcommand{\myequiv}{$\equiv$} -\newcommand{\back}[1]{\(\backslash\)#1} -\newcommand{\pxlref}[1]{see section~\ref{#1}, page~\pageref{#1}} -\newcommand{\xlref}[1]{See section~\ref{#1}, page~\pageref{#1}} - -\newcommand{\false}{\textsf{nil}} -\newcommand{\true}{\textsf{t}} -\newcommand{\nil}{\textsf{nil}} -\newcommand{\FALSE}{\textsf{nil}} -\newcommand{\TRUE}{\textsf{t}} -\newcommand{\NIL}{\textsf{nil}} - -\newcommand{\ampoptional}{\textsf{\&optional}} -\newcommand{\amprest}{\textsf{\&rest}} -\newcommand{\ampbody}{\textsf{\&body}} -\newcommand{\mopt}[1]{{$\,\{$}\textnormal{\textsf{\textsl{#1\/}}}{$\}\,$}} -\newcommand{\mstar}[1]{{$\,\{$}\textnormal{\textsf{\textsl{#1\/}}}{$\}^*\,$}} -\newcommand{\mplus}[1]{{$\,\{$}\textnormal{\textsf{\textsl{#1\/}}}{$\}^+\,$}} -\newcommand{\mgroup}[1]{{$\,\{$}\textnormal{\textsf{\textsl{#1\/}}}{$\}\,$}} -\newcommand{\mor}{$|$} - -\newcommand{\funref}[1]{\findexed{#1} (page~\pageref{FN:#1})} -\newcommand{\specref}[1]{\findexed{#1} (page~\pageref{FN:#1})} -\newcommand{\macref}[1]{\findexed{#1} (page~\pageref{FN:#1})} -\newcommand{\varref}[1]{\vindexed{#1} 
(page~\pageref{VR:#1})} -\newcommand{\conref}[1]{\conindexed{#1} (page~\pageref{VR:#1})} - -%% Some common abbreviations -\newcommand{\clisp}{Common Lisp} -\newcommand{\dash}{---} -\newcommand{\alien}{Alien} -\newcommand{\aliens}{Aliens} -\newcommand{\Aliens}{Aliens} -\newcommand{\Alien}{Alien} -\newcommand{\Hemlock}{Hemlock} -\newcommand{\hemlock}{Hemlock} -\newcommand{\python}{Python} -\newcommand{\Python}{Python} -\newcommand{\cmucl}{CMU Common Lisp} -\newcommand{\llisp}{Common Lisp} -\newcommand{\Llisp}{Common Lisp} -\newcommand{\cltl}{\emph{Common Lisp: The Language}} -\newcommand{\cltltwo}{\emph{Common Lisp: The Language 2}} - -%% Replacement commands when we run latex2html. This should be last -%% so that latex2html uses these commands instead of the LaTeX -%% commands above. -\begin{htmlonly} - \usepackage{makeidx} - - \newcommand{\var}[1]{\textnormal{\textit{#1}}} - \newcommand{\code}[1]{\textnormal{\texttt{#1}}} - %%\newcommand{\printindex}[1][\mbox{}]{} - - %% We need the quote environment because the alltt is broken. The - %% quote environment helps us in postprocessing to result to get - %% what we want. 
- \newenvironment{example}{\begin{quote}\begin{alltt}}{\end{alltt}\end{quote}} - \newenvironment{display}{\begin{quote}\begin{alltt}}{\end{alltt}\end{quote}} - - \newcommand{\textnormal}[1]{\rm #1} - \newcommand{\hbox}[1]{\mbox{#1}} - \newcommand{\xspace}{} - \newcommand{newindex}[4]{} - - \newcommand{\pxlref}[1]{see section~\ref{#1}} - \newcommand{\xlref}[1]{See section~\ref{#1}} - - \newcommand{\tindexed}[1]{\index{#1}\texttt{#1}} - \newcommand{\findexed}[1]{\index{#1}\texttt{#1}} - \newcommand{\vindexed}[1]{\index{#1}\texttt{*#1*}} - \newcommand{\cindex}[1]{\index{#1}} - \newcommand{\cpsubindex}[2]{\index{#1!#2}} - - \newcommand{\keys}[1]{\texttt{\&key} #1} - \newcommand{\morekeys}[1]{#1} - \newcommand{\yetmorekeys}[1]{#1} - - \newenvironment{defun}[3]{% - \textbf{[Function]}\\ - \texttt{#1#2} \emph{#3}\\}{} - \newcommand{\defunx}[3][\mbox{}]{% - \texttt{#1#2} {\em #3}\\} - \newenvironment{defmac}[3]{% - \textbf{[Macro]}\\ - \texttt{#1#2} \emph{#3}\\}{} - \newcommand{\defmacx}[3][\mbox{}]{% - \texttt{#1#2} {\em #3}\\} - \newenvironment{defvar}[2]{% - \textbf{[Variable]}\\ - \texttt{#1*#2*}\\ \\}{} - \newcommand{\defvarx}[2][\mbox{}]{% - \texttt{#1*#2*}\\} - \newenvironment{defconst}[2]{% - \textbf{[Constant]}\\ - \texttt{#1#2}\\}{} - \newcommand{\defconstx}[2][\mbox{}]{\texttt{#1#2}\\} - \newenvironment{deftp}[3]{% - \textbf{[#1]}\\ - \texttt{#2} \textit{#3}\\}{} - \newenvironment{Lentry}{\begin{description}}{\end{description}} -\end{htmlonly} - -%% Set up margins -\setlength{\oddsidemargin}{-10pt} -\setlength{\evensidemargin}{-10pt} -\setlength{\topmargin}{-40pt} -\setlength{\headheight}{12pt} -\setlength{\headsep}{25pt} -\setlength{\footskip}{30pt} -\setlength{\textheight}{9.25in} -\setlength{\textwidth}{6.75in} -\setlength{\columnsep}{0.375in} -\setlength{\columnseprule}{0pt} - - -\setcounter{tocdepth}{2} -\setcounter{secnumdepth}{3} -\def\textfraction{.1} -\def\bottomfraction{.9} % was .3 -\def\topfraction{.9} - -\pagestyle{headings} - -\begin{document} 
-%%\alwaysrefill -\relax -%%\newindex{cp} -%%\newindex{ky} - -\newcommand{\theabstract}{% - - CMU Common Lisp is an implementation of that Common Lisp runs on - various Unix workstations. See the README file in the distribution - for current platforms. The largest single part of this document - describes the Python compiler and the programming styles and - techniques that the compiler encourages. The rest of the document - describes extensions and the implementation dependent choices made - in developing this implementation of Common Lisp. We have added - several extensions, including a source level debugger, an interface - to Unix system calls, a foreign function call interface, support for - interprocess communication and remote procedure call, and other - features that provide a good environment for developing Lisp code. - } - -\newcommand{\researchcredit}{% - This research was sponsored by the Defense Advanced Research - Projects Agency, Information Science and Technology Office, under - the title \emph{Research on Parallel Computing} issued by DARPA/CMO - under Contract MDA972-90-C-0035 ARPA Order No. 7330. - - The views and conclusions contained in this document are those of - the authors and should not be interpreted as representing the - official policies, either expressed or implied, of the Defense - Advanced Research Projects Agency or the U.S. government.} - -\pagestyle{empty} -\title{CMU Common Lisp User's Manual} - -%%\author{Robert A. MacLachlan, \var{Editor}} -%%\date{July 1992} -%%\trnumber{CMU-CS-92-161} -%%\citationinfo{ -%%\begin{center} -%%Supersedes Technical Reports CMU-CS-87-156 and CMU-CS-91-108. -%%\end{center} -%%} -%%%%\arpasupport{strategic} -%%\abstract{\theabstract} -%%%%\keywords{lisp, Common Lisp, manual, compiler, -%%%% programming language implementation, programming environment} - -%%\maketitle -\begin{latexonly} - -%% \title{CMU Common Lisp User's Manual} - - \author{Robert A. 
MacLachlan, - \emph{Editor}% - \thanks{\small This research was sponsored by the Defense Advanced - Research Projects Agency, Information Science and Technology - Office, under the title \emph{Research on Parallel Computing} - issued by DARPA/CMO under Contract MDA972-90-C-0035 ARPA Order No. - 7330. The views and conclusions contained in this document are - those of the authors and should not be interpreted as representing - the official policies, either expressed or implied, of the Defense - Advanced Research Projects Agency or the U.S. government.}} - - - -\date{\bigskip - July 1992 \\ CMU-CS-92-161 \\ - \vspace{0.25in} - October 31, 1997 \\ - Net Version \\ - \vspace{0.75in} {\small - School of Computer Science \\ - Carnegie Mellon University \\ - Pittsburgh, PA 15213} \\ - \vspace{0.5in} \small Supersedes Technical Reports CMU-CS-87-156 and - CMU-CS-91-108.\\ - \vspace{0.5in} \textbf{Abstract} \medskip - \begin{quote} - \theabstract - \end{quote} - } - -\maketitle -\end{latexonly} - -%% Nice HTML version of the title page -\begin{rawhtml} - -

CMU Common Lisp User's Manual
- Robert A. MacLachlan, Editor
- July 1992
- CMU-CS-92-161
- July 1997
- Net Version
- School of Computer Science
- Carnegie Mellon University
- Pittsburgh, PA 15213
- Supersedes Technical Reports CMU-CS-87-156 and
- CMU-CS-91-108.
- Abstract
- CMU Common Lisp is an implementation of that Common Lisp runs on
- various Unix workstations. See the README file in the
- distribution for current platforms. The largest single part of
- this document describes the Python compiler and the programming
- styles and techniques that the compiler encourages. The rest of
- the document describes extensions and the implementation
- dependent choices made in developing this implementation of
- Common Lisp. We have added several extensions, including a
- source level debugger, an interface to Unix system calls, a
- foreign function call interface, support for interprocess
- communication and remote procedure call, and other features that
- provide a good environment for developing Lisp code.
- This research was sponsored by the Defense Advanced Research
- Projects Agency, Information Science and Technology Office, under
- the title Research on Parallel Computing issued by DARPA/CMO
- under Contract MDA972-90-C-0035 ARPA Order No. 7330.
- The views and conclusions contained in this document are those of
- the authors and should not be interpreted as representing the
- official policies, either expressed or implied, of the Defense
- Advanced Research Projects Agency or the U.S. government.

-\end{rawhtml} -\clearpage -\vspace*{\fill} -\textbf{Keywords:} lisp, Common Lisp, manual, compiler, -programming language implementation, programming environment -\clearpage -\pagestyle{headings} -\pagenumbering{roman} -\tableofcontents - -\clearpage -\pagenumbering{arabic} -%%\end{iftex} - -%%\setfilename{cmu-user.info} -%%\node Top, Introduction, (dir), (dir) - - -\hide{File:/afs/cs.cmu.edu/project/clisp/hackers/ram/docs/cmu-user/intro.ms} - - - -\hide{ -*- Dictionary: cmu-user -*- } -\begin{comment} -* Introduction:: -* Design Choices and Extensions:: -* The Debugger:: -* The Compiler:: -* Advanced Compiler Use and Efficiency Hints:: -* UNIX Interface:: -* Event Dispatching with SERVE-EVENT:: -* Alien Objects:: -* Interprocess Communication under LISP:: -* Debugger Programmer's Interface:: -* Function Index:: -* Variable Index:: -* Type Index:: -* Concept Index:: - - --- The Detailed Node Listing --- - -Introduction - -* Support:: -* Local Distribution of CMU Common Lisp:: -* Net Distribution of CMU Common Lisp:: -* Source Availability:: -* Command Line Options:: -* Credits:: - -Design Choices and Extensions - -* Data Types:: -* Default Interrupts for Lisp:: -* Packages:: -* The Editor:: -* Garbage Collection:: -* Describe:: -* The Inspector:: -* Load:: -* The Reader:: -* Running Programs from Lisp:: -* Saving a Core Image:: -* Pathnames:: -* Filesystem Operations:: -* Time Parsing and Formatting:: -* Lisp Library:: - -Data Types - -* Symbols:: -* Integers:: -* Floats:: -* Characters:: -* Array Initialization:: - -Floats - -* IEEE Special Values:: -* Negative Zero:: -* Denormalized Floats:: -* Floating Point Exceptions:: -* Floating Point Rounding Mode:: -* Accessing the Floating Point Modes:: - -The Inspector - -* The Graphical Interface:: -* The TTY Inspector:: - -Running Programs from Lisp - -* Process Accessors:: - -Pathnames - -* Unix Pathnames:: -* Wildcard Pathnames:: -* Logical Pathnames:: -* Search Lists:: -* Predefined Search-Lists:: -* Search-List 
Operations:: -* Search List Example:: - -Logical Pathnames - -* Search Lists:: -* Search List Example:: - -Search-List Operations - -* Search List Example:: - -Filesystem Operations - -* Wildcard Matching:: -* File Name Completion:: -* Miscellaneous Filesystem Operations:: - -The Debugger - -* Debugger Introduction:: -* The Command Loop:: -* Stack Frames:: -* Variable Access:: -* Source Location Printing:: -* Compiler Policy Control:: -* Exiting Commands:: -* Information Commands:: -* Breakpoint Commands:: -* Function Tracing:: -* Specials:: - -Stack Frames - -* Stack Motion:: -* How Arguments are Printed:: -* Function Names:: -* Funny Frames:: -* Debug Tail Recursion:: -* Unknown Locations and Interrupts:: - -Variable Access - -* Variable Value Availability:: -* Note On Lexical Variable Access:: - -Source Location Printing - -* How the Source is Found:: -* Source Location Availability:: - -Breakpoint Commands - -* Breakpoint Example:: - -Function Tracing - -* Encapsulation Functions:: - -The Compiler - -* Compiler Introduction:: -* Calling the Compiler:: -* Compilation Units:: -* Interpreting Error Messages:: -* Types in Python:: -* Getting Existing Programs to Run:: -* Compiler Policy:: -* Open Coding and Inline Expansion:: - -Compilation Units - -* Undefined Warnings:: - -Interpreting Error Messages - -* The Parts of the Error Message:: -* The Original and Actual Source:: -* The Processing Path:: -* Error Severity:: -* Errors During Macroexpansion:: -* Read Errors:: -* Error Message Parameterization:: - -Types in Python - -* Compile Time Type Errors:: -* Precise Type Checking:: -* Weakened Type Checking:: - -Compiler Policy - -* The Optimize Declaration:: -* The Optimize-Interface Declaration:: - -Advanced Compiler Use and Efficiency Hints - -* Advanced Compiler Introduction:: -* More About Types in Python:: -* Type Inference:: -* Source Optimization:: -* Tail Recursion:: -* Local Call:: -* Block Compilation:: -* Inline Expansion:: -* Byte Coded Compilation:: -* 
Object Representation:: -* Numbers:: -* General Efficiency Hints:: -* Efficiency Notes:: -* Profiling:: - -Advanced Compiler Introduction - -* Types:: -* Optimization:: -* Function Call:: -* Representation of Objects:: -* Writing Efficient Code:: - -More About Types in Python - -* More Types Meaningful:: -* Canonicalization:: -* Member Types:: -* Union Types:: -* The Empty Type:: -* Function Types:: -* The Values Declaration:: -* Structure Types:: -* The Freeze-Type Declaration:: -* Type Restrictions:: -* Type Style Recommendations:: - -Type Inference - -* Variable Type Inference:: -* Local Function Type Inference:: -* Global Function Type Inference:: -* Operation Specific Type Inference:: -* Dynamic Type Inference:: -* Type Check Optimization:: - -Source Optimization - -* Let Optimization:: -* Constant Folding:: -* Unused Expression Elimination:: -* Control Optimization:: -* Unreachable Code Deletion:: -* Multiple Values Optimization:: -* Source to Source Transformation:: -* Style Recommendations:: - -Tail Recursion - -* Tail Recursion Exceptions:: - -Local Call - -* Self-Recursive Calls:: -* Let Calls:: -* Closures:: -* Local Tail Recursion:: -* Return Values:: - -Block Compilation - -* Block Compilation Semantics:: -* Block Compilation Declarations:: -* Compiler Arguments:: -* Practical Difficulties:: -* Context Declarations:: -* Context Declaration Example:: - -Inline Expansion - -* Inline Expansion Recording:: -* Semi-Inline Expansion:: -* The Maybe-Inline Declaration:: - -Object Representation - -* Think Before You Use a List:: -* Structure Representation:: -* Arrays:: -* Vectors:: -* Bit-Vectors:: -* Hashtables:: - -Numbers - -* Descriptors:: -* Non-Descriptor Representations:: -* Variables:: -* Generic Arithmetic:: -* Fixnums:: -* Word Integers:: -* Floating Point Efficiency:: -* Specialized Arrays:: -* Specialized Structure Slots:: -* Interactions With Local Call:: -* Representation of Characters:: - -General Efficiency Hints - -* Compile Your Code:: -* 
Avoid Unnecessary Consing:: -* Complex Argument Syntax:: -* Mapping and Iteration:: -* Trace Files and Disassembly:: - -Efficiency Notes - -* Type Uncertainty:: -* Efficiency Notes and Type Checking:: -* Representation Efficiency Notes:: -* Verbosity Control:: - -Profiling - -* Profile Interface:: -* Profiling Techniques:: -* Nested or Recursive Calls:: -* Clock resolution:: -* Profiling overhead:: -* Additional Timing Utilities:: -* A Note on Timing:: -* Benchmarking Techniques:: - -UNIX Interface - -* Reading the Command Line:: -* Lisp Equivalents for C Routines:: -* Type Translations:: -* System Area Pointers:: -* Unix System Calls:: -* File Descriptor Streams:: -* Making Sense of Mach Return Codes:: -* Unix Interrupts:: - -Unix Interrupts - -* Changing Interrupt Handlers:: -* Examples of Signal Handlers:: - -Event Dispatching with SERVE-EVENT - -* Object Sets:: -* The SERVE-EVENT Function:: -* Using SERVE-EVENT with Unix File Descriptors:: -* Using SERVE-EVENT with the CLX Interface to X:: -* A SERVE-EVENT Example:: - -Using SERVE-EVENT with the CLX Interface to X - -* Without Object Sets:: -* With Object Sets:: - -A SERVE-EVENT Example - -* Without Object Sets Example:: -* With Object Sets Example:: - -Alien Objects - -* Introduction to Aliens:: -* Alien Types:: -* Alien Operations:: -* Alien Variables:: -* Alien Data Structure Example:: -* Loading Unix Object Files:: -* Alien Function Calls:: -* Step-by-Step Alien Example:: - -Alien Types - -* Defining Alien Types:: -* Alien Types and Lisp Types:: -* Alien Type Specifiers:: -* The C-Call Package:: - -Alien Operations - -* Alien Access Operations:: -* Alien Coercion Operations:: -* Alien Dynamic Allocation:: - -Alien Variables - -* Local Alien Variables:: -* External Alien Variables:: - -Alien Function Calls - -* alien-funcall:: The alien-funcall Primitive -* def-alien-routine:: The def-alien-routine Macro -* def-alien-routine Example:: -* Calling Lisp from C:: - -Interprocess Communication under LISP - -* The 
REMOTE Package:: -* The WIRE Package:: -* Out-Of-Band Data:: - -The REMOTE Package - -* Connecting Servers and Clients:: -* Remote Evaluations:: -* Remote Objects:: -* Host Addresses:: - -The WIRE Package - -* Untagged Data:: -* Tagged Data:: -* Making Your Own Wires:: - -Debugger Programmer's Interface - -* DI Exceptional Conditions:: -* Debug-variables:: -* Frames:: -* Debug-functions:: -* Debug-blocks:: -* Breakpoints:: -* Code-locations:: -* Debug-sources:: -* Source Translation Utilities:: - -DI Exceptional Conditions - -* Debug-conditions:: -* Debug-errors:: -\end{comment} - -%%\node Introduction, Design Choices and Extensions, Top, Top -\chapter{Introduction} - -CMU Common Lisp is a public-domain implementation of Common Lisp developed in -the Computer Science Department of Carnegie Mellon University. \cmucl{} runs -on various Unix workstations---see the README file in the distribution for -current platforms. This document describes the implementation based on the -Python compiler. Previous versions of CMU Common Lisp ran on the IBM RT PC -and (when known as Spice Lisp) on the Perq workstation. See \code{man cmucl} -(\file{man/man1/cmucl.1}) for other general information. - -\cmucl{} sources and executables are freely available via anonymous FTP; this -software is ``as is'', and has no warranty of any kind. CMU and the -authors assume no responsibility for the consequences of any use of this -software. See \file{doc/release-notes.txt} for a description of the -state of the release you have. - -\begin{comment} -* Support:: -* Local Distribution of CMU Common Lisp:: -* Net Distribution of CMU Common Lisp:: -* Source Availability:: -* Command Line Options:: -* Credits:: -\end{comment} - -%%\node Support, Local Distribution of CMU Common Lisp, Introduction, Introduction -\section{Support} - -The CMU Common Lisp project is no longer funded, so only minimal support is -being done at CMU. 
There is a net community of \cmucl{} users and maintainers -who communicate via comp.lang.lisp and the cmucl-bugs@cs.cmu.edu -\begin{changebar} - cmucl-imp@cons.org -\end{changebar} -mailing lists. - -This manual contains only implementation-specific information about -\cmucl. Users will also need a separate manual describing the -\clisp{} standard. \clisp{} was initially defined in \i{Common Lisp: - The Language}, by Guy L. Steele Jr. \clisp{} is now undergoing -standardization by the X3J13 committee of ANSI. The X3J13 spec is not -yet completed, but a number of clarifications and modification have -been approved. We intend that \cmucl{} will eventually adhere to the -X3J13 spec, and we have already implemented many of the changes -approved by X3J13. - -Until the X3J13 standard is completed, the second edition of -\cltltwo{} is probably the best available manual for the language and -for our implementation of it. This book has no official role in the -standardization process, but it does include many of the changes -adopted since the first edition was completed. - -In addition to the language itself, this document describes a number -of useful library modules that run in \cmucl. \hemlock, an Emacs-like -text editor, is included as an integral part of the \cmucl{} -environment. Two documents describe \hemlock{}: the \i{Hemlock User's - Manual}, and the \i{Hemlock Command Implementor's Manual}. - -%%\node Local Distribution of CMU Common Lisp, Net Distribution of CMU Common Lisp, Support, Introduction -\section{Local Distribution of CMU Common Lisp} - -In CMU CS, \cmucl{} should be runnable as \file{/usr/local/bin/cmucl}. -The full binary distribution should appear under -\file{/usr/local/lib/cmucl/}. Note that the first time you run Lisp, -it will take AFS several minutes to copy the image into its local -cache. Subsequent starts will be much faster. - -Or, you can run directly out of the AFS release area (which may be -necessary on SunOS machines). 
Put this in your \file{.login} shell -script: -\begin{example} -setenv CMUCLLIB "/afs/cs/misc/cmucl/@sys/beta/lib" -setenv PATH \${PATH}:/afs/cs/misc/cmucl/@sys/beta/bin -\end{example} - -If you also set \code{MANPATH} or \code{MPATH} (depending on the Unix) -to point to \file{/usr/local/lib/cmucl/man/}, then `\code{man cmucl}' -will give an introduction to CMU CL and \samp{man lisp} will describe -command line options. For installation notes, see the \file{README} -file in the release area. - -See \file{/usr/local/lib/cmucl/doc} for release notes and -documentation. Hardcopy documentation is available in the document -room. Documentation supplements may be available for recent -additions: see the \file{README} file. - -Send bug reports and questions to \samp{cmucl-bugs@cs.cmu.edu}. If -you send a bug report to \samp{gripe} or \samp{help}, they will just -forward it to this mailing list. - -%%\node Net Distribution of CMU Common Lisp, Source Availability, Local Distribution of CMU Common Lisp, Introduction -\section{Net Distribution of CMU Common Lisp} - -\subsection{CMU Distribution} -Externally, CMU Common Lisp is only available via anonymous FTP. We -don't have the manpower to make tapes. These are our distribution -machines: -\begin{example} -lisp-rt1.slisp.cs.cmu.edu (128.2.217.9) -lisp-rt2.slisp.cs.cmu.edu (128.2.217.10) -\end{example} - -Log in with the user \samp{anonymous} and \samp{username@host} as -password (i.e. your EMAIL address.) When you log in, the current -directory should be set to the \cmucl{} release area. If you have any -trouble with FTP access, please send mail to \samp{slisp@cs.cmu.edu}. - -The release area holds compressed tar files with names of the form: -\begin{example} -\var{version}-\var{machine}_\var{os}.tar.Z -\end{example} -FTP compressed tar archives in binary mode. To extract, \samp{cd} to -the directory that is to be the root of the tree, then type: -\begin{example} -uncompress \var{value}}. 
For example, to start up -the saved core file mylisp.core use either of the following two -commands: -\begin{example} -\code{lisp -core=mylisp.core -lisp -core mylisp.core} -\end{example} - -%%\node Credits, , Command Line Options, Introduction -\section{Credits} - -Since 1981 many people have contributed to the development of CMU -Common Lisp. The currently active members are: -\begin{display} -Marco Antoniotti -David Axmark -Miles Bader -Casper Dik -Scott Fahlman * (fearless leader) -Paul Gleichauf * -Richard Harris -Joerg-Cyril Hoehl -Chris Hoover -Simon Leinen -Sandra Loosemore -William Lott * -Robert A. Maclachlan * -\end{display} -\noindent -Many people are voluntarily working on improving CMU Common Lisp. ``*'' -means a full-time CMU employee, and ``+'' means a part-time student -employee. A partial listing of significant past contributors follows: -\begin{display} -Tim Moore -Sean Hallgren + -Mike Garland + -Ted Dunning -Rick Busdiecker -Bill Chiles * -John Kolojejchick -Todd Kaufmann + -Dave McDonald * -Skef Wholey * -\end{display} - - -\vspace{2 em} -\researchcredit - -\begin{changebar} - From 1995, development of CMU Common Lisp has been continued by a - group of volunteers. A partial list of volunteers includes the - following - \begin{table}[h] - \begin{center} - \begin{tabular}{ll} - Paul Werkowski & pw@snoopy.mv.com \\ - Peter VanEynde & s950045@uia.ua.ac.be \\ - Marco Antoniotti & marcoxa@PATH.Berkeley.EDU\\ - Martin Cracauer & cracauer@cons.org\\ - Douglas Thomas Crosher & dtc@scrooge.ee.swin.oz.au\\ - Simon Leinen & simon@switch.ch\\ - Rob MacLachlan & ram+@CS.cmu.edu\\ - Raymond Toy & toy@rtp.ericsson.se - \end{tabular} - \end{center} - \end{table} - - In particular Paul Werkowski completed the port for the x86 - architecture for FreeBSD. Peter VanEnyde took the FreeBSD port and - created a Linux version. 
-\end{changebar} - - -\hide{File:/afs/cs.cmu.edu/project/clisp/hackers/ram/docs/cmu-user/design.ms} - -\hide{ -*- Dictionary: cmu-user -*- } -%%\node Design Choices and Extensions, The Debugger, Introduction, Top -\chapter{Design Choices and Extensions} - -Several design choices in Common Lisp are left to the individual -implementation, and some essential parts of the programming environment -are left undefined. This chapter discusses the most important design -choices and extensions. - -\begin{comment} -* Data Types:: -* Default Interrupts for Lisp:: -* Packages:: -* The Editor:: -* Garbage Collection:: -* Describe:: -* The Inspector:: -* Load:: -* The Reader:: -* Running Programs from Lisp:: -* Saving a Core Image:: -* Pathnames:: -* Filesystem Operations:: -* Time Parsing and Formatting:: -* Lisp Library:: -\end{comment} - -%%\node Data Types, Default Interrupts for Lisp, Design Choices and Extensions, Design Choices and Extensions -\section{Data Types} - -\begin{comment} -* Symbols:: -* Integers:: -* Floats:: -* Characters:: -* Array Initialization:: -\end{comment} - -%%\node Symbols, Integers, Data Types, Data Types -\subsection{Symbols} - -As in \cltl, all symbols and package names are printed in lower case, as -a user is likely to type them. Internally, they are normally stored -upper case only. - -%%\node Integers, Floats, Symbols, Data Types -\subsection{Integers} - -The \tindexed{fixnum} type is equivalent to \code{(signed-byte 30)}. -Integers outside this range are represented as a \tindexed{bignum} or -a word integer (\pxlref{word-integers}.) Almost all integers that -appear in programs can be represented as a \code{fixnum}, so integer -number consing is rare. - -%%\node Floats, Characters, Integers, Data Types -\subsection{Floats} -\label{ieee-float} - -\cmucl{} supports two floating point formats: \tindexed{single-float} -and \tindexed{double-float}. These are implemented with IEEE single -and double float arithmetic, respectively. 
\code{short-float} is a -synonym for \code{single-float}, and \code{long-float} is a synonym -for \code{double-float}. The initial value of -\vindexed{read-default-float-format} is \code{single-float}. - -Both \code{single-float} and \code{double-float} are represented with -a pointer descriptor, so float operations can cause number consing. -Number consing is greatly reduced if programs are written to allow the -use of non-descriptor representations (\pxlref{numeric-types}.) - - -\begin{comment} -* IEEE Special Values:: -* Negative Zero:: -* Denormalized Floats:: -* Floating Point Exceptions:: -* Floating Point Rounding Mode:: -* Accessing the Floating Point Modes:: -\end{comment} - -%%\node IEEE Special Values, Negative Zero, Floats, Floats -\subsubsection{IEEE Special Values} - -\cmucl{} supports the IEEE infinity and NaN special values. These -non-numeric values will only be generated when trapping is disabled -for some floating point exception (\pxlref{float-traps}), so users of -the default configuration need not concern themselves with special -values. - -\begin{defconst}{extensions:}{short-float-positive-infinity} - \defconstx[extensions:]{short-float-negative-infinity} - \defconstx[extensions:]{single-float-positive-infinity} - \defconstx[extensions:]{single-float-negative-infinity} - \defconstx[extensions:]{double-float-positive-infinity} - \defconstx[extensions:]{double-float-negative-infinity} - \defconstx[extensions:]{long-float-positive-infinity} - \defconstx[extensions:]{long-float-negative-infinity} - - The values of these constants are the IEEE positive and negative - infinity objects for each float format. -\end{defconst} - -\begin{defun}{extensions:}{float-infinity-p}{\args{\var{x}}} - - This function returns true if \var{x} is an IEEE float infinity (of - either sign.) \var{x} must be a float. 
-\end{defun} - -\begin{defun}{extensions:}{float-nan-p}{\args{\var{x}}} - \defunx[extensions:]{float-trapping-nan-p}{\args{\var{x}}} - - \code{float-nan-p} returns true if \var{x} is an IEEE NaN (Not A - Number) object. \code{float-trapping-nan-p} returns true only if - \var{x} is a trapping NaN. With either function, \var{x} must be a - float. -\end{defun} - -%%\node Negative Zero, Denormalized Floats, IEEE Special Values, Floats -\subsubsection{Negative Zero} - -The IEEE float format provides for distinct positive and negative -zeros. To test the sign on zero (or any other float), use the -\clisp{} \findexed{float-sign} function. Negative zero prints as -\code{-0.0f0} or \code{-0.0d0}. - -%%\node Denormalized Floats, Floating Point Exceptions, Negative Zero, Floats -\subsubsection{Denormalized Floats} - -\cmucl{} supports IEEE denormalized floats. Denormalized floats -provide a mechanism for gradual underflow. The \clisp{} -\findexed{float-precision} function returns the actual precision of a -denormalized float, which will be less than \findexed{float-digits}. -Note that in order to generate (or even print) denormalized floats, -trapping must be disabled for the underflow exception -(\pxlref{float-traps}.) The \clisp{} -\w{\code{least-positive-}\var{format}-\code{float}} constants are -denormalized. - -\begin{defun}{extensions:}{float-normalized-p}{\args{\var{x}}} - - This function returns true if \var{x} is a denormalized float. - \var{x} must be a float. -\end{defun} - -%%\node Floating Point Exceptions, Floating Point Rounding Mode, Denormalized Floats, Floats -\subsubsection{Floating Point Exceptions} -\label{float-traps} - -The IEEE floating point standard defines several exceptions that occur -when the result of a floating point operation is unclear or -undesirable. Exceptions can be ignored, in which case some default -action is taken, such as returning a special value. 
When trapping is -enabled for an exception, an error is signalled whenever that exception -occurs. These are the possible floating point exceptions: -\begin{Lentry} - -\item[\kwd{underflow}] This exception occurs when the result of an - operation is too small to be represented as a normalized float in - its format. If trapping is enabled, the - \tindexed{floating-point-underflow} condition is signalled. - Otherwise, the operation results in a denormalized float or zero. - -\item[\kwd{overflow}] This exception occurs when the result of an - operation is too large to be represented as a float in its format. - If trapping is enabled, the \tindexed{floating-point-overflow} - condition is signalled. Otherwise, the operation results in the - appropriate infinity. - -\item[\kwd{inexact}] This exception occurs when the result of a - floating point operation is not exact, i.e. the result was rounded. - If trapping is enabled, the \code{extensions:floating-point-inexact} - condition is signalled. Otherwise, the rounded result is returned. - -\item[\kwd{invalid}] This exception occurs when the result of an - operation is ill-defined, such as \code{\w{(/ 0.0 0.0)}}. If - trapping is enabled, the \code{extensions:floating-point-invalid} - condition is signalled. Otherwise, a quiet NaN is returned. - -\item[\kwd{divide-by-zero}] This exception occurs when a float is - divided by zero. If trapping is enabled, the - \tindexed{divide-by-zero} condition is signalled. Otherwise, the - appropriate infinity is returned. -\end{Lentry} - -%%\node Floating Point Rounding Mode, Accessing the Floating Point Modes, Floating Point Exceptions, Floats -\subsubsection{Floating Point Rounding Mode} -\label{float-rounding-modes} - -IEEE floating point specifies four possible rounding modes: -\begin{Lentry} - -\item[\kwd{nearest}] In this mode, the inexact results are rounded to - the nearer of the two possible result values. 
If neither - possibility is nearer, then the even alternative is chosen. This - form of rounding is also called ``round to even'', and is the form - of rounding specified for the \clisp{} \findexed{round} function. - -\item[\kwd{positive-infinity}] This mode rounds inexact results to the - possible value closer to positive infinity. This is analogous to - the \clisp{} \findexed{ceiling} function. - -\item[\kwd{negative-infinity}] This mode rounds inexact results to the - possible value closer to negative infinity. This is analogous to - the \clisp{} \findexed{floor} function. - -\item[\kwd{zero}] This mode rounds inexact results to the possible - value closer to zero. This is analogous to the \clisp{} - \findexed{truncate} function. -\end{Lentry} - -\paragraph{Warning:} - -Although the rounding mode can be changed with -\code{set-floating-point-modes}, use of any value other than the -default (\kwd{nearest}) can cause unusual behavior, since it will -affect rounding done by \llisp{} system code as well as rounding in -user code. In particular, the unary \code{round} function will stop -doing round-to-nearest on floats, and instead do the selected form of -rounding. - -%%\node Accessing the Floating Point Modes, , Floating Point Rounding Mode, Floats -\subsubsection{Accessing the Floating Point Modes} - -These functions can be used to modify or read the floating point modes: - -\begin{defun}{extensions:}{set-floating-point-modes}{% - \keys{\kwd{traps} \kwd{rounding-mode}} - \morekeys{\kwd{fast-mode} \kwd{accrued-exceptions}} - \yetmorekeys{\kwd{current-exceptions}}} - \defunx[extensions:]{get-floating-point-modes}{} - - The keyword arguments to \code{set-floating-point-modes} set various - modes controlling how floating point arithmetic is done: - \begin{Lentry} - - \item[\kwd{traps}] A list of the exception conditions that should - cause traps. Possible exceptions are \kwd{underflow}, - \kwd{overflow}, \kwd{inexact}, \kwd{invalid} and - \kwd{divide-by-zero}. 
Initially all traps except \kwd{inexact} - are enabled. \xlref{float-traps}. - - \item[\kwd{rounding-mode}] The rounding mode to use when the result - is not exact. Possible values are \kwd{nearest}, - \latex{\kwd{positive\-infinity}}\html{\kwd{positive-infinity}}, - \kwd{negative-infinity} and \kwd{zero}. Initially, the rounding - mode is \kwd{nearest}. See the warning in section - \ref{float-rounding-modes} about use of other rounding modes. - - \item[\kwd{current-exceptions}, \kwd{accrued-exceptions}] Lists of - exception keywords used to set the exception flags. The - \var{current-exceptions} are the exceptions for the previous - operation, so setting it is not very useful. The - \var{accrued-exceptions} are a cumulative record of the exceptions - that occurred since the last time these flags were cleared. - Specifying \code{()} will clear any accrued exceptions. - - \item[\kwd{fast-mode}] Set the hardware's ``fast mode'' flag, if - any. When set, IEEE conformance or debuggability may be impaired. - Some machines may not have this feature, in which case the value - is always \false. No currently supported machines have a fast - mode. - \end{Lentry} - If a keyword argument is not supplied, then the associated state is - not changed. - - \code{get-floating-point-modes} returns a list representing the - state of the floating point modes. The list is in the same format - as the keyword arguments to \code{set-floating-point-modes}, so - \code{apply} could be used with \code{set-floating-point-modes} to - restore the modes in effect at the time of the call to - \code{get-floating-point-modes}. -\end{defun} - -\begin{changebar} -To make it easier to control floating-point exceptions, the following -macro is useful. - -\begin{defmac}{ext:}{with-float-traps-masked}{traps \ampbody\ body} - \code{body} is executed with the selected floating-point exceptions - given by \code{traps} masked out (disabled). 
\code{traps} should be - a list of possible floating-point exceptions that should be ignored. - Possible values are \kwd{underflow}, \kwd{overflow}, \kwd{inexact}, - \kwd{invalid} and \kwd{divide-by-zero}. - - This is equivalent to saving the current traps from - \code{get-floating-point-modes}, setting the floating-point modes to - the desired exceptions, running the \code{body}, and restoring the - saved floating-point modes. The advantage of this macro is that it - causes less consing to occur. - - Some points about \code{with-float-traps-masked}: - - \begin{itemize} - \item Two approaches are available for detecting FP exceptions: - \begin{enumerate} - \item enabling the traps and handling the exceptions - \item disabling the traps and either handling the return values or - checking the accrued exceptions. - \end{enumerate} - Of these the latter is the more portable because on the alpha port - it is not possible to enable some traps at run-time. - - \item To assist the checking of the exceptions within the body any - accrued exceptions matching the given traps are cleared at the - start of the body when the traps are masked. - - \item To allow the macros to be nested these accrued exceptions are - restored at the end of the body to their values at the start of - the body. Thus any exceptions that occurred within the body will - not affect the accrued exceptions outside the macro. - - \item Note that only the given exceptions are restored at the end of - the body so other exceptions will be visible in the accrued - exceptions outside the body. - - \item On the x86, setting the accrued exceptions of an unmasked - exception would cause a FP trap. The macro behaviour of restoring - the accrued exceptions ensures that if an accrued exception is - initially not flagged and occurs within the body it will be - restored/cleared at the exit of the body and thus not cause a - trap. 
- - \item On the x86, and, perhaps, the hppa, the FP exceptions may be - delivered at the next FP instruction which requires a FP - \code{wait} instruction (\code{%vm::float-wait}) if using the lisp - conditions to catch trap within a \code{handler-bind}. The - \code{handler-bind} macro does the right thing and inserts a - float-wait (at the end of its body on the x86). The masking and - noting of exceptions is also safe here. - - \item The setting of the FP flags uses the - \code{(floating-point-modes)} and the \code{(set - (floating-point-modes)\ldots)} VOPs. These VOPs blindly update - the flags which may include other state. We assume this state - hasn't changed in between getting and setting the state. For - example, if you used the FP unit between the above calls, the - state may be incorrectly restored! The - \code{with-float-traps-masked} macro keeps the intervening code to - a minimum and uses only integer operations. - %% Safe byte-compiled? - %% Perhaps the VOPs (x86) should be smarter and only update some of - %% the flags, the trap masks and exceptions? - \end{itemize} - -\end{defmac} -\end{changebar} - -%%\node Characters, Array Initialization, Floats, Data Types -\subsection{Characters} - -\cmucl{} implements characters according to \i{Common Lisp: the - Language II}. The main difference from the first version is that -character bits and font have been eliminated, and the names of the -types have been changed. \tindexed{base-character} is the new -equivalent of the old \tindexed{string-char}. In this implementation, -all characters are base characters (there are no extended characters.) -Character codes range between \code{0} and \code{255}, using the ASCII -encoding. -\begin{changebar} - Table~\ref{tbl:chars}~\vpageref{tbl:chars} shows characters - recognized by \cmucl. 
-\end{changebar} - -\begin{changebar} -\begin{table}[tbhp] - \begin{center} - \begin{tabular}{|c|c|l|l|l|l|} - \hline - \multicolumn{2}{|c|}{ASCII} & \multicolumn{1}{|c}{Lisp} & - \multicolumn{3}{|c|}{} \\ - \cline{1-2} - Name & Code & \multicolumn{1}{|c|}{Name} & \multicolumn{3}{|c|}{\raisebox{1.5ex}{Alternatives}}\\ - \hline - \hline - \code{nul} & 0 & \code{\#\back{NULL}} & \code{\#\back{NUL}} & &\\ - \code{bel} & 7 & \code{\#\back{BELL}} & & &\\ - \code{bs} & 8 & \code{\#\back{BACKSPACE}} & \code{\#\back{BS}} & &\\ - \code{tab} & 9 & \code{\#\back{TAB}} & & &\\ - \code{lf} & 10 & \code{\#\back{NEWLINE}} & \code{\#\back{NL}} & \code{\#\back{LINEFEED}} & \code{\#\back{LF}}\\ - \code{ff} & 11 & \code{\#\back{VT}} & \code{\#\back{PAGE}} & \code{\#\back{FORM}} &\\ - \code{cr} & 13 & \code{\#\back{RETURN}} & \code{\#\back{CR}} & &\\ - \code{esc} & 27 & \code{\#\back{ESCAPE}} & \code{\#\back{ESC}} & \code{\#\back{ALTMODE}} & \code{\#\back{ALT}}\\ - \code{sp} & 32 & \code{\#\back{SPACE}} & \code{\#\back{SP}} & &\\ - \code{del} & 127 & \code{\#\back{DELETE}} & \code{\#\back{RUBOUT}} & &\\ - \hline - \end{tabular} - \caption{Characters recognized by \cmucl} - \label{tbl:chars} - \end{center} -\end{table} -\end{changebar} - -%%\node Array Initialization, , Characters, Data Types -\subsection{Array Initialization} - -If no \kwd{initial-value} is specified, arrays are initialized to zero. - - -%%\node Default Interrupts for Lisp, Packages, Data Types, Design Choices and Extensions -\section{Default Interrupts for Lisp} - -CMU Common Lisp has several interrupt handlers defined when it starts up, -as follows: -\begin{Lentry} - -\item[\code{SIGINT} (\ctrl{c})] causes Lisp to enter a break loop. - This puts you into the debugger which allows you to look at the - current state of the computation. If you proceed from the break - loop, the computation will proceed from where it was interrupted. - -\item[\code{SIGQUIT} (\ctrl{L})] causes Lisp to do a throw to the - top-level. 
This causes the current computation to be aborted, and - control returned to the top-level read-eval-print loop. - -\item[\code{SIGTSTP} (\ctrl{z})] causes Lisp to suspend execution and - return to the Unix shell. If control is returned to Lisp, the - computation will proceed from where it was interrupted. - -\item[\code{SIGILL}, \code{SIGBUS}, \code{SIGSEGV}, and \code{SIGFPE}] - cause Lisp to signal an error. -\end{Lentry} -For keyboard interrupt signals, the standard interrupt character is in -parentheses. Your \file{.login} may set up different interrupt -characters. When a signal is generated, there may be some delay before -it is processed since Lisp cannot be interrupted safely in an arbitrary -place. The computation will continue until a safe point is reached and -then the interrupt will be processed. \xlref{signal-handlers} to define -your own signal handlers. - -%%\node Packages, The Editor, Default Interrupts for Lisp, Design Choices and Extensions -\section{Packages} - -When CMU Common Lisp is first started up, the default package is the -\code{user} package. The \code{user} package uses the -\code{common-lisp}, \code{extensions}, and \code{pcl} packages. The -symbols exported from these three packages can be referenced without -package qualifiers. This section describes packages which have -exported interfaces that may concern users. The numerous internal -packages which implement parts of the system are not described here. -Package nicknames are in parentheses after the full name. -\begin{Lentry} -\item[\code{alien}, \code{c-call}] Export the features of the Alien - foreign data structure facility (\pxlref{aliens}.) - -\item[\code{pcl}] This package contains PCL (Portable CommonLoops), - which is a portable implementation of CLOS (the Common Lisp Object - System.) This implements most (but not all) of the features in the - CLOS chapter of \cltltwo. - -\item[\code{debug}] The \code{debug} package contains the command-line - oriented debugger. 
It exports various utility functions and - switches. - -\item[\code{debug-internals}] The \code{debug-internals} package - exports the primitives used to write debuggers. - \xlref{debug-internals}. - -\item[\code{extensions (ext)}] The \code{extensions} package exports - local extensions to Common Lisp that are documented in this manual. - Examples include the \code{save-lisp} function and time parsing. - -\item[\code{hemlock (ed)}] The \code{hemlock} package contains all the - code to implement Hemlock commands. The \code{hemlock} package - currently exports no symbols. - -\item[\code{hemlock-internals (hi)}] The \code{hemlock-internals} - package contains code that implements low level primitives and - exports those symbols used to write Hemlock commands. - -\item[\code{keyword}] The \code{keyword} package contains keywords - (e.g., \kwd{start}). All symbols in the \code{keyword} package are - exported and evaluate to themselves (i.e., the value of the symbol - is the symbol itself). - -\item[\code{profile}] The \code{profile} package exports a simple - run-time profiling facility (\pxlref{profiling}). - -\item[\code{common-lisp (cl lisp)}] The \code{common-lisp} package - exports all the symbols defined by \i{Common Lisp: the Language} and - only those symbols. Strictly portable Lisp code will depend only on - the symbols exported from the \code{lisp} package. - -\item[\code{unix}, \code{mach}] These packages export system call - interfaces to generic BSD Unix and Mach (\pxlref{unix-interface}). - -\item[\code{system (sys)}] The \code{system} package contains - functions and information necessary for system interfacing. This - package is used by the \code{lisp} package and exports several - symbols that are necessary to interface to system code. - -\item[\code{common-lisp-user (user cl-user)}] The - \code{common-lisp-user} package is the default package and is where - a user's code and data is placed unless otherwise specified. 
This - package exports no symbols. - -\item[\code{xlib}] The \code{xlib} package contains the Common Lisp X - interface (CLX) to the X11 protocol. This is mostly Lisp code with - a couple of functions that are defined in C to connect to the - server. - -\item[\code{wire}] The \code{wire} package exports a remote procedure - call facility (\pxlref{remote}). -\end{Lentry} - - -%%\node The Editor, Garbage Collection, Packages, Design Choices and Extensions -\section{The Editor} - -The \code{ed} function invokes the Hemlock editor which is described -in \i{Hemlock User's Manual} and \i{Hemlock Command Implementor's - Manual}. Most users at CMU prefer to use Hemlock's slave \Llisp{} -mechanism which provides an interactive buffer for the -\code{read-eval-print} loop and editor commands for evaluating and -compiling text from a buffer into the slave \Llisp. Since the editor -runs in the \Llisp, using slaves keeps users from trashing their -editor by developing in the same \Llisp{} with \Hemlock. - - -%%\node Garbage Collection, Describe, The Editor, Design Choices and Extensions -\section{Garbage Collection} - -CMU Common Lisp uses a stop-and-copy garbage collector that compacts -the items in dynamic space every time it runs. Most users cause the -system to garbage collect (GC) frequently, long before space is -exhausted. With 16 or 24 megabytes of memory, causing GC's more -frequently on less garbage allows the system to GC without much (if -any) paging. - -\hide{ -With the default value for the following variable, you can expect a GC to take -about one minute of elapsed time on a 6 megabyte machine running X as well as -Lisp. On machines with 8 megabytes or more of memory a GC should run without -much (if any) paging. GC's run more frequently but tend to take only about 5 -seconds. 
-} - -The following functions invoke the garbage collector or control whether -automatic garbage collection is in effect: - -\begin{defun}{extensions:}{gc}{} - - This function runs the garbage collector. If - \code{ext:*gc-verbose*} is non-\nil, then it invokes - \code{ext:*gc-notify-before*} before GC'ing and - \code{ext:*gc-notify-after*} afterwards. -\end{defun} - -\begin{defun}{extensions:}{gc-off}{} - - This function inhibits automatic garbage collection. After calling - it, the system will not GC unless you call \code{ext:gc} or - \code{ext:gc-on}. -\end{defun} - -\begin{defun}{extensions:}{gc-on}{} - - This function reinstates automatic garbage collection. If the - system would have GC'ed while automatic GC was inhibited, then this - will call \code{ext:gc}. -\end{defun} - -%%\node -\subsection{GC Parameters} -The following variables control the behavior of the garbage collector: - -\begin{defvar}{extensions:}{bytes-consed-between-gcs} - - CMU Common Lisp automatically GC's whenever the amount of memory - allocated to dynamic objects exceeds the value of an internal - variable. After each GC, the system sets this internal variable to - the amount of dynamic space in use at that point plus the value of - the variable \code{ext:*bytes-consed-between-gcs*}. The default - value is 2000000. -\end{defvar} - -\begin{defvar}{extensions:}{gc-verbose} - - This variable controls whether \code{ext:gc} invokes the functions - in \code{ext:*gc-notify-before*} and - \code{ext:*gc-notify-after*}. If \code{*gc-verbose*} is \nil, - \code{ext:gc} foregoes printing any messages. The default value is - \code{T}. -\end{defvar} - -\begin{defvar}{extensions:}{gc-notify-before} - - This variable's value is a function that should notify the user that - the system is about to GC. It takes one argument, the amount of - dynamic space in use before the GC measured in bytes. 
The default - value of this variable is a function that prints a message similar - to the following: -\begin{display} - \b{[GC threshold exceeded with 2,107,124 bytes in use. Commencing GC.]} -\end{display} -\end{defvar} - -\begin{defvar}{extensions:}{gc-notify-after} - - This variable's value is a function that should notify the user when - a GC finishes. The function must take three arguments, the amount - of dynamic space retained by the GC, the amount of dynamic space - freed, and the new threshold which is the minimum amount of space in - use before the next GC will occur. All values are byte quantities. - The default value of this variable is a function that prints a - message similar to the following: - \begin{display} - \b{[GC completed with 25,680 bytes retained and 2,096,808 bytes freed.]} - \b{[GC will next occur when at least 2,025,680 bytes are in use.]} - \end{display} -\end{defvar} - -Note that a garbage collection will not happen at exactly the new -threshold printed by the default \code{ext:*gc-notify-after*} -function. The system periodically checks whether this threshold has -been exceeded, and only then does a garbage collection. - -\begin{defvar}{extensions:}{gc-inhibit-hook} - - This variable's value is either a function of one argument or \nil. - When the system has triggered an automatic GC, if this variable is a - function, then the system calls the function with the amount of - dynamic space currently in use (measured in bytes). If the function - returns \nil, then the GC occurs; otherwise, the system inhibits - automatic GC as if you had called \code{ext:gc-off}. The writer of - this hook is responsible for knowing when automatic GC has been - turned off and for calling or providing a way to call - \code{ext:gc-on}. The default value of this variable is \nil. 
-\end{defvar} - -\begin{defvar}{extensions:}{before-gc-hooks} - \defvarx[extensions:]{after-gc-hooks} - - These variables' values are lists of functions to call before or - after any GC occurs. The system provides these purely for - side-effect, and the functions take no arguments. -\end{defvar} - -%%\node -\subsection{Weak Pointers} - -A weak pointer provides a way to maintain a reference to an object -without preventing an object from being garbage collected. If the -garbage collector discovers that the only pointers to an object are -weak pointers, then it breaks the weak pointers and deallocates the -object. - -\begin{defun}{extensions:}{make-weak-pointer}{\args{\var{object}}} - \defunx[extensions:]{weak-pointer-value}{\args{\var{weak-pointer}}} - - \code{make-weak-pointer} returns a weak pointer to an object. - \code{weak-pointer-value} follows a weak pointer, returning the two - values: the object pointed to (or \false{} if broken) and a boolean - value which is true if the pointer has been broken. -\end{defun} - -%%\node -\subsection{Finalization} - -Finalization provides a ``hook'' that is triggered when the garbage -collector reclaims an object. It is usually used to recover non-Lisp -resources that were allocated to implement the finalized Lisp object. -For example, when a unix file-descriptor stream is collected, -finalization is used to close the underlying file descriptor. - -\begin{defun}{extensions:}{finalize}{\args{\var{object} \var{function}}} - - This function registers \var{object} for finalization. - \var{function} is called with no arguments when \var{object} is - reclaimed. Normally \var{function} will be a closure over the - underlying state that needs to be freed, e.g. the unix file - descriptor in the fd-stream case. Note that \var{function} must not - close over \var{object} itself, as this prevents the object from - ever becoming garbage. 
-\end{defun} - -\begin{defun}{extensions:}{cancel-finalization}{\args{\var{object}}} - - This function cancels any finalization request for \var{object}. -\end{defun} - -%%\node Describe, The Inspector, Garbage Collection, Design Choices and Extensions -\section{Describe} - -In addition to the basic function described below, there are a number of -switches and other things that can be used to control \code{describe}'s -behavior. - -\begin{defun}{}{describe}{ \args{\var{object} \&optional{} \var{stream}}} - - The \code{describe} function prints useful information about - \var{object} on \var{stream}, which defaults to - \code{*standard-output*}. For any object, \code{describe} will - print out the type. Then it prints other information based on the - type of \var{object}. The types which are presently handled are: - - \begin{Lentry} - - \item[\tindexed{hash-table}] \code{describe} prints the number of - entries currently in the hash table and the number of buckets - currently allocated. - - \item[\tindexed{function}] \code{describe} prints a list of the - function's name (if any) and its formal parameters. If the name - has function documentation, then it will be printed. If the - function is compiled, then the file where it is defined will be - printed as well. - - \item[\tindexed{fixnum}] \code{describe} prints whether the integer - is prime or not. - - \item[\tindexed{symbol}] The symbol's value, properties, and - documentation are printed. If the symbol has a function - definition, then the function is described. - \end{Lentry} - If there is anything interesting to be said about some component of - the object, \code{describe} will invoke itself recursively to describe that - object. The level of recursion is indicated by indenting output. -\end{defun} - -\begin{defvar}{extensions:}{describe-level} - - The maximum level of recursive description allowed. Initially two.
-\end{defvar} - -\begin{defvar}{extensions:}{describe-indentation} - -The number of spaces to indent for each level of recursive -description, initially three. -\end{defvar} - -\begin{defvar}{extensions:}{describe-print-level} - \defvarx[extensions:]{describe-print-length} - - The values of \code{*print-level*} and \code{*print-length*} during - description. Initially two and five. -\end{defvar} - -%%\node The Inspector, Load, Describe, Design Choices and Extensions -\section{The Inspector} - -\cmucl{} has both a graphical inspector that uses X windows and a simple -terminal-based inspector. - -\begin{defun}{}{inspect}{ \args{\ampoptional{} \var{object}}} - - \code{inspect} calls the inspector on the optional argument - \var{object}. If \var{object} is unsupplied, \code{inspect} - immediately returns \false. Otherwise, the behavior of inspect - depends on whether Lisp is running under X. When \code{inspect} is - eventually exited, it returns some selected Lisp object. -\end{defun} - -\begin{comment} -* The Graphical Interface:: -* The TTY Inspector:: -\end{comment} - -%%\node The Graphical Interface, The TTY Inspector, The Inspector, The Inspector -\subsection{The Graphical Interface} -\label{motif-interface} - -CMU Common Lisp has an interface to Motif which is functionally similar to -CLM, but works better in CMU CL. See: -\begin{example} -\file{doc/motif-toolkit.doc} -\file{doc/motif-internals.doc} -\end{example} - -This motif interface has been used to write the inspector and graphical -debugger. There is also a Lisp control panel with a simple file management -facility, apropos and inspector dialogs, and controls for setting global -options. See the \code{interface} and \code{toolkit} packages. - -\begin{defun}{interface:}{lisp-control-panel}{} - - This function creates a control panel for the Lisp process. 
-\end{defun} - -\begin{defvar}{interface:}{interface-style} - - When the graphical interface is loaded, this variable controls - whether it is used by \code{inspect} and the error system. If the - value is \kwd{graphics} (the default) and the \code{DISPLAY} - environment variable is defined, the graphical inspector and - debugger will be invoked by \findexed{inspect} or when an error is - signalled. Possible values are \kwd{graphics} and \kwd{tty}. If the - value is \kwd{graphics}, but there is no X display, then we quietly - use the TTY interface. -\end{defvar} - -%%\node The TTY Inspector, , The Graphical Interface, The Inspector -\subsection{The TTY Inspector} - -If X is unavailable, a terminal inspector is invoked. The TTY inspector -is a crude interface to \code{describe} which allows objects to be -traversed and maintains a history. This inspector prints information -about an object and a numbered list of the components of the object. -The command-line based interface is a normal -\code{read}--\code{eval}--\code{print} loop, but an integer \var{n} -descends into the \var{n}'th component of the current object, and -symbols with these special names are interpreted as commands: -\begin{Lentry} -\item[U] Move back to the enclosing object. As you descend into the -components of an object, a stack of all the objects previously seen is -kept. This command pops you up one level of this stack. - -\item[Q, E] Return the current object from \code{inspect}. - -\item[R] Recompute object display, and print again. Useful if the -object may have changed. - -\item[D] Display again without recomputing. - -\item[H, ?] Show help message.
-\end{Lentry} - -%%\node Load, The Reader, The Inspector, Design Choices and Extensions -\section{Load} - -\begin{defun}{}{load}{% - \args{\var{filename} - \keys{\kwd{verbose} \kwd{print} \kwd{if-does-not-exist}} - \morekeys{\kwd{if-source-newer} \kwd{contents}}}} - - As in standard Common Lisp, this function loads a file containing - source or object code into the running Lisp. Several CMU extensions - have been made to \code{load} to conveniently support a variety of - program file organizations. \var{filename} may be a wildcard - pathname such as \file{*.lisp}, in which case all matching files are - loaded. - - If \var{filename} has a \code{pathname-type} (or extension), then - that exact file is loaded. If the file has no extension, then this - tells \code{load} to use a heuristic to load the ``right'' file. - The \code{*load-source-types*} and \code{*load-object-types*} - variables below are used to determine the default source and object - file types. If only the source or the object file exists (but not - both), then that file is quietly loaded. Similarly, if both the - source and object file exist, and the object file is newer than the - source file, then the object file is loaded. The value of the - \var{if-source-newer} argument is used to determine what action to - take when both the source and object files exist, but the object - file is out of date: - \begin{Lentry} - \item[\kwd{load-object}] The object file is loaded even though the - source file is newer. - - \item[\kwd{load-source}] The source file is loaded instead of the - older object file. - - \item[\kwd{compile}] The source file is compiled and then the new - object file is loaded. - - \item[\kwd{query}] The user is asked a yes or no question to - determine whether the source or object file is loaded. - \end{Lentry} - This argument defaults to the value of - \code{ext:*load-if-source-newer*} (initially \kwd{load-object}.) 
- - The \var{contents} argument can be used to override the heuristic - (based on the file extension) that normally determines whether to - load the file as a source file or an object file. If non-null, this - argument must be either \kwd{source} or \kwd{binary}, which forces - loading in source and binary mode, respectively. You really - shouldn't ever need to use this argument. -\end{defun} - -\begin{defvar}{extensions:}{load-source-types} - \defvarx[extensions:]{load-object-types} - - These variables are lists of possible \code{pathname-type} values - for source and object files to be passed to \code{load}. These - variables are only used when the file passed to \code{load} has no - type; in this case, the possible source and object types are used to - default the type in order to determine the names of the source and - object files. -\end{defvar} - -\begin{defvar}{extensions:}{load-if-source-newer} - - This variable determines the default value of the - \var{if-source-newer} argument to \code{load}. Its initial value is - \kwd{load-object}. -\end{defvar} - -%%\node The Reader, Stream Extensions, Load, Design Choices and Extensions -\section{The Reader} - -\begin{defvar}{extensions:}{ignore-extra-close-parentheses} - - If this variable is \true{} (the default), then the reader merely - prints a warning when an extra close parenthesis is detected - (instead of signalling an error.) -\end{defvar} - -%%\node Stream Extensions, Running Programs from Lisp, The Reader, Design Choices and Extensions -\section{Stream Extensions} -\begin{defun}{extensions:}{read-n-bytes}{% - \args{\var{stream buffer start numbytes} - \ampoptional{} \var{eof-error-p}}} - - On streams that support it, this function reads multiple bytes of - data into a buffer. The buffer must be a \code{simple-string} or - \code{(simple-array (unsigned-byte 8) (*))}. The argument - \var{numbytes} specifies the desired number of bytes, and the return - value is the number of bytes actually read.
- \begin{itemize} - \item If \var{eof-error-p} is true, an \tindexed{end-of-file} - condition is signalled if end-of-file is encountered before - \var{numbytes} bytes have been read. - - \item If \var{eof-error-p} is false, \code{read-n-bytes} reads as - much data as is currently available (up to \var{numbytes} bytes.) On pipes or - similar devices, this function returns as soon as any data is - available, even if the amount read is less than \var{numbytes} and - eof has not been hit. See also \funref{make-fd-stream}. - \end{itemize} -\end{defun} -%%\node Running Programs from Lisp, Saving a Core Image, The Reader, Design Choices and Extensions -\section{Running Programs from Lisp} - -It is possible to run programs from Lisp by using the following function. - -\begin{defun}{extensions:}{run-program}{% - \args{\var{program} \var{args} - \keys{\kwd{env} \kwd{wait} \kwd{pty} \kwd{input}} - \morekeys{\kwd{if-input-does-not-exist}} - \yetmorekeys{\kwd{output} \kwd{if-output-exists}} - \yetmorekeys{\kwd{error} \kwd{if-error-exists}} - \yetmorekeys{\kwd{status-hook} \kwd{before-execve}}}} - - \code{run-program} runs \var{program} in a child process. - \var{Program} should be a pathname or string naming the program. - \var{Args} should be a list of strings which this passes to - \var{program} as normal Unix parameters. For no arguments, specify - \var{args} as \nil. The value returned is either a process - structure or \nil. The process interface follows the description of - \code{run-program}. If \code{run-program} fails to fork the child - process, it returns \nil. - - Except for sharing file descriptors as explained in keyword argument - descriptions, \code{run-program} closes all file descriptors in the - child process before running the program. When you are done using a - process, call \code{process-close} to reclaim system resources. You - only need to do this when you supply \kwd{stream} for one of - \kwd{input}, \kwd{output}, or \kwd{error}, or you supply \kwd{pty} - non-\nil.
You can call \code{process-close} even when it is not - required; doing so carries no penalty, if you feel safer. - - \code{run-program} accepts the following keyword arguments: - \begin{Lentry} - - \item[\kwd{env}] This is an a-list mapping keywords to - simple-strings. The default is \code{ext:*environment-list*}. If - \kwd{env} is specified, \code{run-program} uses the value given - and does not combine the environment passed to Lisp with the one - specified. - - \item[\kwd{wait}] If non-\nil{} (the default), wait until the child - process terminates. If \nil, continue running Lisp while the - child process runs. - - \item[\kwd{pty}] This should be one of \true, \nil, or a stream. If - specified non-\nil, the subprocess executes under a Unix \i{PTY}. - If specified as a stream, the system collects all output to this - pty and writes it to this stream. If specified as \true, the - \code{process-pty} slot contains a stream from which you can read - the program's output and to which you can write input for the - program. The default is \nil. - - \item[\kwd{input}] This specifies how the program gets its input. - If specified as a string, it is the name of a file that contains - input for the child process. \code{run-program} opens the file as - standard input. If specified as \nil{} (the default), then - standard input is the file \file{/dev/null}. If specified as - \true, the program uses the current standard input. This may - cause some confusion if \kwd{wait} is \nil{} since two processes - may use the terminal at the same time. If specified as - \kwd{stream}, then the \code{process-input} slot contains an - output stream. Anything written to this stream goes to the - program as input. \kwd{input} may also be an input stream that - already contains all the input for the process. In this case - \code{run-program} reads all the input from this stream before - returning, so this cannot be used to interact with the process.
- - \item[\kwd{if-input-does-not-exist}] This specifies what to do if - the input file does not exist. The following values are valid: - \nil{} (the default) causes \code{run-program} to return \nil{} - without doing anything; \kwd{create} creates the named file; and - \kwd{error} signals an error. - - \item[\kwd{output}] This specifies what happens with the program's - output. If specified as a pathname, it is the name of a file that - contains output the program writes to its standard output. If - specified as \nil{} (the default), all output goes to - \file{/dev/null}. If specified as \true, the program writes to - the Lisp process's standard output. This may cause confusion if - \kwd{wait} is \nil{} since two processes may write to the terminal - at the same time. If specified as \kwd{stream}, then the - \code{process-output} slot contains an input stream from which you - can read the program's output. - - \item[\kwd{if-output-exists}] This specifies what to do if the - output file already exists. The following values are valid: - \nil{} causes \code{run-program} to return \nil{} without doing - anything; \kwd{error} (the default) signals an error; - \kwd{supersede} overwrites the current file; and \kwd{append} - appends all output to the file. - - \item[\kwd{error}] This is similar to \kwd{output}, except the file - becomes the program's standard error. Additionally, \kwd{error} - can be \kwd{output} in which case the program's error output is - routed to the same place specified for \kwd{output}. If specified - as \kwd{stream}, the \code{process-error} contains a stream - similar to the \code{process-output} slot when specifying the - \kwd{output} argument. - - \item[\kwd{if-error-exists}] This specifies what to do if the error - output file already exists. It accepts the same values as - \kwd{if-output-exists}. - - \item[\kwd{status-hook}] This specifies a function to call whenever - the process changes status. 
This is especially useful when - specifying \kwd{wait} as \nil. The function takes the process as - a required argument. - - \item[\kwd{before-execve}] This specifies a function to run in the - child process before it becomes the program to run. This is - useful for actions such as authenticating the child process - without modifying the parent Lisp process. - \end{Lentry} -\end{defun} - - -\begin{comment} -* Process Accessors:: -\end{comment} - -%%\node Process Accessors, , Running Programs from Lisp, Running Programs from Lisp -\subsection{Process Accessors} - -The following functions provide an interface to the process returned by \code{run-program}: - -\begin{defun}{extensions:}{process-p}{\args{\var{thing}}} - - This function returns \true{} if \var{thing} is a process. - Otherwise it returns \nil{}. -\end{defun} - -\begin{defun}{extensions:}{process-pid}{\args{\var{process}}} - - This function returns the process ID, an integer, for the - \var{process}. -\end{defun} - -\begin{defun}{extensions:}{process-status}{\args{\var{process}}} - - This function returns the current status of \var{process}, which is - one of \kwd{running}, \kwd{stopped}, \kwd{exited}, or - \kwd{signaled}. -\end{defun} - -\begin{defun}{extensions:}{process-exit-code}{\args{\var{process}}} - - This function returns either the exit code for \var{process}, if it - is \kwd{exited}, or the termination signal of \var{process}, if it is - \kwd{signaled}. The result is undefined for processes that are - still alive. -\end{defun} - -\begin{defun}{extensions:}{process-core-dumped}{\args{\var{process}}} - - This function returns \true{} if someone used a Unix signal to - terminate the \var{process} and caused it to dump a Unix core image. -\end{defun} - -\begin{defun}{extensions:}{process-pty}{\args{\var{process}}} - - This function returns either the two-way stream connected to - \var{process}'s Unix \i{PTY} connection or \nil{} if there is none.
-\end{defun} - -\begin{defun}{extensions:}{process-input}{\args{\var{process}}} - \defunx[extensions:]{process-output}{\args{\var{process}}} - \defunx[extensions:]{process-error}{\args{\var{process}}} - - If the corresponding stream was created, these functions return the - input, output or error file descriptor. \nil{} is returned if there - is no stream. -\end{defun} - -\begin{defun}{extensions:}{process-status-hook}{\args{\var{process}}} - - This function returns the current function to call whenever - \var{process}'s status changes. This function takes the - \var{process} as a required argument. \code{process-status-hook} is - \code{setf}'able. -\end{defun} - -\begin{defun}{extensions:}{process-plist}{\args{\var{process}}} - - This function returns annotations supplied by users, and it is - \code{setf}'able. This is available solely for users to associate - information with \var{process} without having to build a-lists or - hash tables of process structures. -\end{defun} - -\begin{defun}{extensions:}{process-wait}{ - \args{\var{process} \ampoptional{} \var{check-for-stopped}}} - - This function waits for \var{process} to finish. If - \var{check-for-stopped} is non-\nil, this also returns when - \var{process} stops. -\end{defun} - -\begin{defun}{extensions:}{process-kill}{% - \args{\var{process} \var{signal} \ampoptional{} \var{whom}}} - - This function sends the Unix \var{signal} to \var{process}. - \var{Signal} should be the number of the signal or a keyword with - the Unix name (for example, \kwd{sigsegv}). \var{Whom} should be - one of the following: - \begin{Lentry} - - \item[\kwd{pid}] This is the default, and it indicates sending the - signal to \var{process} only. - - \item[\kwd{process-group}] This indicates sending the signal to - \var{process}'s group. - - \item[\kwd{pty-process-group}] This indicates sending the signal to - the process group currently in the foreground on the Unix \i{PTY} - connected to \var{process}. 
This last option is useful if the - running program is a shell, and you wish to signal the program - running under the shell, not the shell itself. If - \code{process-pty} of \var{process} is \nil, using this option is - an error. - \end{Lentry} -\end{defun} - -\begin{defun}{extensions:}{process-alive-p}{\args{\var{process}}} - - This function returns \true{} if \var{process}'s status is either - \kwd{running} or \kwd{stopped}. -\end{defun} - -\begin{defun}{extensions:}{process-close}{\args{\var{process}}} - - This function closes all the streams associated with \var{process}. - When you are done using a process, call this to reclaim system - resources. -\end{defun} - - -%%\node Saving a Core Image, Pathnames, Running Programs from Lisp, Design Choices and Extensions -\section{Saving a Core Image} - -A mechanism has been provided to save a running Lisp core image and to -later restore it. This is convenient if you don't want to load several files -into a Lisp when you first start it up. The main problem is the large -size of each saved Lisp image, typically at least 20 megabytes. - -\begin{defun}{extensions:}{save-lisp}{% - \args{\var{file} - \keys{\kwd{purify} \kwd{root-structures} \kwd{init-function}} - \morekeys{\kwd{load-init-file} \kwd{print-herald} \kwd{site-init}} - \yetmorekeys{\kwd{process-command-line}}}} - - The \code{save-lisp} function saves the state of the currently - running Lisp core image in \var{file}. The keyword arguments have - the following meaning: - \begin{Lentry} - - \item[\kwd{purify}] If non-NIL (the default), the core image is - purified before it is saved (see \funref{purify}.) This reduces - the amount of work the garbage collector must do when the - resulting core image is being run. Also, if more than one Lisp is - running on the same machine, this maximizes the amount of memory - that can be shared between the two processes. 
- - \item[\kwd{root-structures}] - \begin{changebar} - This should be a list of the main entry points in any newly - loaded systems. This need not be supplied, but locality and/or - GC performance will be better if they are. Meaningless if - \kwd{purify} is \nil. See \funref{purify}. - \end{changebar} - - \item[\kwd{init-function}] This is the function that starts running - when the created core file is resumed. The default function - simply invokes the top level read-eval-print loop. If the - function returns the lisp will exit. - - \item[\kwd{load-init-file}] If non-NIL, then load an init file; - either the one specified on the command line or - ``\w{\file{init.}\var{fasl-type}}'', or, if - ``\w{\file{init.}\var{fasl-type}}'' does not exist, - \code{init.lisp} from the user's home directory. If the init file - is found, it is loaded into the resumed core file before the - read-eval-print loop is entered. - - \item[\kwd{site-init}] If non-NIL, the name of the site init file to - quietly load. The default is \file{library:site-init}. No error - is signalled if the file does not exist. - - \item[\kwd{print-herald}] If non-NIL (the default), then print out - the standard Lisp herald when starting. - - \item[\kwd{process-command-line}] If non-NIL (the default), - processes the command line switches and performs the appropriate - actions. - \end{Lentry} -\end{defun} - -To resume a saved file, type: -\begin{example} -lisp -core file -\end{example} - -\begin{defun}{extensions:}{purify}{ - \args{\var{file} - \keys{\kwd{root-structures} \kwd{environment-name}}}} - - This function optimizes garbage collection by moving all currently - live objects into non-collected storage. Once statically allocated, - the objects can never be reclaimed, even if all pointers to them are - dropped. This function should generally be called after a large - system has been loaded and initialized. 
- - \begin{Lentry} - \item[\kwd{root-structures}] is an optional list of objects which - should be copied first to maximize locality. This should be a - list of the main entry points for the resulting core image. The - purification process tries to localize symbols, functions, etc., - in the core image so that paging performance is improved. The - default value is NIL which means that Lisp objects will still be - localized but probably not as optimally as they could be. - - \var{defstruct} structures defined with the \code{(:pure t)} - option are moved into read-only storage, further reducing GC cost. - List and vector slots of pure structures are also moved into - read-only storage. - - \item[\kwd{environment-name}] is gratuitous documentation for the - compacted version of the current global environment (as seen in - \code{c::*info-environment*}.) If \false{} is supplied, then - environment compaction is inhibited. - \end{Lentry} -\end{defun} - -%%\node Pathnames, Filesystem Operations, Saving a Core Image, Design Choices and Extensions -\section{Pathnames} - -In \clisp{} quite a few aspects of \tindexed{pathname} semantics are left to -the implementation. - -\begin{comment} -* Unix Pathnames:: -* Wildcard Pathnames:: -* Logical Pathnames:: -* Search Lists:: -* Predefined Search-Lists:: -* Search-List Operations:: -* Search List Example:: -\end{comment} - -%%\node Unix Pathnames, Wildcard Pathnames, Pathnames, Pathnames -\subsection{Unix Pathnames} -\cpsubindex{unix}{pathnames} - -Unix pathnames are always parsed with a \code{unix-host} object as the host and -\code{nil} as the device. The last two dots (\code{.}) in the namestring mark -the type and version, however if the first character is a dot, it is considered -part of the name. If the last character is a dot, then the pathname has the -empty-string as its type. The type defaults to \code{nil} and the version -defaults to \kwd{newest}. 
-\begin{example} -(defun parse (x) - (values (pathname-name x) (pathname-type x) (pathname-version x))) - -(parse "foo") \result "foo", NIL, :NEWEST -(parse "foo.bar") \result "foo", "bar", :NEWEST -(parse ".foo") \result ".foo", NIL, :NEWEST -(parse ".foo.bar") \result ".foo", "bar", :NEWEST -(parse "..") \result ".", "", :NEWEST -(parse "foo.") \result "foo", "", :NEWEST -(parse "foo.bar.1") \result "foo", "bar", 1 -(parse "foo.bar.baz") \result "foo.bar", "baz", :NEWEST -\end{example} - -The directory of pathnames beginning with a slash (or a search-list, -\pxlref{search-lists}) starts with \kwd{absolute}; others start with -\kwd{relative}. The \code{..} directory is parsed as \kwd{up}; there is no -namestring for \kwd{back}: -\begin{example} -(pathname-directory "/usr/foo/bar.baz") \result (:ABSOLUTE "usr" "foo") -(pathname-directory "../foo/bar.baz") \result (:RELATIVE :UP "foo") -\end{example} - -%%\node Wildcard Pathnames, Logical Pathnames, Unix Pathnames, Pathnames -\subsection{Wildcard Pathnames} - -Wildcards are supported in Unix pathnames. If `\code{*}' is specified for a -part of a pathname, that is parsed as \kwd{wild}. `\code{**}' can be used as a -directory name to indicate \kwd{wild-inferiors}. Filesystem operations -treat \kwd{wild-inferiors} the same as\ \kwd{wild}, but pathname pattern -matching (e.g. for logical pathname translation, \pxlref{logical-pathnames}) -matches any number of directory parts with `\code{**}' (see -\pxlref{wildcard-matching}.) - - -`\code{*}' embedded in a pathname part matches any number of characters. -Similarly, `\code{?}' matches exactly one character, and `\code{[a,b]}' -matches the characters `\code{a}' or `\code{b}'. These pathname parts are -parsed as \code{pattern} objects. - -Backslash can be used as an escape character in namestring -parsing to prevent the next character from being treated as a wildcard.
Note -that if typed in a string constant, the backslash must be doubled, since the -string reader also uses backslash as a quote: -\begin{example} -(pathname-name "foo\(\backslash\backslash\)*bar") => "foo*bar" -\end{example} - -%%\node Logical Pathnames, Search Lists, Wildcard Pathnames, Pathnames -\subsection{Logical Pathnames} -\cindex{logical pathnames} -\label{logical-pathnames} - -If a namestring begins with the name of a defined logical pathname -host followed by a colon, then it will be parsed as a logical -pathname. Both `\code{*}' and `\code{**}' wildcards are implemented. -\findexed{load-logical-pathname-defaults} on \var{name} looks for a -logical host definition file in -\w{\file{library:\var{name}.translations}}. Note that \file{library:} -designates the search list (\pxlref{search-lists}) initialized to the -\cmucl{} \file{lib/} directory, not a logical pathname. The format of -the file is a single list of two-lists of the from and to patterns: -\begin{example} -(("foo;*.text" "/usr/ram/foo/*.txt") - ("foo;*.lisp" "/usr/ram/foo/*.l")) -\end{example} - -\begin{comment} -* Search Lists:: -* Search List Example:: -\end{comment} - -%%\node Search Lists, Predefined Search-Lists, Logical Pathnames, Pathnames -\subsection{Search Lists} -\cindex{search lists} -\label{search-lists} - -Search lists are an extension to Common Lisp pathnames. They serve a function -somewhat similar to Common Lisp logical pathnames, but work more like Unix PATH -variables. Search lists are used for two purposes: -\begin{itemize} -\item They provide a convenient shorthand for commonly used directory names, -and - -\item They allow the abstract (directory structure independent) specification -of file locations in program pathname constants (similar to logical pathnames.) -\end{itemize} -Each search list has an associated list of directories (represented as -pathnames with no name or type component.) 
The namestring for any relative -pathname may be prefixed with ``\var{slist}\code{:}'', indicating that the -pathname is relative to the search list \var{slist} (instead of to the current -working directory.) Once qualified with a search list, the pathname is no -longer considered to be relative. - -When a search list qualified pathname is passed to a file-system operation such -as \code{open}, \code{load} or \code{truename}, each directory in the search -list is successively used as the root of the pathname until the file is -located. When a file is written to a search list directory, the file is always -written to the first directory in the list. - -%%\node Predefined Search-Lists, Search-List Operations, Search Lists, Pathnames -\subsection{Predefined Search-Lists} - -These search-lists are initialized from the Unix environment or when Lisp was -built: -\begin{Lentry} -\item[\code{default:}] The current directory at startup. - -\item[\code{home:}] The user's home directory. - -\item[\code{library:}] The \cmucl{} \file{lib/} directory (\code{CMUCLLIB} environment -variable.) - -\item[\code{path:}] The Unix command path (\code{PATH} environment variable.) - -\item[\code{target:}] The root of the tree where \cmucl{} was compiled. -\end{Lentry} -It can be useful to redefine these search-lists, for example, \file{library:} -can be augmented to allow logical pathname translations to be located, and -\file{target:} can be redefined to point to where \cmucl{} system sources are -locally installed. - -%%\node Search-List Operations, Search List Example, Predefined Search-Lists, Pathnames -\subsection{Search-List Operations} - -These operations define and access search-list definitions. A search-list name -may be parsed into a pathname before the search-list is actually defined, but -the search-list must be defined before it can actually be used in a filesystem -operation. 
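As a rough sketch of that ordering (the search-list name \code{proj:} and the
directory are made up for illustration, and the exact behavior may differ
between releases):
\begin{example}
;; Hypothetical search list proj: -- parsing the name does not
;; require the search list to be defined yet.
(defvar *main-file* (pathname "proj:src/main.lisp"))

;; Define the search list before any filesystem operation uses it.
(unless (ext:search-list-defined-p "proj:")
  (setf (ext:search-list "proj:") '("/home/me/proj/")))

;; Only now can the pathname be resolved against the filesystem.
(probe-file *main-file*)
\end{example}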
- -\begin{defun}{extensions:}{search-list}{\var{name}} - - This function returns the list of directories associated with the - search list \var{name}. If \var{name} is not a defined search list, - then an error is signaled. When set with \code{setf}, the list of - directories is changed to the new value. If the new value is just a - namestring or pathname, then it is interpreted as a one-element - list. Note that, unlike Unix pathnames, search list names are - case-insensitive. -\end{defun} - -\begin{defun}{extensions:}{search-list-defined-p}{\var{name}} - \defunx[extensions:]{clear-search-list}{\var{name}} - - \code{search-list-defined-p} returns \true{} if \var{name} is a - defined search list name, \false{} otherwise. - \code{clear-search-list} makes the search list \var{name} undefined. -\end{defun} - -\begin{defmac}{extensions:}{enumerate-search-list}{% - \args{(\var{var} \var{pathname} \mopt{result}) \mstar{form}}} - - This macro provides an interface to search list resolution. The - body \var{forms} are executed with \var{var} bound to each - successive possible expansion for \var{pathname}. If \var{pathname} does - not contain a search-list, then the body is executed exactly once. - Everything is wrapped in a block named \nil, so \code{return} can be - used to terminate early. The \var{result} form (default \nil) is - evaluated to determine the result of the iteration. -\end{defmac} - -\begin{comment} -* Search List Example:: -\end{comment} - -%%\node Search List Example, , Search-List Operations, Pathnames -\subsection{Search List Example} - -The search list \code{code:} can be defined as follows: -\begin{example} -(setf (ext:search-list "code:") '("/usr/lisp/code/")) -\end{example} -It is now possible to use \code{code:} as an abbreviation for the directory -\file{/usr/lisp/code/} in all file operations. For example, you can now specify -\code{code:eval.lisp} to refer to the file \file{/usr/lisp/code/eval.lisp}.
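For instance, assuming the \code{code:} definition above and that a file
\file{eval.lisp} really exists in that directory (the file is hypothetical),
ordinary file operations should accept the abbreviated name:
\begin{example}
;; Both forms refer to /usr/lisp/code/eval.lisp once code: is defined.
(probe-file "code:eval.lisp")
(load "code:eval.lisp")
\end{example}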
- -To obtain the value of a search-list name, use the function \code{search-list} -as follows: -\begin{example} -(ext:search-list \var{name}) -\end{example} -where \var{name} is the name of a search list as described above. For example, -calling \code{ext:search-list} on \code{code:} as follows: -\begin{example} -(ext:search-list "code:") -\end{example} -returns the list \code{("/usr/lisp/code/")}. - -%%\node Filesystem Operations, Time Parsing and Formatting, Pathnames, Design Choices and Extensions -\section{Filesystem Operations} - -\cmucl{} provides a number of extensions and optional features beyond those -required by \clisp. - -\begin{comment} -* Wildcard Matching:: -* File Name Completion:: -* Miscellaneous Filesystem Operations:: -\end{comment} - -%%\node Wildcard Matching, File Name Completion, Filesystem Operations, Filesystem Operations -\subsection{Wildcard Matching} -\label{wildcard-matching} - -Unix filesystem operations such as \code{open} will accept wildcard pathnames -that match a single file (of course, \code{directory} allows any number of -matches.) Filesystem operations treat \kwd{wild-inferiors} the same as\ -\kwd{wild}. - -\begin{defun}{}{directory}{\var{wildname} \keys{\kwd{all} \kwd{check-for-subdirs}} - \morekeys{\kwd{follow-links}}} - - The keyword arguments to this \clisp{} function are a CMU extension. - The arguments (all default to \code{t}) have the following - functions: - \begin{Lentry} - \item[\kwd{all}] Include files beginning with dot such as - \file{.login}, similar to ``\code{ls -a}''. - - \item[\kwd{check-for-subdirs}] Test whether files are directories, - similar to ``\code{ls -F}''. - - \item[\kwd{follow-links}] Call \code{truename} on each file, which - expands out all symbolic links. Note that this option can easily - result in pathnames being returned which have a different - directory from the one in the \var{wildname} argument.
- \end{Lentry} -\end{defun} - -\begin{defun}{extensions:}{print-directory}{% - \args{\var{wildname} - \ampoptional{} \var{stream} - \keys{\kwd{all} \kwd{verbose}} - \morekeys{\kwd{return-list}}}} - - Print a directory listing of \var{wildname} to \var{stream} (default - \code{*standard-output*}.) \kwd{all} and \kwd{verbose} both default - to \false{} and correspond to the ``\code{-a}'' and ``\code{-l}'' - options of \file{ls}. Normally this function returns \false{}, but - if \kwd{return-list} is true, a list of the matched pathnames is - returned. -\end{defun} - -%%\node File Name Completion, Miscellaneous Filesystem Operations, Wildcard Matching, Filesystem Operations -\subsection{File Name Completion} - -\begin{defun}{extensions:}{complete-file}{% - \args{\var{pathname} - \keys{\kwd{defaults} \kwd{ignore-types}}}} - - Attempt to complete a file name to the longest unambiguous prefix. - If supplied, the directory from \kwd{defaults} is used as the ``working - directory'' when doing completion. \kwd{ignore-types} is a list of - strings of the pathname types (a.k.a. extensions) that should be - disregarded as possible matches (binary file names, etc.) -\end{defun} - -\begin{defun}{extensions:}{ambiguous-files}{% - \args{\var{pathname} - \ampoptional{} \var{defaults}}} - - Return a list of pathnames for all the possible completions of - \var{pathname} with respect to \var{defaults}. -\end{defun} - -%%\node Miscellaneous Filesystem Operations, , File Name Completion, Filesystem Operations -\subsection{Miscellaneous Filesystem Operations} - -\begin{defun}{extensions:}{default-directory}{} - - Return the current working directory as a pathname. If set with - \code{setf}, set the working directory. -\end{defun} - -\begin{defun}{extensions:}{file-writable}{\var{name}} - - This function accepts a pathname and returns \true{} if the current - process can write it, and \false{} otherwise.
-\end{defun} - -\begin{defun}{extensions:}{unix-namestring}{% - \args{\var{pathname} - \ampoptional{} \var{for-input}}} - - This function converts \var{pathname} into a string that can be used - with UNIX system calls. Search-lists and wildcards are expanded. - \var{for-input} controls the treatment of search-lists: when true - (the default) and the file exists anywhere on the search-list, then - that absolute pathname is returned; otherwise the first element of - the search-list is used as the directory. -\end{defun} - -%%\node Time Parsing and Formatting, Lisp Library, Filesystem Operations, Design Choices and Extensions -\section{Time Parsing and Formatting} - -\cindex{time parsing} \cindex{time formatting} -Functions are provided for parsing strings containing time information -and for printing time in various formats. - -\begin{defun}{extensions:}{parse-time}{% - \args{\var{time-string} - \keys{\kwd{error-on-mismatch} \kwd{default-seconds}} - \morekeys{\kwd{default-minutes} \kwd{default-hours}} - \yetmorekeys{\kwd{default-day} \kwd{default-month}} - \yetmorekeys{\kwd{default-year} \kwd{default-zone}} - \yetmorekeys{\kwd{default-weekday}}}} - - \code{parse-time} accepts a string containing a time (e.g., - \w{"\code{Jan 12, 1952}"}) and returns the universal time if it is - successful. If it is unsuccessful and the keyword argument - \kwd{error-on-mismatch} is non-\FALSE, it signals an error. - Otherwise it returns \FALSE. The other keyword arguments have the - following meaning: - \begin{Lentry} - - \item[\kwd{default-seconds}] specifies the default value for the - seconds value if one is not provided by \var{time-string}. The - default value is 0. - - \item[\kwd{default-minutes}] specifies the default value for the - minutes value if one is not provided by \var{time-string}. The - default value is 0. - - \item[\kwd{default-hours}] specifies the default value for the hours - value if one is not provided by \var{time-string}.
The default - value is 0. - - \item[\kwd{default-day}] specifies the default value for the day - value if one is not provided by \var{time-string}. The default - value is the current day. - - \item[\kwd{default-month}] specifies the default value for the month - value if one is not provided by \var{time-string}. The default - value is the current month. - - \item[\kwd{default-year}] specifies the default value for the year - value if one is not provided by \var{time-string}. The default - value is the current year. - - \item[\kwd{default-zone}] specifies the default value for the time - zone value if one is not provided by \var{time-string}. The - default value is the current time zone. - - \item[\kwd{default-weekday}] specifies the default value for the day - of the week if one is not provided by \var{time-string}. The - default value is the current day of the week. - \end{Lentry} - Any of the above keywords can be given the value \kwd{current} which - means to use the current value as determined by a call to the - operating system. -\end{defun} - -\begin{defun}{extensions:}{format-universal-time}{ - \args{\var{dest} \var{universal-time} - \\ - \keys{\kwd{timezone}} - \morekeys{\kwd{style} \kwd{date-first}} - \yetmorekeys{\kwd{print-seconds} \kwd{print-meridian}} - \yetmorekeys{\kwd{print-timezone} \kwd{print-weekday}}}} - \defunx[extensions:]{format-decoded-time}{ - \args{\var{dest} \var{seconds} \var{minutes} \var{hours} \var{day} \var{month} \var{year} - \\ - \keys{\kwd{timezone}} - \morekeys{\kwd{style} \kwd{date-first}} - \yetmorekeys{\kwd{print-seconds} \kwd{print-meridian}} - \yetmorekeys{\kwd{print-timezone} \kwd{print-weekday}}}} - - \code{format-universal-time} formats the time specified by - \var{universal-time}. \code{format-decoded-time} formats the time - specified by \var{seconds}, \var{minutes}, \var{hours}, \var{day}, - \var{month}, and \var{year}. \var{Dest} is any destination - accepted by the \code{format} function. 
The keyword arguments have - the following meaning: - \begin{Lentry} - - \item[\kwd{timezone}] is an integer specifying the hours west of - Greenwich. \kwd{timezone} defaults to the current time zone. - - \item[\kwd{style}] specifies the style to use in formatting the - time. The legal values are: - \begin{Lentry} - - \item[\kwd{short}] specifies to use a numeric date. - - \item[\kwd{long}] specifies to format months and weekdays as - words instead of numbers. - - \item[\kwd{abbreviated}] is similar to long except the words are - abbreviated. - - \item[\kwd{government}] is similar to abbreviated, except the - date is of the form ``day month year'' instead of ``month day, - year''. - \end{Lentry} - - \item[\kwd{date-first}] if non-\false{} (default) will place the - date first. Otherwise, the time is placed first. - - \item[\kwd{print-seconds}] if non-\false{} (default) will format - the seconds as part of the time. Otherwise, the seconds will be - omitted. - - \item[\kwd{print-meridian}] if non-\false{} (default) will format - ``AM'' or ``PM'' as part of the time. Otherwise, the ``AM'' or - ``PM'' will be omitted. - - \item[\kwd{print-timezone}] if non-\false{} (default) will format - the time zone as part of the time. Otherwise, the time zone will - be omitted. - - %%\item[\kwd{print-seconds}] - %%if non-\false{} (default) will format the seconds as part of - %%the time. Otherwise, the seconds will be omitted. - - \item[\kwd{print-weekday}] if non-\false{} (default) will format - the weekday as part of date. Otherwise, the weekday will be - omitted. - \end{Lentry} -\end{defun} - -%% New stuff -\begin{changebar} -\section{Random Number Generation} -\cindex{random number generation} - -\clisp{} includes a random number generator as a standard part of the -language; however, the implementation of the generator is not -specified. Two random number generators are available in \cmucl{}, -depending on the version. 
- -\subsection{Original Generator} -\cpsubindex{random number generation}{original generator} -The default random number generator uses a lagged Fibonacci generator -given by -\begin{displaymath} - z[i] = z[i - 24] - z[i - 55] \bmod 536870908 -\end{displaymath} -where $z[i]$ is the $i$'th random number. This generator produces -small integer-valued numbers. For larger integers, the small random -integers are concatenated. For -floating-point numbers, the bits from this generator are used as the -bits of the floating-point significand. - -\subsection{New Generator} -\cpsubindex{random number generation}{new generator} - -In some versions of \cmucl{}, the original generator above has been -replaced with a subtract-with-borrow generator -combined with a Weyl generator.\footnote{The generator described here - is available if the feature \kwd{new-random} is present.} The -reason for the change was to use a documented generator which has -passed tests for randomness. - -The subtract-with-borrow generator is described by the following -equation -\begin{displaymath} - z[i] = z[i + 20] - z[i + 5] - b -\end{displaymath} -where $z[i]$ is the $i$'th random number, which is a -\code{double-float}. All of the indices in this equation are -interpreted modulo 32. The quantity $b$ is carried over from the -previous iteration and is either 0 or \code{double-float-epsilon}. If -$z[i]$ is positive, $b$ is set to zero. Otherwise, $b$ is set to -\code{double-float-epsilon}. - -To increase its randomness, this generator is -combined with a Weyl generator defined by -\begin{displaymath} - x[i] = x[i - 1] - y \bmod 1, -\end{displaymath} -where $y = 7097293079245107 \times 2^{-53}$.
Thus, the resulting -random number $r[i]$ is -\begin{displaymath} - r[i] = (z[i] - x[i]) \bmod 1 -\end{displaymath} - -This generator has been tested by Peter VanEynde using Marsaglia's -diehard test suite for random number generators; it -passes the test suite. - -This generator is designed for generating floating-point random -numbers. To obtain integers, the bits from the significand of the -floating-point number are used as the bits of the integer. As many -floating-point numbers as needed are generated to obtain the desired -number of bits in the random integer. - -For floating-point numbers, this generator can be significantly faster -than the original generator. -\end{changebar} - -%%\node Lisp Library, , Time Parsing and Formatting, Design Choices and Extensions -\section{Lisp Library} -\label{lisp-lib} - -The CMU Common Lisp project maintains a collection of useful or interesting -programs written by users of our system. The library is in -\file{lib/contrib/}. Two files there that users should read are: -\begin{Lentry} - -\item[CATALOG.TXT] -This file contains a page for each entry in the library. It -contains information such as the author, portability or dependency issues, how -to load the entry, etc. - -\item[READ-ME.TXT] -This file describes the library's organization and all the -possible pieces of information an entry's catalog description could contain. -\end{Lentry} - -Hemlock has a command \F{Library Entry} that displays a list of the current -library entries in an editor buffer. There are mode-specific commands that -display catalog descriptions and load entries. This is a simple and convenient -way to browse the library.
- - -\hide{File:/afs/cs.cmu.edu/project/clisp/hackers/ram/docs/cmu-user/debug.ms} - - - -%%\node The Debugger, The Compiler, Design Choices and Extensions, Top -\chapter{The Debugger} \hide{-*- Dictionary: cmu-user -*-} -\begin{center} -\b{By Robert MacLachlan} -\end{center} -\cindex{debugger} -\label{debugger} - -\begin{comment} -* Debugger Introduction:: -* The Command Loop:: -* Stack Frames:: -* Variable Access:: -* Source Location Printing:: -* Compiler Policy Control:: -* Exiting Commands:: -* Information Commands:: -* Breakpoint Commands:: -* Function Tracing:: -* Specials:: -\end{comment} - -%%\node Debugger Introduction, The Command Loop, The Debugger, The Debugger -\section{Debugger Introduction} - -The \cmucl{} debugger is unique in its level of support for source-level -debugging of compiled code. Although some other debuggers allow access of -variables by name, this seems to be the first \llisp{} debugger that: -\begin{itemize} - -\item -Tells you when a variable doesn't have a value because it hasn't been -initialized yet or has already been deallocated, or - -\item -Can display the precise source location corresponding to a code -location in the debugged program. -\end{itemize} -These features allow the debugging of compiled code to be made almost -indistinguishable from interpreted code debugging. - -The debugger is an interactive command loop that allows a user to examine -the function call stack. The debugger is invoked when: -\begin{itemize} - -\item -A \tindexed{serious-condition} is signaled, and it is not handled, or - -\item -\findexed{error} is called, and the condition it signals is not handled, or - -\item -The debugger is explicitly invoked with the \clisp{} \findexed{break} -or \findexed{debug} functions. -\end{itemize} - -{\it Note: there are two debugger interfaces in CMU CL: the TTY debugger -(described below) and the Motif debugger. 
Since the difference is only in the -user interface, much of this chapter also applies to the Motif version. -\xlref{motif-interface} for a very brief discussion of the graphical -interface.} - -When you enter the TTY debugger, it looks something like this: -\begin{example} -Error in function CAR. -Wrong type argument, 3, should have been of type LIST. - -Restarts: - 0: Return to Top-Level. - -Debug (type H for help) - -(CAR 3) -0] -\end{example} -The first group of lines describe what the error was that put us in the -debugger. In this case \code{car} was called on \code{3}. After \code{Restarts:} -is a list of all the ways that we can restart execution after this error. In -this case, the only option is to return to top-level. After printing its -banner, the debugger prints the current frame and the debugger prompt. - -%% -%%\node The Command Loop, Stack Frames, Debugger Introduction, The Debugger -\section{The Command Loop} - -The debugger is an interactive read-eval-print loop much like the normal -top-level, but some symbols are interpreted as debugger commands instead -of being evaluated. A debugger command starts with the symbol name of -the command, possibly followed by some arguments on the same line. Some -commands prompt for additional input. Debugger commands can be -abbreviated by any unambiguous prefix: \code{help} can be typed as -\code{h}, \code{he}, etc. For convenience, some commands have -ambiguous one-letter abbreviations: \code{f} for \code{frame}. - -The package is not significant in debugger commands; any symbol with the -name of a debugger command will work. If you want to show the value of -a variable that happens also to be the name of a debugger command, you -can use the \code{list-locals} command or the \code{debug:var} -function, or you can wrap the variable in a \code{progn} to hide it from -the command loop. - -The debugger prompt is ``\var{frame}\code{]}'', where \var{frame} is the number -of the current frame. 
Frames are numbered starting from zero at the top (most -recent call), increasing down to the bottom. The current frame is the frame -that commands refer to. The current frame also provides the lexical -environment for evaluation of non-command forms. - -\cpsubindex{evaluation}{debugger} The debugger evaluates forms in the lexical -environment of the functions being debugged. The debugger can only -access variables. You can't \code{go} or \code{return-from} into a -function, and you can't call local functions. Special variable -references are evaluated with their current value (the innermost binding -around the debugger invocation)\dash{}you don't get the value that the -special had in the current frame. \xlref{debug-vars} for more -information on debugger variable access. - -%% -%%\node Stack Frames, Variable Access, The Command Loop, The Debugger -\section{Stack Frames} -\cindex{stack frames} \cpsubindex{frames}{stack} - -A stack frame is the run-time representation of a call to a function; -the frame stores the state that a function needs to remember what it is -doing. Frames have: -\begin{itemize} - -\item -Variables (\pxlref{debug-vars}), which are the values being operated -on, and - -\item -Arguments to the call (which are really just particularly interesting -variables), and - -\item -A current location (\pxlref{source-locations}), which is the place in -the program where the function was running when it stopped to call another -function, or because of an interrupt or error. 
-\end{itemize} - - -%% -\begin{comment} -* Stack Motion:: -* How Arguments are Printed:: -* Function Names:: -* Funny Frames:: -* Debug Tail Recursion:: -* Unknown Locations and Interrupts:: -\end{comment} - -%%\node Stack Motion, How Arguments are Printed, Stack Frames, Stack Frames -\subsection{Stack Motion} - -These commands move to a new stack frame and print the name of the function -and the values of its arguments in the style of a Lisp function call: -\begin{Lentry} - -\item[\code{up}] -Move up to the next higher frame. More recent function calls are considered -to be higher on the stack. - -\item[\code{down}] -Move down to the next lower frame. - -\item[\code{top}] -Move to the highest frame. - -\item[\code{bottom}] -Move to the lowest frame. - -\item[\code{frame} [\textit{n}]] -Move to the frame with the specified number. Prompts for the number if not -supplied. - -\begin{comment} -\key{S} [\var{function-name} [\var{n}]] - -\item -Search down the stack for function. Prompts for the function name if not -supplied. Searches an optional number of times, but doesn't prompt for -this number; enter it following the function. - -\item[\key{R} [\var{function-name} [\var{n}]]] -Search up the stack for function. Prompts for the function name if not -supplied. Searches an optional number of times, but doesn't prompt for -this number; enter it following the function. -\end{comment} -\end{Lentry} -%% -%%\node How Arguments are Printed, Function Names, Stack Motion, Stack Frames -\subsection{How Arguments are Printed} - -A frame is printed to look like a function call, but with the actual argument -values in the argument positions. So the frame for this call in the source: -\begin{lisp} -(myfun (+ 3 4) 'a) -\end{lisp} -would look like this: -\begin{example} -(MYFUN 7 A) -\end{example} -All keyword and optional arguments are displayed with their actual -values; if the corresponding argument was not supplied, the value will -be the default. 
So this call: -\begin{lisp} -(subseq "foo" 1) -\end{lisp} -would look like this: -\begin{example} -(SUBSEQ "foo" 1 3) -\end{example} -And this call: -\begin{lisp} -(string-upcase "test case") -\end{lisp} -would look like this: -\begin{example} -(STRING-UPCASE "test case" :START 0 :END NIL) -\end{example} - -The arguments to a function call are displayed by accessing the argument -variables. Although those variables are initialized to the actual argument -values, they can be set inside the function; in this case the new value will be -displayed. - -\code{\amprest} arguments are handled somewhat differently. The value of -the rest argument variable is displayed as the spread-out arguments to -the call, so: -\begin{lisp} -(format t "~A is a ~A." "This" 'test) -\end{lisp} -would look like this: -\begin{example} -(FORMAT T "~A is a ~A." "This" 'TEST) -\end{example} -Rest arguments cause an exception to the normal display of keyword -arguments in functions that have both \code{\amprest} and \code{\&key} -arguments. In this case, the keyword argument variables are not -displayed at all; the rest arg is displayed instead. So for these -functions, only the keywords actually supplied will be shown, and the -values displayed will be the argument values, not values of the -(possibly modified) variables. - -If the variable for an argument is never referenced by the function, it will be -deleted. The variable value is then unavailable, so the debugger prints -\code{} instead of the value. Similarly, if for any of a number of -reasons (described in more detail in section \ref{debug-vars}) the value of the -variable is unavailable or not known to be available, then -\code{} will be printed instead of the argument value. - -Printing of argument values is controlled by \code{*debug-print-level*} and -\varref{debug-print-length}. 
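As a rough illustration of what such print controls do, the following sketch uses only the standard printer variables. It rests on an assumption: that `*debug-print-level*` and `*debug-print-length*` abbreviate argument values in the same way that the standard `*print-level*` and `*print-length*` abbreviate ordinary output. The helper name `abbreviated-string` is made up for the example.

```lisp
;; Portable sketch using only standard printer variables. Assumption: the
;; debugger's print controls truncate values the way *PRINT-LEVEL* and
;; *PRINT-LENGTH* do here.

(defun abbreviated-string (object level length)
  "Print OBJECT to a string under the given *PRINT-LEVEL* and *PRINT-LENGTH*."
  (let ((*print-level* level)
        (*print-length* length)
        (*print-pretty* nil)
        (*print-readably* nil))
    (prin1-to-string object)))

;; Nesting deeper than LEVEL prints as "#"; elements beyond LENGTH as "...".
(abbreviated-string '(1 (2 (3 (4))) 5 6 7) 2 3)   ; => "(1 (2 #) 5 ...)"
```

Binding both variables to `nil` prints the whole structure, which is the trade-off to keep in mind when a frame's arguments are large.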
- -%% -%%\node Function Names, Funny Frames, How Arguments are Printed, Stack Frames -\subsection{Function Names} -\cpsubindex{function}{names} -\cpsubindex{names}{function} - -If a function is defined by \code{defun}, \code{labels}, or \code{flet}, then the -debugger will print the actual function name after the open parenthesis, like: -\begin{example} -(STRING-UPCASE "test case" :START 0 :END NIL) -((SETF AREF) \#\back{a} "for" 1) -\end{example} -Otherwise, the function name is a string, and will be printed in quotes: -\begin{example} -("DEFUN MYFUN" BAR) -("DEFMACRO DO" (DO ((I 0 (1+ I))) ((= I 13))) NIL) -("SETQ *GC-NOTIFY-BEFORE*") -\end{example} -This string name is derived from the \w{\code{def}\var{mumble}} form that encloses -or expanded into the lambda, or the outermost enclosing form if there is no -\w{\code{def}\var{mumble}}. - -%% -%%\node Funny Frames, Debug Tail Recursion, Function Names, Stack Frames -\subsection{Funny Frames} -\cindex{external entry points} -\cpsubindex{entry points}{external} -\cpsubindex{block compilation}{debugger implications} -\cpsubindex{external}{stack frame kind} -\cpsubindex{optional}{stack frame kind} -\cpsubindex{cleanup}{stack frame kind} - -Sometimes the evaluator introduces new functions that are used to implement a -user function, but are not directly specified in the source. The main place -this is done is for checking argument type and syntax. Usually these functions -do their thing and then go away, and thus are not seen on the stack in the -debugger. But when you get some sort of error during lambda-list processing, -you end up in the debugger on one of these funny frames. - -These funny frames are flagged by printing ``\code{[}\var{keyword}\code{]}'' after the -parentheses. 
For example, this call: -\begin{lisp} -(car 'a 'b) -\end{lisp} -will look like this: -\begin{example} -(CAR 2 A) [:EXTERNAL] -\end{example} -And this call: -\begin{lisp} -(string-upcase "test case" :end) -\end{lisp} -would look like this: -\begin{example} -("DEFUN STRING-UPCASE" "test case" 335544424 1) [:OPTIONAL] -\end{example} - -As you can see, these frames have only a vague resemblance to the original -call. Fortunately, the error message displayed when you enter the debugger -will usually tell you what the problem is (in these cases, too many arguments -and odd keyword arguments.) Also, if you go down the stack to the frame for -the calling function, you can display the original source (\pxlref{source-locations}.) - -With recursive or block compiled functions (\pxlref{block-compilation}), an \kwd{EXTERNAL} frame may appear before the frame -representing the first call to the recursive function or entry to the compiled -block. This is a consequence of the way the compiler does block compilation: -there is nothing odd with your program. You will also see \kwd{CLEANUP} frames -during the execution of \code{unwind-protect} cleanup code. Note that inline -expansion and open-coding affect what frames are present in the debugger; see -sections \ref{debugger-policy} and \ref{open-coding}. - -%% -%%\node Debug Tail Recursion, Unknown Locations and Interrupts, Funny Frames, Stack Frames -\subsection{Debug Tail Recursion} -\label{debug-tail-recursion} -\cindex{tail recursion} -\cpsubindex{recursion}{tail} - -Both the compiler and the interpreter are ``properly tail recursive.'' If a -function call is in a tail-recursive position, the stack frame will be -deallocated \i{at the time of the call}, rather than after the call returns. -Consider this backtrace: -\begin{example} -(BAR ...) -(FOO ...) -\end{example} -Because of tail recursion, it is not necessarily the case that -\code{FOO} directly called \code{BAR}.
It may be that \code{FOO} called -some other function \code{FOO2} which then called \code{BAR} -tail-recursively, as in this example: -\begin{example} -(defun foo () - ... - (foo2 ...) - ...) - -(defun foo2 (...) - ... - (bar ...)) - -(defun bar (...) - ...) -\end{example} - -Usually the elimination of tail-recursive frames makes debugging more -pleasant, since these frames are mostly uninformative. If there is any -doubt about how one function called another, it can usually be -eliminated by finding the source location in the calling frame (section -\ref{source-locations}.) - -For a more thorough discussion of tail recursion, \pxlref{tail-recursion}. - -%% -%%\node Unknown Locations and Interrupts, , Debug Tail Recursion, Stack Frames -\subsection{Unknown Locations and Interrupts} -\label{unknown-locations} -\cindex{unknown code locations} -\cpsubindex{locations}{unknown} -\cindex{interrupts} -\cpsubindex{errors}{run-time} - -The debugger operates using special debugging information attached to -the compiled code. This debug information tells the debugger what it -needs to know about the locations in the code where the debugger can be -invoked. If the debugger somehow encounters a location not described in -the debug information, then it is said to be \var{unknown}. If the code -location for a frame is unknown, then some variables may be -inaccessible, and the source location cannot be precisely displayed. - -There are three reasons why a code location could be unknown: -\begin{itemize} - -\item -There is inadequate debug information due to the value of the \code{debug} -optimization quality. \xlref{debugger-policy}. - -\item -The debugger was entered because of an interrupt such as \code{$\hat{ }C$}. - -\item -A hardware error such as ``\code{bus error}'' occurred in code that was -compiled unsafely due to the value of the \code{safety} optimization -quality. \xlref{optimize-declaration}. 
-\end{itemize} - -In the last two cases, the values of argument variables are accessible, -but may be incorrect. \xlref{debug-var-validity} for more details on -when variable values are accessible. - -It is possible for an interrupt to happen when a function call or return is in -progress. The debugger may then flame out with some obscure error or insist -that the bottom of the stack has been reached, when the real problem is that -the current stack frame can't be located. If this happens, return from the -interrupt and try again. - -When running interpreted code, all locations should be known. However, -an interrupt might catch some subfunction of the interpreter at an -unknown location. In this case, you should be able to go up the stack a -frame or two and reach an interpreted frame which can be debugged. - -%% -%%\node Variable Access, Source Location Printing, Stack Frames, The Debugger -\section{Variable Access} -\label{debug-vars} -\cpsubindex{variables}{debugger access} -\cindex{debug variables} - -There are three ways to access the current frame's local variables in the -debugger. The simplest is to type the variable's name into the debugger's -read-eval-print loop. The debugger will evaluate the variable reference as -though it had appeared inside that frame. - -The debugger doesn't really understand lexical scoping; it has just one -namespace for all the variables in a function. If a symbol is the name of -multiple variables in the same function, then the reference appears ambiguous, -even though lexical scoping specifies which value is visible at any given -source location. If the scopes of the two variables are not nested, then the -debugger can resolve the ambiguity by observing that only one variable is -accessible. - -When there are ambiguous variables, the evaluator assigns each one a -small integer identifier. 
The \code{debug:var} function and the -\code{list-locals} command use this identifier to distinguish between -ambiguous variables: -\begin{Lentry} - -\item[\code{list-locals} \mopt{\var{prefix}}]%%\hfill\\ -This command prints the name and value of all variables in the current -frame whose name has the specified \var{prefix}. \var{prefix} may be a -string or a symbol. If no \var{prefix} is given, then all available -variables are printed. If a variable has a potentially ambiguous name, -then the name is printed with a ``\code{\#}\var{identifier}'' suffix, where -\var{identifier} is the small integer used to make the name unique. -\end{Lentry} - -\begin{defun}{debug:}{var}{\args{\var{name} \ampoptional{} \var{identifier}}} - - This function returns the value of the variable in the current frame - with the specified \var{name}. If supplied, \var{identifier} - determines which value to return when there are ambiguous variables. - - When \var{name} is a symbol, it is interpreted as the symbol name of - the variable, i.e. the package is significant. If \var{name} is an - uninterned symbol (gensym), then return the value of the uninterned - variable with the same name. If \var{name} is a string, - \code{debug:var} interprets it as the prefix of a variable name, and - must unambiguously complete to the name of a valid variable. - - This function is useful mainly for accessing the value of uninterned - or ambiguous variables, since most variables can be evaluated - directly. 
-\end{defun} - -%% -\begin{comment} -* Variable Value Availability:: -* Note On Lexical Variable Access:: -\end{comment} - -%%\node Variable Value Availability, Note On Lexical Variable Access, Variable Access, Variable Access -\subsection{Variable Value Availability} -\label{debug-var-validity} -\cindex{availability of debug variables} -\cindex{validity of debug variables} -\cindex{debug optimization quality} - -The value of a variable may be unavailable to the debugger in portions of the -program where \clisp{} says that the variable is defined. If a variable value is -not available, the debugger will not let you read or write that variable. With -one exception, the debugger will never display an incorrect value for a -variable. Rather than displaying incorrect values, the debugger tells you the -value is unavailable. - -The one exception is this: if you interrupt (e.g., with \code{$\hat{ }C$}) or if there is -an unexpected hardware error such as ``\code{bus error}'' (which should only happen -in unsafe code), then the values displayed for arguments to the interrupted -frame might be incorrect.\footnote{Since the location of an interrupt or hardware -error will always be an unknown location (\pxlref{unknown-locations}), -non-argument variable values will never be available in the interrupted frame.} -This exception applies only to the interrupted frame: any frame farther down -the stack will be fine. - -The value of a variable may be unavailable for these reasons: -\begin{itemize} - -\item -The value of the \code{debug} optimization quality may have omitted debug -information needed to determine whether the variable is available. -Unless a variable is an argument, its value will only be available when -\code{debug} is at least \code{2}. - -\item -The compiler did lifetime analysis and determined that the value was no longer -needed, even though its scope had not been exited. Lifetime analysis is -inhibited when the \code{debug} optimization quality is \code{3}. 
- -\item -The variable's name is an uninterned symbol (gensym). To save space, the -compiler only dumps debug information about uninterned variables when the -\code{debug} optimization quality is \code{3}. - -\item -The frame's location is unknown (\pxlref{unknown-locations}) because -the debugger was entered due to an interrupt or unexpected hardware error. -Under these conditions the values of arguments will be available, but might be -incorrect. This is the exception above. - -\item -The variable was optimized out of existence. Variables with no reads are -always optimized away, even in the interpreter. The degree to which the -compiler deletes variables will depend on the value of the \code{compile-speed} -optimization quality, but most source-level optimizations are done under all -compilation policies. -\end{itemize} - - -Since it is especially useful to be able to get the arguments to a function, -argument variables are treated specially when the \code{speed} optimization -quality is less than \code{3} and the \code{debug} quality is at least \code{1}. -With this compilation policy, the values of argument variables are almost -always available everywhere in the function, even at unknown locations. For -non-argument variables, \code{debug} must be at least \code{2} for values to be -available, and even then, values are only available at known locations. - -%% -%%\node Note On Lexical Variable Access, , Variable Value Availability, Variable Access -\subsection{Note On Lexical Variable Access} -\cpsubindex{evaluation}{debugger} - -When the debugger command loop establishes variable bindings for available -variables, these variable bindings have lexical scope and dynamic -extent.\footnote{The variable bindings are actually created using the \clisp{} -\code{symbol-macrolet} special form.} You can close over them, but such closures -can't be used as upward funargs. 
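
As a rough illustration (a sketch only: \code{n} is assumed to be a local
variable that is available in the current frame, and \code{*saved*} is a
hypothetical special variable; neither comes from a real session), a closure
over a debugger binding is fine as long as it is called while the binding
still exists:
\begin{example}
;; Downward funarg: the closure is called before the binding goes away.
(mapcar #'(lambda (x) (+ x n)) '(1 2 3))

;; Upward funarg: the closure would outlive the binding, so calling
;; it later is not supported.
(setq *saved* #'(lambda () n))
\end{example}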
- -You can also set local variables using \code{setq}, but if the variable was closed -over in the original source and never set, then setting the variable in the -debugger may not change the value in all the functions the variable is defined -in. Another risk of setting variables is that you may assign a value of a type -that the compiler proved the variable could never take on. This may result in -bad things happening. - -%% -%%\node Source Location Printing, Compiler Policy Control, Variable Access, The Debugger -\section{Source Location Printing} -\label{source-locations} -\cpsubindex{source location printing}{debugger} - -One of CMU \clisp{}'s unique capabilities is source level debugging of compiled -code. These commands display the source location for the current frame: -\begin{Lentry} - -\item[\code{source} \mopt{\var{context}}]%%\hfill\\ -This command displays the file that the current frame's function was defined -from (if it was defined from a file), and then the source form responsible for -generating the code that the current frame was executing. If \var{context} is -specified, then it is an integer specifying the number of enclosing levels of -list structure to print. - -\item[\code{vsource} \mopt{\var{context}}]%%\hfill\\ -This command is identical to \code{source}, except that it uses the -global values of \code{*print-level*} and \code{*print-length*} instead -of the debugger printing control variables \code{*debug-print-level*} -and \code{*debug-print-length*}. -\end{Lentry} - -The source form for a location in the code is the innermost list present -in the original source that encloses the form responsible for generating -that code. If the actual source form is not a list, then some enclosing -list will be printed. For example, if the source form was a reference -to the variable \code{*some-random-special*}, then the innermost -enclosing evaluated form will be printed. 
Here are some possible -enclosing forms: -\begin{example} -(let ((a *some-random-special*)) - ...) - -(+ *some-random-special* ...) -\end{example} - -If the code at a location was generated from the expansion of a macro or a -source-level compiler optimization, then the form in the original source that -expanded into that code will be printed. Suppose the file -\file{/usr/me/mystuff.lisp} looked like this: -\begin{example} -(defmacro mymac () - '(myfun)) - -(defun foo () - (mymac) - ...) -\end{example} -If \code{foo} has called \code{myfun}, and is waiting for it to return, then the -\code{source} command would print: -\begin{example} -; File: /usr/me/mystuff.lisp - -(MYMAC) -\end{example} -Note that the macro use was printed, not the actual function call form, -\code{(myfun)}. - -If enclosing source is printed by giving an argument to \code{source} or -\code{vsource}, then the actual source form is marked by wrapping it in a list -whose first element is \code{\#:***HERE***}. In the previous example, -\w{\code{source 1}} would print: -\begin{example} -; File: /usr/me/mystuff.lisp - -(DEFUN FOO () - (#:***HERE*** - (MYMAC)) - ...) -\end{example} - -%% -\begin{comment} -* How the Source is Found:: -* Source Location Availability:: -\end{comment} - -%%\node How the Source is Found, Source Location Availability, Source Location Printing, Source Location Printing -\subsection{How the Source is Found} - -If the code was defined from \llisp{} by \code{compile} or -\code{eval}, then the source can always be reliably located. If the -code was defined from a \code{fasl} file created by -\findexed{compile-file}, then the debugger gets the source forms it -prints by reading them from the original source file. This is a -potential problem, since the source file might have moved or changed -since the time it was compiled. - -The source file is opened using the \code{truename} of the source file -pathname originally given to the compiler. 
This is an absolute pathname -with all logical names and symbolic links expanded. If the file can't -be located using this name, then the debugger gives up and signals an -error. - -If the source file can be found, but has been modified since the time it was -compiled, the debugger prints this warning: -\begin{example} -; File has been modified since compilation: -; \var{filename} -; Using form offset instead of character position. -\end{example} -where \var{filename} is the name of the source file. It then proceeds using a -robust but not foolproof heuristic for locating the source. This heuristic -works if: -\begin{itemize} - -\item -No top-level forms before the top-level form containing the source have been -added or deleted, and - -\item -The top-level form containing the source has not been modified much. (More -precisely, none of the list forms beginning before the source form have been -added or deleted.) -\end{itemize} - -If the heuristic doesn't work, the displayed source will be wrong, but will -probably be near the actual source. If the ``shape'' of the top-level form in -the source file is too different from the original form, then an error will be -signaled. When the heuristic is used, the source location commands are -noticeably slower. - -Source location printing can also be confused if (after the source was -compiled) a read-macro you used in the code was redefined to expand into -something different, or if a read-macro ever returns the same \code{eq} -list twice. If you don't define read macros and don't use \code{\#\#} in -perverted ways, you don't need to worry about this. - -%% -%%\node Source Location Availability, , How the Source is Found, Source Location Printing -\subsection{Source Location Availability} - -\cindex{debug optimization quality} -Source location information is only available when the \code{debug} -optimization quality is at least \code{2}. 
If source location information is -unavailable, the source commands will give an error message. - -If source location information is available, but the source location is -unknown because of an interrupt or unexpected hardware error -(\pxlref{unknown-locations}), then the command will print: -\begin{example} -Unknown location: using block start. -\end{example} -and then proceed to print the source location for the start of the \i{basic -block} enclosing the code location. \cpsubindex{block}{basic} -\cpsubindex{block}{start location} -It's a bit complicated to explain exactly what a basic block is, but -here are some properties of the block start location: -\begin{itemize} - -\item The block start location may be the same as the true location. - -\item The block start location will never be later in the - program's flow of control than the true location. - -\item No conditional control structures (such as \code{if}, - \code{cond}, \code{or}) will intervene between the block start and - the true location (but note that some conditionals present in the - original source could be optimized away.) Function calls \i{do not} - end basic blocks. - -\item The head of a loop will be the start of a block. - -\item The programming language concept of ``block structure'' and the - \clisp{} \code{block} special form are totally unrelated to the - compiler's basic block. -\end{itemize} - -In other words, the true location lies between the printed location and the -next conditional (but watch out because the compiler may have changed the -program on you.) - -%% -%%\node Compiler Policy Control, Exiting Commands, Source Location Printing, The Debugger -\section{Compiler Policy Control} -\label{debugger-policy} -\cpsubindex{policy}{debugger} -\cindex{debug optimization quality} -\cindex{optimize declaration} - -The compilation policy specified by \code{optimize} declarations affects the -behavior seen in the debugger. 
The \code{debug} quality directly affects the -debugger by controlling the amount of debugger information dumped. Other -optimization qualities have indirect but observable effects due to changes in -the way compilation is done. - -Unlike the other optimization qualities (which are compared in relative value -to evaluate tradeoffs), the \code{debug} optimization quality is directly -translated to a level of debug information. This absolute interpretation -allows the user to count on a particular amount of debug information being -available even when the values of the other qualities are changed during -compilation. These are the levels of debug information that correspond to the -values of the \code{debug} quality: -\begin{Lentry} - -\item[\code{0}] -Only the function name and enough information to allow the stack to -be parsed. - -\item[\code{\w{$>$ 0}}] -Any level greater than \code{0} gives level \code{0} plus all -argument variables. Values will only be accessible if the argument -variable is never set and -\code{speed} is not \code{3}. \cmucl{} allows any real value for optimization -qualities. It may be useful to specify \code{0.5} to get backtrace argument -display without argument documentation. - -\item[\code{1}] Level \code{1} provides argument documentation -(printed arglists) and derived argument/result type information. -This makes \findexed{describe} more informative, and allows the -compiler to do compile-time argument count and type checking for any -calls compiled at run-time. - -\item[\code{2}] -Level \code{1} plus all interned local variables, source location -information, and lifetime information that tells the debugger when arguments -are available (even when \code{speed} is \code{3} or the argument is set.) This is -the default. - -\item[\code{3}] -Level \code{2} plus all uninterned variables. 
In addition, lifetime -analysis is disabled (even when \code{speed} is \code{3}), ensuring that all variable -values are available at any known location within the scope of the binding. -This has a speed penalty in addition to the obvious space penalty. -\end{Lentry} - -As you can see, if the \code{speed} quality is \code{3}, debugger performance is -degraded. This effect comes from the elimination of argument variable -special-casing (\pxlref{debug-var-validity}.) Some degree of -speed/debuggability tradeoff is unavoidable, but the effect is not too drastic -when \code{debug} is at least \code{2}. - -\cindex{inline expansion} -\cindex{semi-inline expansion} -In addition to \code{inline} and \code{notinline} declarations, the relative values -of the \code{speed} and \code{space} qualities also change whether functions are -inline expanded (\pxlref{inline-expansion}.) If a function is inline -expanded, then there will be no frame to represent the call, and the arguments -will be treated like any other local variable. Functions may also be -``semi-inline'', in which case there is a frame to represent the call, but the -call is to an optimized local version of the function, not to the original -function. - -%% -%%\node Exiting Commands, Information Commands, Compiler Policy Control, The Debugger -\section{Exiting Commands} - -These commands get you out of the debugger. - -\begin{Lentry} - -\item[\code{quit}] -Throw to top level. - -\item[\code{restart} \mopt{\var{n}}]%%\hfill\\ -Invokes the \var{n}th restart case as displayed by the \code{error} -command. If \var{n} is not specified, the available restart cases are -reported. - -\item[\code{go}] -Calls \code{continue} on the condition given to \code{debug}. If there is no -restart case named \var{continue}, then an error is signaled. - -\item[\code{abort}] -Calls \code{abort} on the condition given to \code{debug}. This is -useful for popping debug command loop levels or aborting to top level, -as the case may be. 
- -\begin{comment} -(\code{debug:debug-return} \var{expression} \mopt{\var{frame}}) - -\item -From the current or specified frame, return the result of evaluating -expression. If multiple values are expected, then this function should be -called for multiple values. -\end{comment} -\end{Lentry} - -%% -%%\node Information Commands, Breakpoint Commands, Exiting Commands, The Debugger -\section{Information Commands} - -Most of these commands print information about the current frame or -function, but a few show general information. - -\begin{Lentry} - -\item[\code{help}, \code{?}] -Displays a synopsis of debugger commands. - -\item[\code{describe}] -Calls \code{describe} on the current function, displays number of local -variables, and indicates whether the function is compiled or interpreted. - -\item[\code{print}] -Displays the current function call as it would be displayed by moving to -this frame. - -\item[\code{vprint} (or \code{pp}) \mopt{\var{verbosity}}]%%\hfill\\ -Displays the current function call using \code{*print-level*} and -\code{*print-length*} instead of \code{*debug-print-level*} and -\code{*debug-print-length*}. \var{verbosity} is a small integer -(default 2) that controls other dimensions of verbosity. - -\item[\code{error}] -Prints the condition given to \code{invoke-debugger} and the active -proceed cases. - -\item[\code{backtrace} \mopt{\var{n}}]\hfill\\ -Displays all the frames from the current to the bottom. Only shows -\var{n} frames if specified. The printing is controlled by -\code{*debug-print-level*} and \code{*debug-print-length*}. - -\begin{comment} -(\code{debug:debug-function} \mopt{\var{n}}) - -\item -Returns the function from the current or specified frame. - -\item[(\code{debug:function-name} \mopt{\var{n}])] -Returns the function name from the current or specified frame. - -\item[(\code{debug:pc} \mopt{\var{frame}})] -Returns the index of the instruction for the function in the current or -specified frame. 
This is useful in conjunction with \code{disassemble}. -The pc returned points to the instruction after the one that was fatal. -\end{comment} -\end{Lentry} - -%% -%%\node Breakpoint Commands, Function Tracing, Information Commands, The Debugger -\section{Breakpoint Commands} - -\cmucl{} supports setting of breakpoints inside compiled functions and -stepping of compiled code. Breakpoints can only be set at known -locations (\pxlref{unknown-locations}), so these commands are largely -useless unless the \code{debug} optimize quality is at least \code{2} -(\pxlref{debugger-policy}). These commands manipulate breakpoints: -\begin{Lentry} -\item[\code{breakpoint} \var{location} \mstar{\var{option} \var{value}}] -%%\hfill\\ -Set a breakpoint in some function. \var{location} may be an integer -code location number (as displayed by \code{list-locations}) or a -keyword. The keyword can be used to indicate setting a breakpoint at -the function start (\kwd{start}, \kwd{s}) or function end -(\kwd{end}, \kwd{e}). The \code{breakpoint} command has -\kwd{condition}, \kwd{break}, \kwd{print} and \kwd{function} -options which work similarly to the \code{trace} options. - -\item[\code{list-locations} (or \code{ll}) \mopt{\var{function}}]%%\hfill\\ -List all the code locations in the current frame's function, or in -\var{function} if it is supplied. The display format is the code -location number, a colon and then the source form for that location: -\begin{example} -3: (1- N) -\end{example} -If consecutive locations have the same source, then a numeric range like -\code{3-5:} will be printed. For example, a default function call has a -known location both immediately before and after the call, which would -result in two code locations with the same source. The listed function -becomes the new default function for breakpoint setting (via the -\code{breakpoint} command). 
- -\item[\code{list-breakpoints} (or \code{lb})]%%\hfill\\ -List all currently active breakpoints with their breakpoint number. - -\item[\code{delete-breakpoint} (or \code{db}) \mopt{\var{number}}]%%\hfill\\ -Delete a breakpoint specified by its breakpoint number. If no number is -specified, delete all breakpoints. - -\item[\code{step}]%%\hfill\\ -Step to the next possible breakpoint location in the current function. -This always steps over function calls, instead of stepping into them. -\end{Lentry} - -\begin{comment} -* Breakpoint Example:: -\end{comment} - -%%\node Breakpoint Example, , Breakpoint Commands, Breakpoint Commands -\subsection{Breakpoint Example} - -Consider this definition of the factorial function: -\begin{lisp} -(defun ! (n) - (if (zerop n) - 1 - (* n (! (1- n))))) -\end{lisp} -This debugger session demonstrates the use of breakpoints: -\begin{example} -common-lisp-user> (break) ; Invoke debugger - -Break - -Restarts: - 0: [CONTINUE] Return from BREAK. - 1: [ABORT ] Return to Top-Level. - -Debug (type H for help) - -(INTERACTIVE-EVAL (BREAK)) -0] ll #'! -0: #'(LAMBDA (N) (BLOCK ! (IF # 1 #))) -1: (ZEROP N) -2: (* N (! (1- N))) -3: (1- N) -4: (! (1- N)) -5: (* N (! (1- N))) -6: #'(LAMBDA (N) (BLOCK ! (IF # 1 #))) -0] br 2 -(* N (! (1- N))) -1: 2 in ! -Added. -0] q - -common-lisp-user> (! 10) ; Call the function - -*Breakpoint hit* - -Restarts: - 0: [CONTINUE] Return from BREAK. - 1: [ABORT ] Return to Top-Level. - -Debug (type H for help) - -(! 10) ; We are now in first call (arg 10) before the multiply -Source: (* N (! (1- N))) -3] st - -*Step* - -(! 10) ; We have finished evaluation of (1- n) -Source: (1- N) -3] st - -*Breakpoint hit* - -Restarts: - 0: [CONTINUE] Return from BREAK. - 1: [ABORT ] Return to Top-Level. - -Debug (type H for help) - -(! 9) ; We hit the breakpoint in the recursive call -Source: (* N (! 
(1- N))) -3] -\end{example} - - - -%% -%%\node Function Tracing, Specials, Breakpoint Commands, The Debugger -\section{Function Tracing} -\cindex{tracing} -\cpsubindex{function}{tracing} - -The tracer causes selected functions to print their arguments and -their results whenever they are called. Options allow conditional -printing of the trace information and conditional breakpoints on -function entry or exit. - -\begin{defmac}{}{trace}{% - \args{\mstar{option global-value} \mstar{name \mstar{option - value}}}} - - \code{trace} is a debugging tool that prints information when - specified functions are called. In its simplest form: - \begin{example} - (trace \var{name-1} \var{name-2} ...) - \end{example} - \code{trace} causes a printout on \vindexed{trace-output} each time - that one of the named functions is entered or returns (the - \var{names} are not evaluated.) Trace output is indented according - to the number of pending traced calls, and this trace depth is - printed at the beginning of each line of output. Printing verbosity - of arguments and return values is controlled by - \vindexed{debug-print-level} and \vindexed{debug-print-length}. - - If no \var{names} or \var{options} are given, \code{trace} - returns the list of all currently traced functions, - \code{*traced-function-list*}. - - Trace options can cause the normal printout to be suppressed, or - cause extra information to be printed. Each option is a pair of an - option keyword and a value form. Options may be interspersed with - function names. Options only affect tracing of the function whose - name they appear immediately after. Global options are specified - before the first name, and affect all functions traced by a given - use of \code{trace}. If an already traced function is traced again, - any new options replace the old options. 
The following options are - defined: - \begin{Lentry} - \item[\kwd{condition} \var{form}, \kwd{condition-after} \var{form}, - \kwd{condition-all} \var{form}] If \kwd{condition} is specified, - then \code{trace} does nothing unless \var{form} evaluates to true - at the time of the call. \kwd{condition-after} is similar, but - suppresses the initial printout, and is tested when the function - returns. \kwd{condition-all} tries both before and after. - - \item[\kwd{wherein} \var{names}] If specified, \var{names} is a - function name or list of names. \code{trace} does nothing unless - a call to one of those functions encloses the call to this - function (i.e. it would appear in a backtrace.) Anonymous - functions have string names like \code{"DEFUN FOO"}. - - \item[\kwd{break} \var{form}, \kwd{break-after} \var{form}, - \kwd{break-all} \var{form}] If specified, and \var{form} evaluates - to true, then the debugger is invoked at the start of the - function, at the end of the function, or both, according to the - respective option. - - \item[\kwd{print} \var{form}, \kwd{print-after} \var{form}, - \kwd{print-all} \var{form}] In addition to the usual printout, the - result of evaluating \var{form} is printed at the start of the - function, at the end of the function, or both, according to the - respective option. Multiple print options cause multiple values - to be printed. - - \item[\kwd{function} \var{function-form}] This is not really an - option, but rather another way of specifying what function to - trace. The \var{function-form} is evaluated immediately, and the - resulting function is traced. - - \item[\kwd{encapsulate \mgroup{:default | t | nil}}] In \cmucl, - tracing can be done either by temporarily redefining the function - name (encapsulation), or using breakpoints. When breakpoints are - used, the function object itself is destructively modified to - cause the tracing action. 
The advantage of using breakpoints is - that tracing works even when the function is anonymously called - via \code{funcall}. - - When \kwd{encapsulate} is true, tracing is done via encapsulation. - \kwd{default} is the default, and means to use encapsulation for - interpreted functions and funcallable instances, breakpoints - otherwise. When encapsulation is used, forms are {\it not} - evaluated in the function's lexical environment, but - \code{debug:arg} can still be used. - \end{Lentry} - - \kwd{condition}, \kwd{break} and \kwd{print} forms are evaluated in - the lexical environment of the called function; \code{debug:var} and - \code{debug:arg} can be used. The \code{-after} and \code{-all} - forms are evaluated in the null environment. -\end{defmac} - -\begin{defmac}{}{untrace}{ \args{\amprest{} \var{function-names}}} - - This macro turns off tracing for the specified functions, and - removes their names from \code{*traced-function-list*}. If no - \var{function-names} are given, then all currently traced functions - are untraced. -\end{defmac} - -\begin{defvar}{extensions:}{traced-function-list} - - A list of function names maintained and used by \code{trace}, - \code{untrace}, and \code{untrace-all}. This list should contain - the names of all functions currently being traced. -\end{defvar} - -\begin{defvar}{extensions:}{max-trace-indentation} - - The maximum number of spaces which should be used to indent trace - printout. This variable is initially set to 40. -\end{defvar} - -\begin{comment} -* Encapsulation Functions:: -\end{comment} - -%%\node Encapsulation Functions, , Function Tracing, Function Tracing -\subsection{Encapsulation Functions} -\cindex{encapsulation} -\cindex{advising} - -The encapsulation functions provide a mechanism for intercepting the -arguments and results of a function. \code{encapsulate} changes the -function definition of a symbol, and saves it so that it can be -restored later. 
The new definition normally calls the original -definition. The \clisp{} \findexed{fdefinition} function always returns -the original definition, stripping off any encapsulation. - -The original definition of the symbol can be restored at any time by -the \code{unencapsulate} function. \code{encapsulate} and \code{unencapsulate} -allow a symbol to be multiply encapsulated in such a way that different -encapsulations can be completely transparent to each other. - -Each encapsulation has a type which may be an arbitrary lisp object. -If a symbol has several encapsulations of different types, then any -one of them can be removed without affecting more recent ones. -A symbol may have more than one encapsulation of the same type, but -only the most recent one can be undone. - -\begin{defun}{extensions:}{encapsulate}{% - \args{\var{symbol} \var{type} \var{body}}} - - Saves the current definition of \var{symbol}, and replaces it with a - function which returns the result of evaluating the form, - \var{body}. \var{Type} is an arbitrary lisp object which is the - type of encapsulation. - - When the new function is called, the following variables are bound - for the evaluation of \var{body}: - \begin{Lentry} - - \item[\code{extensions:argument-list}] A list of the arguments to - the function. - - \item[\code{extensions:basic-definition}] The unencapsulated - definition of the function. - \end{Lentry} - The unencapsulated definition may be called with the original - arguments by including the form - \begin{lisp} - (apply extensions:basic-definition extensions:argument-list) - \end{lisp} - - \code{encapsulate} always returns \var{symbol}. -\end{defun} - -\begin{defun}{extensions:}{unencapsulate}{\args{\var{symbol} \var{type}}} - - Undoes \var{symbol}'s most recent encapsulation of type \var{type}. - \var{Type} is compared with \code{eq}. Encapsulations of other - types are left in place. 
-\end{defun} - -\begin{defun}{extensions:}{encapsulated-p}{% - \args{\var{symbol} \var{type}}} - - Returns \true{} if \var{symbol} has an encapsulation of type - \var{type}. Returns \nil{} otherwise. \var{type} is compared with - \code{eq}. -\end{defun} - -%% -\begin{comment} -section{The Single Stepper} - -\begin{defmac}{}{step}{ \args{\var{form}}} - - Evaluates form with single stepping enabled or if \var{form} is - \code{T}, enables stepping until explicitly disabled. Stepping can - be disabled by quitting to the lisp top level, or by evaluating the - form \w{\code{(step ())}}. - - While stepping is enabled, every call to eval will prompt the user - for a single character command. The prompt is the form which is - about to be \code{eval}ed. It is printed with \code{*print-level*} - and \code{*print-length*} bound to \code{*step-print-level*} and - \code{*step-print-length*}. All interaction is done through the - stream \code{*query-io*}. Because of this, the stepper can not be - used in Hemlock eval mode. When connected to a slave Lisp, the - stepper can be used from Hemlock. - - The commands are: - \begin{Lentry} - - \item[\key{n} (next)] Evaluate the expression with stepping still - enabled. - - \item[\key{s} (skip)] Evaluate the expression with stepping - disabled. - - \item[\key{q} (quit)] Evaluate the expression, but disable all - further stepping inside the current call to \code{step}. - - \item[\key{p} (print)] Print current form. (does not use - \code{*step-print-level*} or \code{*step-print-length*}.) - - \item[\key{b} (break)] Enter break loop, and then prompt for the - command again when the break loop returns. - - \item[\key{e} (eval)] Prompt for and evaluate an arbitrary - expression. The expression is evaluated with stepping disabled. - - \item[\key{?} (help)] Prints a brief list of the commands. - - \item[\key{r} (return)] Prompt for an arbitrary value to return as - result of the current call to eval. - - \item[\key{g}] Throw to top level. 
- \end{Lentry} -\end{defmac} - -\begin{defvar}{extensions:}{step-print-level} - \defvarx[extensions:]{step-print-length} - - \code{*print-level*} and \code{*print-length*} are bound to these - values while printing the current form. \code{*step-print-level*} - and \code{*step-print-length*} are initially bound to 4 and 5, - respectively. -\end{defvar} - -\begin{defvar}{extensions:}{max-step-indentation} - - Step indents the prompts to highlight the nesting of the evaluation. - This variable contains the maximum number of spaces to use for - indenting. Initially set to 40. -\end{defvar} - -\end{comment} - -%% -%%\node Specials, , Function Tracing, The Debugger -\section{Specials} -These are the special variables that control the debugger action. - -\begin{changebar} -\begin{defvar}{debug:}{debug-print-level} - \defvarx[debug:]{debug-print-length} - - \code{*print-level*} and \code{*print-length*} are bound to these - values during the execution of some debug commands. When evaluating - arbitrary expressions in the debugger, the normal values of - \code{*print-level*} and \code{*print-length*} are in effect. These - variables are initially set to 3 and 5, respectively. -\end{defvar} -\end{changebar} - -%% -\hide{File:/afs/cs.cmu.edu/project/clisp/hackers/ram/docs/cmu-user/compiler.ms} - - -%%\node The Compiler, Advanced Compiler Use and Efficiency Hints, The Debugger, Top -\chapter{The Compiler} \hide{ -*- Dictionary: cmu-user -*-} - -\begin{comment} -* Compiler Introduction:: -* Calling the Compiler:: -* Compilation Units:: -* Interpreting Error Messages:: -* Types in Python:: -* Getting Existing Programs to Run:: -* Compiler Policy:: -* Open Coding and Inline Expansion:: -\end{comment} - -%%\node Compiler Introduction, Calling the Compiler, The Compiler, The Compiler -\section{Compiler Introduction} - -This chapter contains information about the compiler that every \cmucl{} user -should be familiar with. 
Chapter \ref{advanced-compiler} goes into greater -depth, describing ways to use more advanced features. - -The \cmucl{} compiler (also known as \Python{}) has many features -that are seldom or never supported by conventional \llisp{} -compilers: -\begin{itemize} - -\item Source level debugging of compiled code (see chapter - \ref{debugger}.) - -\item Type error compiler warnings for type errors detectable at - compile time. - -\item Compiler error messages that provide a good indication of where - the error appeared in the source. - -\item Full run-time checking of all potential type errors, with - optimization of type checks to minimize the cost. - -\item Scheme-like features such as proper tail recursion and extensive - source-level optimization. - -\item Advanced tuning and optimization features such as comprehensive - efficiency notes, flow analysis, and untagged number representations - (see chapter \ref{advanced-compiler}.) -\end{itemize} - - -%% -%%\node Calling the Compiler, Compilation Units, Compiler Introduction, The Compiler -\section{Calling the Compiler} -\cindex{compiling} -Functions may be compiled using \code{compile}, \code{compile-file}, or -\code{compile-from-stream}. - -\begin{defun}{}{compile}{ \args{\var{name} \ampoptional{} \var{definition}}} - - This function compiles the function whose name is \var{name}. If - \var{name} is \false, the compiled function object is returned. If - \var{definition} is supplied, it should be a lambda expression that - is to be compiled and then placed in the function cell of - \var{name}. As per the proposed X3J13 cleanup - ``compile-argument-problems'', \var{definition} may also be an - interpreted function. - - The return values are as per the proposed X3J13 cleanup - ``compiler-diagnostics''. The first value is the function name or - function object. The second value is \false{} if no compiler - diagnostics were issued, and \true{} otherwise. 
The third value is - \false{} if no compiler diagnostics other than style warnings were - issued. A non-\false{} value indicates that there were ``serious'' - compiler diagnostics issued, or that other conditions of type - \tindexed{error} or \tindexed{warning} (but not - \tindexed{style-warning}) were signaled during compilation. -\end{defun} - - -\begin{defun}{}{compile-file}{ - \args{\var{input-pathname} - \keys{\kwd{output-file} \kwd{error-file} \kwd{trace-file}} - \morekeys{\kwd{error-output} \kwd{verbose} \kwd{print} \kwd{progress}} - \yetmorekeys{\kwd{load} \kwd{block-compile} \kwd{entry-points}} - \yetmorekeys{\kwd{byte-compile}}}} - - The \cmucl{} \code{compile-file} is extended through the addition of - several new keywords and an additional interpretation of - \var{input-pathname}: - \begin{Lentry} - - \item[\var{input-pathname}] If this argument is a list of input - files, rather than a single input pathname, then all the source - files are compiled into a single object file. In this case, the - name of the first file is used to determine the default output - file names. This is especially useful in combination with - \var{block-compile}. - - \item[\kwd{output-file}] This argument specifies the name of the - output file. \true{} gives the default name, \false{} suppresses - the output file. - - \item[\kwd{error-file}] A listing of all the error output is - directed to this file. If there are no errors, then no error file - is produced (and any existing error file is deleted.) \true{} - gives \w{"\var{name}\code{.err}"} (the default), and \false{} - suppresses the output file. - - \item[\kwd{error-output}] If \true{} (the default), then error - output is sent to \code{*error-output*}. If a stream, then output - is sent to that stream instead. If \false, then error output is - suppressed. Note that this error output is in addition to (but - the same as) the output placed in the \var{error-file}. 
- - \item[\kwd{verbose}] If \true{} (the default), then the compiler - prints to error output at the start and end of compilation of each - file. See \varref{compile-verbose}. - - \item[\kwd{print}] If \true{} (the default), then the compiler - prints to error output when each function is compiled. See - \varref{compile-print}. - - \item[\kwd{progress}] If \true{} (default \false{}), then the - compiler prints to error output progress information about the - phases of compilation of each function. This is a CMU extension - that is useful mainly in large block compilations. See - \varref{compile-progress}. - - \item[\kwd{trace-file}] If \true{}, several of the intermediate - representations (including annotated assembly code) are dumped out - to this file. \true{} gives \w{"\var{name}\code{.trace}"}. Trace - output is off by default. \xlref{trace-files}. - - \item[\kwd{load}] If \true{}, load the resulting output file. - - \item[\kwd{block-compile}] Controls the compile-time resolution of - function calls. By default, only self-recursive calls are - resolved, unless an \code{ext:block-start} declaration appears in - the source file. \xlref{compile-file-block}. - - \item[\kwd{entry-points}] If non-null, then this is a list of the - names of all functions in the file that should have global - definitions installed (because they are referenced in other - files.) \xlref{compile-file-block}. - - \item[\kwd{byte-compile}] If \true{}, compiling to a compact - interpreted byte code is enabled. Possible values are \true{}, - \false{}, and \kwd{maybe} (the default.) See - \varref{byte-compile-default} and \xlref{byte-compile}. - \end{Lentry} - - The return values are as per the proposed X3J13 cleanup - ``compiler-diagnostics''. The first value from \code{compile-file} - is the truename of the output file, or \false{} if the file could - not be created. The interpretation of the second and third values - is described above for \code{compile}. 
-\end{defun} - -\begin{defvar}{}{compile-verbose} - \defvarx{compile-print} - \defvarx{compile-progress} - - These variables determine the default values for the \kwd{verbose}, - \kwd{print} and \kwd{progress} arguments to \code{compile-file}. -\end{defvar} - -\begin{defun}{extensions:}{compile-from-stream}{% - \args{\var{input-stream} - \keys{\kwd{error-stream}} - \morekeys{\kwd{trace-stream}} - \yetmorekeys{\kwd{block-compile} \kwd{entry-points}} - \yetmorekeys{\kwd{byte-compile}}}} - - This function is similar to \code{compile-file}, but it takes all - its arguments as streams. It reads \llisp{} code from - \var{input-stream} until end of file is reached, compiling into the - current environment. This function returns the same two values as - the last two values of \code{compile}. No output files are - produced. -\end{defun} - - -%% -%%\node Compilation Units, Interpreting Error Messages, Calling the Compiler, The Compiler -\section{Compilation Units} -\cpsubindex{compilation}{units} - -\cmucl{} supports the \code{with-compilation-unit} macro added to the -language by the proposed X3J13 ``with-compilation-unit'' compiler -cleanup. This provides a mechanism for eliminating spurious undefined -warnings when there are forward references across files, and also -provides a standard way to access compiler extensions. - -\begin{defmac}{}{with-compilation-unit}{% - \args{(\mstar{\var{key} \var{value}}) \mstar{\var{form}}}} - - This macro evaluates the \var{forms} in an environment that causes - warnings for undefined variables, functions and types to be delayed - until all the forms have been evaluated. Each keyword \var{value} - is an evaluated form. These keyword options are recognized: - \begin{Lentry} - - \item[\kwd{override}] If uses of \code{with-compilation-unit} are - dynamically nested, the outermost use will take precedence, - suppressing printing of undefined warnings by inner uses. 
- However, when the \code{override} option is true this shadowing is - inhibited; an inner use will print summary warnings for the - compilations within the inner scope. - - \item[\kwd{optimize}] This is a CMU extension that specifies the - ``global'' compilation policy for the dynamic extent of the body. - The argument should evaluate to an \code{optimize} declare form, - like: - \begin{lisp} - (optimize (speed 3) (safety 0)) - \end{lisp} - \xlref{optimize-declaration} - - \item[\kwd{optimize-interface}] Similar to \kwd{optimize}, but - specifies the compilation policy for function interfaces (argument - count and type checking) for the dynamic extent of the body. - \xlref{optimize-interface-declaration}. - - \item[\kwd{context-declarations}] This is a CMU extension that - pattern-matches on function names, automatically splicing in any - appropriate declarations at the head of the function definition. - \xlref{context-declarations}. - \end{Lentry} -\end{defmac} - -\begin{comment} -* Undefined Warnings:: -\end{comment} - -%%\node Undefined Warnings, , Compilation Units, Compilation Units -\subsection{Undefined Warnings} - -\cindex{undefined warnings} -Warnings about undefined variables, functions and types are delayed until the -end of the current compilation unit. The compiler entry functions -(\code{compile}, etc.) implicitly use \code{with-compilation-unit}, so undefined -warnings will be printed at the end of the compilation unless there is an -enclosing \code{with-compilation-unit}. In order to gain the benefit of this -mechanism, you should wrap a single \code{with-compilation-unit} around the calls -to \code{compile-file}, i.e.: -\begin{lisp} -(with-compilation-unit () - (compile-file "file1") - (compile-file "file2") - ...) -\end{lisp} - -Unlike for functions and types, undefined warnings for variables are -not suppressed when a definition (e.g. \code{defvar}) appears after -the reference (but in the same compilation unit.)
- This is because -doing special declarations out of order just doesn't -work\dash{}although early references will be compiled as special, -bindings will be done lexically. - -Undefined warnings are printed with full source context -(\pxlref{error-messages}), which tremendously simplifies the problem -of finding undefined references that resulted from macroexpansion. -After printing detailed information about the undefined uses of each -name, \code{with-compilation-unit} also prints summary listings of the -names of all the undefined functions, types and variables. - -\begin{defvar}{}{undefined-warning-limit} - - This variable controls the number of undefined warnings for each - distinct name that are printed with full source context when the - compilation unit ends. If there are more undefined references than - this, then they are condensed into a single warning: - \begin{example} - Warning: \var{count} more uses of undefined function \var{name}. - \end{example} - When the value is \code{0}, then the undefined warnings are not - broken down by name at all: only the summary listing of undefined - names is printed. -\end{defvar} - -%% -%%\node Interpreting Error Messages, Types in Python, Compilation Units, The Compiler -\section{Interpreting Error Messages} -\label{error-messages} -\cpsubindex{error messages}{compiler} -\cindex{compiler error messages} - -One of \Python{}'s unique features is the level of source location -information it provides in error messages. The error messages contain -a lot of detail in a terse format, so they may be confusing at first. -Error messages will be illustrated using this example program: -\begin{lisp} -(defmacro zoq (x) - `(roq (ploq (+ ,x 3)))) - -(defun foo (y) - (declare (symbol y)) - (zoq y)) -\end{lisp} -The main problem with this program is that it is trying to add \code{3} to a -symbol. Note also that the functions \code{roq} and \code{ploq} aren't defined -anywhere.
- -\begin{comment} -* The Parts of the Error Message:: -* The Original and Actual Source:: -* The Processing Path:: -* Error Severity:: -* Errors During Macroexpansion:: -* Read Errors:: -* Error Message Parameterization:: -\end{comment} - -%%\node The Parts of the Error Message, The Original and Actual Source, Interpreting Error Messages, Interpreting Error Messages -\subsection{The Parts of the Error Message} - -The compiler will produce this warning: -\begin{example} -File: /usr/me/stuff.lisp - -In: DEFUN FOO - (ZOQ Y) ---> ROQ PLOQ + -==> - Y -Warning: Result is a SYMBOL, not a NUMBER. -\end{example} -In this example we see each of the six possible parts of a compiler error -message: -\begin{Lentry} - -\item[\w{\code{File: /usr/me/stuff.lisp}}] This is the \var{file} that - the compiler read the relevant code from. The file name is - displayed because it may not be immediately obvious when there is an - error during compilation of a large system, especially when - \code{with-compilation-unit} is used to delay undefined warnings. - -\item[\w{\code{In: DEFUN FOO}}] This is the \var{definition} or - top-level form responsible for the error. It is obtained by taking - the first two elements of the enclosing form whose first element is - a symbol beginning with ``\code{DEF}''. If there is no enclosing - \w{\var{def}mumble}, then the outermost form is used. If there are - multiple \w{\var{def}mumbles}, then they are all printed from the - out in, separated by \code{$=>$}'s. In this example, the problem - was in the \code{defun} for \code{foo}. - -\item[\w{\code{(ZOQ Y)}}] This is the \i{original source} form - responsible for the error. Original source means that the form - directly appeared in the original input to the compiler, i.e. in the - lambda passed to \code{compile} or the top-level form read from the - source file. In this example, the expansion of the \code{zoq} macro - was responsible for the error. 
- -\item[\w{\code{--$>$ ROQ PLOQ +}} ] This is the \i{processing path} - that the compiler used to produce the errorful code. The processing - path is a representation of the evaluated forms enclosing the actual - source that the compiler encountered when processing the original - source. The path is the first element of each form, or the form - itself if the form is not a list. These forms result from the - expansion of macros or source-to-source transformation done by the - compiler. In this example, the enclosing evaluated forms are the - calls to \code{roq}, \code{ploq} and \code{+}. These calls resulted - from the expansion of the \code{zoq} macro. - -\item[\code{==$>$ Y}] This is the \i{actual source} responsible for - the error. If the actual source appears in the explanation, then we - print the next enclosing evaluated form, instead of printing the - actual source twice. (This is the form that would otherwise have - been the last form of the processing path.) In this example, the - problem is with the evaluation of the reference to the variable - \code{y}. - -\item[\w{\code{Warning: Result is a SYMBOL, not a NUMBER.}}] This is - the \var{explanation} of the problem. In this example, the problem is - that \code{y} evaluates to a \code{symbol}, but is in a context - where a number is required (the argument to \code{+}). -\end{Lentry} - -Note that each part of the error message is distinctively marked: -\begin{itemize} - -\item \code{File:} and \code{In:} mark the file and definition, - respectively. - -\item The original source is an indented form with no prefix. - -\item Each line of the processing path is prefixed with \code{--$>$}. - -\item The actual source form is indented like the original source, but - is marked by a preceding \code{==$>$} line. This is like the - ``macroexpands to'' notation used in \cltl. - -\item The explanation is prefixed with the error severity - (\pxlref{error-severity}), either \code{Error:}, \code{Warning:}, or - \code{Note:}.
-\end{itemize} - - -Each part of the error message is more specific than the preceding -one. If consecutive error messages are for nearby locations, then the -front part of the error messages would be the same. In this case, the -compiler omits as much of the second message as is in common with the -first. For example: -\begin{example} -File: /usr/me/stuff.lisp - -In: DEFUN FOO - (ZOQ Y) ---> ROQ -==> - (PLOQ (+ Y 3)) -Warning: Undefined function: PLOQ - -==> - (ROQ (PLOQ (+ Y 3))) -Warning: Undefined function: ROQ -\end{example} -In this example, the file, definition and original source are -identical for the two messages, so the compiler omits them in the -second message. If consecutive messages are entirely identical, then -the compiler prints only the first message, followed by: -\begin{example} -[Last message occurs \var{repeats} times] -\end{example} -where \var{repeats} is the number of times the message was given. - -If the source was not from a file, then no file line is printed. If -the actual source is the same as the original source, then the -processing path and actual source will be omitted. If no forms -intervene between the original source and the actual source, then the -processing path will also be omitted. - -%% -%%\node The Original and Actual Source, The Processing Path, The Parts of the Error Message, Interpreting Error Messages -\subsection{The Original and Actual Source} -\cindex{original source} -\cindex{actual source} - -The \i{original source} displayed will almost always be a list. If the actual -source for an error message is a symbol, the original source will be the -immediately enclosing evaluated list form. So even if the offending symbol -does appear in the original source, the compiler will print the enclosing list -and then print the symbol as the actual source (as though the symbol were -introduced by a macro.)
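
As a minimal sketch of this rule, consider a definition in which the
offending symbol appears directly in the source rather than in a macro
expansion. The function name \code{baz} is hypothetical and is not one
of the manual's own examples:
\begin{lisp}
;; Hypothetical example: X is declared to be a symbol but is used
;; as an argument to +.
(defun baz (x)
  (declare (symbol x))
  (+ x 3))
\end{lisp}
By the rule above, one would expect the original source reported for
the resulting type warning to be the enclosing list \code{(+ x 3)},
with the symbol \code{x} shown as the actual source. The exact message
text may differ and is not reproduced here.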
- -When the \i{actual source} is displayed (and is not a symbol), it will always -be code that resulted from the expansion of a macro or a source-to-source -compiler optimization. This is code that did not appear in the original -source program; it was introduced by the compiler. - -Keep in mind that when the compiler displays a source form in an error message, -it always displays the most specific (innermost) responsible form. For -example, compiling this function: -\begin{lisp} -(defun bar (x) - (let (a) - (declare (fixnum a)) - (setq a (foo x)) - a)) -\end{lisp} -Gives this error message: -\begin{example} -In: DEFUN BAR - (LET (A) (DECLARE (FIXNUM A)) (SETQ A (FOO X)) A) -Warning: The binding of A is not a FIXNUM: - NIL -\end{example} -This error message is not saying ``there's a problem somewhere in this -\code{let}''\dash{}it is saying that there is a problem with the -\code{let} itself. In this example, the problem is that \code{a}'s -\false{} initial value is not a \code{fixnum}. - -%% -%%\node The Processing Path, Error Severity, The Original and Actual Source, Interpreting Error Messages -\subsection{The Processing Path} -\cindex{processing path} -\cindex{macroexpansion} -\cindex{source-to-source transformation} - -The processing path is mainly useful for debugging macros, so if you don't -write macros, you can ignore the processing path. Consider this example: -\begin{lisp} -(defun foo (n) - (dotimes (i n *undefined*))) -\end{lisp} -Compiling results in this error message: -\begin{example} -In: DEFUN FOO - (DOTIMES (I N *UNDEFINED*)) ---> DO BLOCK LET TAGBODY RETURN-FROM -==> - (PROGN *UNDEFINED*) -Warning: Undefined variable: *UNDEFINED* -\end{example} -Note that \code{do} appears in the processing path. 
This is because \code{dotimes} -expands into: -\begin{lisp} -(do ((i 0 (1+ i)) (#:g1 n)) - ((>= i #:g1) *undefined*) - (declare (type unsigned-byte i))) -\end{lisp} -The rest of the processing path results from the expansion of \code{do}: -\begin{lisp} -(block nil - (let ((i 0) (#:g1 n)) - (declare (type unsigned-byte i)) - (tagbody (go #:g3) - #:g2 (psetq i (1+ i)) - #:g3 (unless (>= i #:g1) (go #:g2)) - (return-from nil (progn *undefined*))))) -\end{lisp} -In this example, the compiler descended into the \code{block}, -\code{let}, \code{tagbody} and \code{return-from} to reach the -\code{progn} printed as the actual source. This is a place where the -``actual source appears in explanation'' rule was applied. The -innermost actual source form was the symbol \code{*undefined*} itself, -but that also appeared in the explanation, so the compiler backed out -one level. - -%% -%%\node Error Severity, Errors During Macroexpansion, The Processing Path, Interpreting Error Messages -\subsection{Error Severity} -\label{error-severity} -\cindex{severity of compiler errors} -\cindex{compiler error severity} - -There are three levels of compiler error severity: -\begin{Lentry} - -\item[Error] This severity is used when the compiler encounters a - problem serious enough to prevent normal processing of a form. - Instead of compiling the form, the compiler compiles a call to - \code{error}. Errors are used mainly for signaling syntax errors. - If an error happens during macroexpansion, the compiler will handle - it. The compiler also handles and attempts to proceed from read - errors. 
- -\item[Warning] Warnings are used when the compiler can prove that - something bad will happen if a portion of the program is executed, - but the compiler can proceed by compiling code that signals an error - at runtime if the problem has not been fixed: - \begin{itemize} - - \item Violation of type declarations, or - - \item Function calls that have the wrong number of arguments or - malformed keyword argument lists, or - - \item Referencing a variable declared \code{ignore}, or unrecognized - declaration specifiers. - \end{itemize} - - In the language of the \clisp{} standard, these are situations where - the compiler can determine that a situation with undefined - consequences or that would cause an error to be signaled would - result at runtime. - -\item[Note] Notes are used when there is something that seems a bit - odd, but that might reasonably appear in correct programs. -\end{Lentry} -Note that the compiler does not fully conform to the proposed X3J13 -``compiler-diagnostics'' cleanup. Errors, warnings and notes mostly -correspond to errors, warnings and style-warnings, but many things -that the cleanup considers to be style-warnings are printed as -warnings rather than notes. Also, warnings, style-warnings and most -errors aren't really signaled using the condition system. - -%% -%%\node Errors During Macroexpansion, Read Errors, Error Severity, Interpreting Error Messages -\subsection{Errors During Macroexpansion} -\cpsubindex{macroexpansion}{errors during} - -The compiler handles errors that happen during macroexpansion, turning -them into compiler errors. If you want to debug the error (to debug a -macro), you can set \code{*break-on-signals*} to \code{error}. 
For -example, this definition: -\begin{lisp} -(defun foo (e l) - (do ((current l (cdr current)) - ((atom current) nil)) - (when (eq (car current) e) (return current)))) -\end{lisp} -gives this error: -\begin{example} -In: DEFUN FOO - (DO ((CURRENT L #) (# NIL)) (WHEN (EQ # E) (RETURN CURRENT)) ) -Error: (during macroexpansion) - -Error in function LISP::DO-DO-BODY. -DO step variable is not a symbol: (ATOM CURRENT) -\end{example} - - -%% -%%\node Read Errors, Error Message Parameterization, Errors During Macroexpansion, Interpreting Error Messages -\subsection{Read Errors} -\cpsubindex{read errors}{compiler} - -The compiler also handles errors while reading the source. For example: -\begin{example} -Error: Read error at 2: - "(,/\back{foo})" -Error in function LISP::COMMA-MACRO. -Comma not inside a backquote. -\end{example} -The ``\code{at 2}'' refers to the character position in the source file at -which the error was signaled, which is generally immediately after the -erroneous text. The next line, ``\code{(,/\back{foo})}'', is the line in -the source that contains the error file position. The ``\code{/\back{} }'' -indicates the error position within that line (in this example, -immediately after the offending comma.) - -When in \hemlock{} (or any other EMACS-like editor), you can go to a -character position with: -\begin{example} -M-< C-u \var{position} C-f -\end{example} -Note that if the source is from a \hemlock{} buffer, then the position -is relative to the start of the compiled region or \code{defun}, not the -file or buffer start. - -After printing a read error message, the compiler attempts to recover from the -error by backing up to the start of the enclosing top-level form and reading -again with \code{*read-suppress*} true. If the compiler can recover from the -error, then it substitutes a call to \code{cerror} for the unreadable form and -proceeds to compile the rest of the file normally. 
- -If there is a read error when the file position is at the end of the file -(i.e., an unexpected EOF error), then the error message looks like this: -\begin{example} -Error: Read error in form starting at 14: - "(defun test ()" -Error in function LISP::FLUSH-WHITESPACE. -EOF while reading # -\end{example} -In this case, ``\code{starting at 14}'' indicates the character -position at which the compiler started reading, i.e. the position -before the start of the form that was missing the closing delimiter. -The line \w{"\code{(defun test ()}"} is the first line after the starting -position that the compiler thinks might contain the unmatched open -delimiter. - -%% -%%\node Error Message Parameterization, , Read Errors, Interpreting Error Messages -\subsection{Error Message Parameterization} -\cpsubindex{error messages}{verbosity} -\cpsubindex{verbosity}{of error messages} - -There is some control over the verbosity of error messages. See also -\varref{undefined-warning-limit}, \code{*efficiency-note-limit*} and -\varref{efficiency-note-cost-threshold}. - -\begin{defvar}{}{enclosing-source-cutoff} - - This variable specifies the number of enclosing actual source forms - that are printed in full, rather than in the abbreviated processing - path format. Increasing the value from its default of \code{1} - allows you to see more of the guts of the macroexpanded source, - which is useful when debugging macros. -\end{defvar} - -\begin{defvar}{}{error-print-length} - \defvarx{error-print-level} - - These variables are the print level and print length used in - printing error messages. The default values are \code{5} and - \code{3}. If null, the global values of \code{*print-level*} and - \code{*print-length*} are used. -\end{defvar} - -\begin{defmac}{extensions:}{def-source-context}{% - \args{\var{name} \var{lambda-list} \mstar{form}}} - - This macro defines how to extract an abbreviated source context from - the \var{name}d form when it appears in the compiler input.
- \var{lambda-list} is a \code{defmacro} style lambda-list used to - parse the arguments. The \var{body} should return a list of - subforms that can be printed on about one line. There are - predefined methods for \code{defstruct}, \code{defmethod}, etc. If - no method is defined, then the first two subforms are returned. - Note that this facility implicitly determines the string name - associated with anonymous functions. -\end{defmac} - -%% -%%\node Types in Python, Getting Existing Programs to Run, Interpreting Error Messages, The Compiler -\section{Types in Python} -\cpsubindex{types}{in python} - -A big difference between \Python{} and all other \llisp{} compilers -is the approach to type checking and amount of knowledge about types: -\begin{itemize} - -\item \Python{} treats type declarations much differently than other - Lisp compilers do. \Python{} doesn't blindly believe type - declarations; it considers them assertions about the program that - should be checked. - -\item \Python{} also has a tremendously greater knowledge of the - \clisp{} type system than other compilers. Support is incomplete - only for the \code{not}, \code{and} and \code{satisfies} types. -\end{itemize} -See also sections \ref{advanced-type-stuff} and \ref{type-inference}. - -%% -\begin{comment} -* Compile Time Type Errors:: -* Precise Type Checking:: -* Weakened Type Checking:: -\end{comment} - -%%\node Compile Time Type Errors, Precise Type Checking, Types in Python, Types in Python -\subsection{Compile Time Type Errors} -\cindex{compile time type errors} -\cpsubindex{type checking}{at compile time} - -If the compiler can prove at compile time that some portion of the -program cannot be executed without a type error, then it will give a -warning at compile time. It is possible that the offending code would -never actually be executed at run-time due to some higher level -consistency constraint unknown to the compiler, so a type warning -doesn't always indicate an incorrect program.
For example, consider -this code fragment: -\begin{lisp} -(defun raz (foo) - (let ((x (case foo - (:this 13) - (:that 9) - (:the-other 42)))) - (declare (fixnum x)) - (foo x))) -\end{lisp} -Compilation produces this warning: -\begin{example} -In: DEFUN RAZ - (CASE FOO (:THIS 13) (:THAT 9) (:THE-OTHER 42)) ---> LET COND IF COND IF COND IF -==> - (COND) -Warning: This is not a FIXNUM: - NIL -\end{example} -In this case, the warning is telling you that if \code{foo} isn't any -of \kwd{this}, \kwd{that} or \kwd{the-other}, then \code{x} will be -initialized to \false, which the \code{fixnum} declaration makes -illegal. The warning will go away if \code{ecase} is used instead of -\code{case}, or if \kwd{the-other} is changed to \true. - -This sort of spurious type warning happens moderately often in the -expansion of complex macros and in inline functions. In such cases, -there may be dead code that is impossible to correctly execute. The -compiler can't always prove this code is dead (could never be -executed), so it compiles the erroneous code (which will always signal -an error if it is executed) and gives a warning. - -\begin{defun}{extensions:}{required-argument}{} - - This function can be used as the default value for keyword arguments - that must always be supplied. Since it is known by the compiler to - never return, it will avoid any compile-time type warnings that - would result from a default value inconsistent with the declared - type. When this function is called, it signals an error indicating - that a required keyword argument was not supplied. This function is - also useful for \code{defstruct} slot defaults corresponding to - required arguments. \xlref{empty-type}. 
- - Although this function is a CMU extension, it is relatively harmless - to use it in otherwise portable code, since you can easily define it - yourself: - \begin{lisp} - (defun required-argument () - (error "A required keyword argument was not supplied.")) - \end{lisp} -\end{defun} - -Type warnings are inhibited when the -\code{extensions:inhibit-warnings} optimization quality is \code{3} -(\pxlref{compiler-policy}.) This can be used in a local declaration -to inhibit type warnings in a code fragment that has spurious -warnings. - -%% -%%\node Precise Type Checking, Weakened Type Checking, Compile Time Type Errors, Types in Python -\subsection{Precise Type Checking} -\label{precise-type-checks} -\cindex{precise type checking} -\cpsubindex{type checking}{precise} - -With the default compilation policy, all type -assertions\footnote{There are a few circumstances where a type - declaration is discarded rather than being used as type assertion. - This doesn't affect safety much, since such discarded declarations - are also not believed to be true by the compiler.} are precisely -checked. Precise checking means that the check is done as though -\code{typep} had been called with the exact type specifier that -appeared in the declaration. \Python{} uses \var{policy} to determine -whether to trust type assertions (\pxlref{compiler-policy}). Type -assertions from declarations are indistinguishable from the type -assertions on arguments to built-in functions. In \Python, adding -type declarations makes code safer. - -If a variable is declared to be \w{\code{(integer 3 17)}}, then its -value must always be an integer between \code{3} and \code{17}. -If multiple type declarations apply to a single variable, then all the -declarations must be correct; it is as though all the types were -intersected producing a single \code{and} type specifier. - -Argument type declarations are automatically enforced.
If you declare -the type of a function argument, a type check will be done when that -function is called. In a function call, the called function does the -argument type checking, which means that a more restrictive type -assertion in the calling function (e.g., from \code{the}) may be lost. - -The types of structure slots are also checked. The value of a -structure slot must always be of the type indicated in any \kwd{type} -slot option.\footnote{The initial value need not be of this type as - long as the corresponding argument to the constructor is always - supplied, but this will cause a compile-time type warning unless - \code{required-argument} is used.} Because of precise type checking, -the arguments to slot accessors are checked to be the correct type of -structure. - -In traditional \llisp{} compilers, not all type assertions are -checked, and type checks are not precise. Traditional compilers -blindly trust explicit type declarations, but may check the argument -type assertions for built-in functions. Type checking is not precise, -since the argument type checks will be for the most general type legal -for that argument. In many systems, type declarations suppress what -little type checking is being done, so adding type declarations makes -code unsafe. This is a problem since it discourages writing type -declarations during initial coding. In addition to being more error -prone, adding type declarations during tuning also loses all the -benefits of debugging with checked type assertions. - -To gain maximum benefit from \Python{}'s type checking, you should -always declare the types of function arguments and structure slots as -precisely as possible. This often involves the use of \code{or}, -\code{member} and other list-style type specifiers. Paradoxically, -even though adding type declarations introduces type checks, it -usually reduces the overall amount of type checking. This is -especially true for structure slot type declarations. 
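
A minimal sketch of such declarations follows. The structure
\code{account} and the function \code{deposit} are hypothetical names
chosen for illustration; only standard \clisp{} type specifiers are
used:
\begin{lisp}
;; Hypothetical structure with precisely typed slots.
(defstruct account
  (kind :checking :type (member :checking :savings))
  (balance 0 :type (integer 0 1000000))
  (owner nil :type (or string null)))

;; Argument types declared as precisely as the program allows.
(defun deposit (acct amount)
  (declare (type account acct)
           (type (integer 1 1000) amount))
  (+ (account-balance acct) amount))
\end{lisp}
With declarations like these, the slot type tells the compiler what
\code{account-balance} can return, which is one way precise
declarations can reduce the checking needed elsewhere.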
- -\Python{} uses the \code{safety} optimization quality (rather than -presence or absence of declarations) to choose one of three levels of -run-time type error checking: \pxlref{optimize-declaration}. -\xlref{advanced-type-stuff} for more information about types in -\Python. - -%% -%%\node Weakened Type Checking, , Precise Type Checking, Types in Python -\subsection{Weakened Type Checking} -\label{weakened-type-checks} -\cindex{weakened type checking} -\cpsubindex{type checking}{weakened} - -When the value for the \code{speed} optimization quality is greater -than \code{safety}, and \code{safety} is not \code{0}, then type -checking is weakened to reduce the speed and space penalty. In -structure-intensive code this can double the speed, yet still catch -most type errors. Weakened type checks provide a level of safety -similar to that of ``safe'' code in other \llisp{} compilers. - -A type check is weakened by changing the check to be for some -convenient supertype of the asserted type. For example, -\code{\w{(integer 3 17)}} is changed to \code{fixnum}, -\code{\w{(simple-vector 17)}} to \code{simple-vector}, and structure -types are changed to \code{structure}. A complex check like: -\begin{example} -(or node hunk (member :foo :bar :baz)) -\end{example} -will be omitted entirely (i.e., the check is weakened to \code{*}.) If -a precise check can be done for no extra cost, then no weakening is -done. - -Although weakened type checking is similar to type checking done by -other compilers, it is sometimes safer and sometimes less safe. -Weakened checks are done in the same places as precise checks, so all -the preceding discussion about where checking is done still applies. -Weakened checking is sometimes somewhat unsafe because although the -check is weakened, the precise type is still input into type -inference.
In some contexts this will result in type inferences not -justified by the weakened check, and hence deletion of some type -checks that would be done by conventional compilers. - -For example, if this code were compiled with weakened checks: -\begin{lisp} -(defstruct foo - (a nil :type simple-string)) - -(defstruct bar - (a nil :type single-float)) - -(defun myfun (x) - (declare (type bar x)) - (* (bar-a x) 3.0)) -\end{lisp} -and \code{myfun} were passed a \code{foo}, then no type error would be -signaled, and we would try to multiply a \code{simple-string} as -though it were a float (with unpredictable results.) This is because -the check for \code{bar} was weakened to \code{structure}, yet when -compiling the call to \code{bar-a}, the compiler thinks it knows it -has a \code{bar}. - -Note that normally even weakened type checks report the precise type -in error messages. For example, if \code{myfun}'s \code{bar} check is -weakened to \code{structure}, and the argument is \false{}, then the -error will be: -\begin{example} -Type-error in MYFUN: - NIL is not of type BAR -\end{example} -However, there is some speed and space cost for signaling a precise -error, so the weakened type is reported if the \code{speed} -optimization quality is \code{3} or \code{debug} quality is less than -\code{1}: -\begin{example} -Type-error in MYFUN: - NIL is not of type STRUCTURE -\end{example} -\xlref{optimize-declaration} for further discussion of the -\code{optimize} declaration. - -%% -%%\node Getting Existing Programs to Run, Compiler Policy, Types in Python, The Compiler -\section{Getting Existing Programs to Run} -\cpsubindex{existing programs}{to run} -\cpsubindex{types}{portability} -\cindex{compatibility with other Lisps} - -Since \Python{} does much more comprehensive type checking than other -Lisp compilers, \Python{} will detect type errors in many programs -that have been debugged using other compilers.
These errors are -mostly incorrect declarations, although compile-time type errors can -find actual bugs if parts of the program have never been tested. - -Some incorrect declarations can only be detected by run-time type -checking. It is very important to initially compile programs with -full type checks and then test this version. After the checking -version has been tested, then you can consider weakening or -eliminating type checks. \b{This applies even to previously debugged - programs.} \Python{} does much more type inference than other -\llisp{} compilers, so believing an incorrect declaration does much -more damage. - -The most common problem is with variables whose initial value doesn't -match the type declaration. Incorrect initial values will always be -flagged by a compile-time type error, and they are simple to fix once -located. Consider this code fragment: -\begin{example} -(prog (foo) - (declare (fixnum foo)) - (setq foo ...) - ...) -\end{example} -Here the variable \code{foo} is given an initial value of \false, but -is declared to be a \code{fixnum}. Even if it is never read, the -initial value of a variable must match the declared type. There are -two ways to fix this problem. Change the declaration: -\begin{example} -(prog (foo) - (declare (type (or fixnum null) foo)) - (setq foo ...) - ...) -\end{example} -or change the initial value: -\begin{example} -(prog ((foo 0)) - (declare (fixnum foo)) - (setq foo ...) - ...) -\end{example} -It is generally preferable to change to a legal initial value rather -than to weaken the declaration, but sometimes it is simpler to weaken -the declaration than to try to make an initial value of the -appropriate type. - - -Another declaration problem occasionally encountered is incorrect -declarations on \code{defmacro} arguments. This usually -happens when a function is converted into a macro.
Consider this -macro: -\begin{lisp} -(defmacro my-1+ (x) - (declare (fixnum x)) - `(the fixnum (1+ ,x))) -\end{lisp} -Although legal and well-defined \clisp, the meaning of this -definition is almost certainly not what the writer intended. For -example, this call is illegal: -\begin{lisp} -(my-1+ (+ 4 5)) -\end{lisp} -The call is illegal because the argument to the macro is \w{\code{(+ 4 - 5)}}, which is a \code{list}, not a \code{fixnum}. Because of -macro semantics, it is hardly ever useful to declare the types of -macro arguments. If you really want to assert something about the -type of the result of evaluating a macro argument, then put a -\code{the} in the expansion: -\begin{lisp} -(defmacro my-1+ (x) - `(the fixnum (1+ (the fixnum ,x)))) -\end{lisp} -In this case, it would be stylistically preferable to change this -macro back to a function and declare it inline. Macros have no -efficiency advantage over inline functions when using \Python. -\xlref{inline-expansion}. - - -Some more subtle problems are caused by incorrect declarations that -can't be detected at compile time. Consider this code: -\begin{example} -(do ((pos 0 (position #\back{a} string :start (1+ pos)))) - ((null pos)) - (declare (fixnum pos)) - ...) -\end{example} -Although \code{pos} is almost always a \code{fixnum}, it is \false{} -at the end of the loop. If this example is compiled with full type -checks (the default), then running it will signal a type error at the -end of the loop. If compiled without type checks, the program will go -into an infinite loop (or perhaps \code{position} will complain -because \w{\code{(1+ nil)}} isn't a sensible start.) Why? Because if -you compile without type checks, the compiler just quietly believes -the type declaration. Since \code{pos} is always a \code{fixnum}, it -is never \nil, so \w{\code{(null pos)}} is never true, and the loop -exit test is optimized away.
Such errors are sometimes flagged by -unreachable code notes (\pxlref{dead-code-notes}), but it is still -important to initially compile any system with full type checks, even -if the system works fine when compiled using other compilers. - -In this case, the fix is to weaken the type declaration to -\w{\code{(or fixnum null)}}.\footnote{Actually, this declaration is - totally unnecessary in \Python, since it already knows - \code{position} returns a non-negative \code{fixnum} or \false.} -Note that there is usually little performance penalty for weakening a -declaration in this way. Any numeric operations in the body can still -assume the variable is a \code{fixnum}, since \false{} is not a legal -numeric argument. Another possible fix would be to say: -\begin{example} -(do ((pos 0 (position #\back{a} string :start (1+ pos)))) - ((null pos)) - (let ((pos pos)) - (declare (fixnum pos)) - ...)) -\end{example} -This would be preferable in some circumstances, since it would allow a -non-standard representation to be used for the local \code{pos} -variable in the loop body (see section \ref{ND-variables}.) - -In summary, remember that \i{all} values that a variable \i{ever} -has must be of the declared type, and that you should test using safe -code initially. -%% -%%\node Compiler Policy, Open Coding and Inline Expansion, Getting Existing Programs to Run, The Compiler -\section{Compiler Policy} -\label{compiler-policy} -\cpsubindex{policy}{compiler} -\cindex{compiler policy} - -The policy is what tells the compiler \var{how} to compile a program. -This is logically (and often textually) distinct from the program -itself. Broad control of policy is provided by the \code{optimize} -declaration; other declarations and variables control more specific -aspects of compilation. 
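
To illustrate how policy stays separate from the program text, here is a
minimal sketch that uses only the standard \code{optimize} declaration:
a global \code{declaim} sets a cautious default, and a local
\code{declare} overrides it for a single function. The function name
\code{sum-fixnums} is hypothetical, and the particular quality values
are just one plausible choice, not a recommendation from this manual.
\begin{lisp}
;; Global default: favor safety and debuggability.
(declaim (optimize (safety 2) (debug 2) (speed 1)))

;; Hypothetical function; only this definition is compiled for speed.
(defun sum-fixnums (numbers)
  (declare (optimize (speed 2) (safety 1))
           (type list numbers))
  (let ((sum 0))
    (declare (fixnum sum))
    ;; Assumes the running total stays within fixnum range.
    (dolist (x numbers sum)
      (setq sum (+ sum (the fixnum x))))))
\end{lisp}
Removing the local \code{optimize} declaration changes how the function
is compiled without touching its logic, which is the sense in which
policy is distinct from the program itself.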
- -%% -\begin{comment} -* The Optimize Declaration:: -* The Optimize-Interface Declaration:: -\end{comment} - -%%\node The Optimize Declaration, The Optimize-Interface Declaration, Compiler Policy, Compiler Policy -\subsection{The Optimize Declaration} -\label{optimize-declaration} -\cindex{optimize declaration} -\cpsubindex{declarations}{\code{optimize}} - -The \code{optimize} declaration recognizes six different -\var{qualities}. The qualities are conceptually independent aspects -of program performance. In reality, increasing one quality tends to -have adverse effects on other qualities. The compiler compares the -relative values of qualities when it needs to make a trade-off; i.e., -if \code{speed} is greater than \code{safety}, then improve speed at -the cost of safety. - -The default for all qualities (except \code{debug}) is \code{1}. -Whenever qualities are equal, ties are broken according to a broad -idea of what a good default environment is supposed to be. Generally -this downplays \code{speed}, \code{compilation-speed} and \code{space} in -favor of \code{safety} and \code{debug}. Novice and casual users -should stick to the default policy. Advanced users often want to -improve speed and memory usage at the cost of safety and -debuggability. - -If the value for a quality is \code{0} or \code{3}, then it may have a -special interpretation. A value of \code{0} means ``totally -unimportant'', and a \code{3} means ``ultimately important.'' These -extreme optimization values enable ``heroic'' compilation strategies -that are not always desirable and sometimes self-defeating. -Specifying more than one quality as \code{3} is not desirable, since -it doesn't tell the compiler which quality is most important. - - -These are the optimization qualities: -\begin{Lentry} - -\item[\code{speed}] \cindex{speed optimization quality}How fast the - program should run. \code{speed 3} enables some optimizations - that hurt debuggability.
- -\item[\code{compilation-speed}] \cindex{compilation-speed optimization - quality}How fast the compiler should run. Note that increasing - this above \code{safety} weakens type checking. - -\item[\code{space}] \cindex{space optimization quality}How much space - the compiled code should take up. Inline expansion is mostly - inhibited when \code{space} is greater than \code{speed}. A value - of \code{0} enables promiscuous inline expansion. Wide use of a - \code{0} value is not recommended, as it may waste so much space - that run time is slowed. \xlref{inline-expansion} for a discussion - of inline expansion. - -\item[\code{debug}] \cindex{debug optimization quality}How debuggable - the program should be. The quality is treated differently from the - other qualities: each value indicates a particular level of debugger - information; it is not compared with the other qualities. - \xlref{debugger-policy} for more details. - -\item[\code{safety}] \cindex{safety optimization quality}How much - error checking should be done. If \code{speed}, \code{space} or - \code{compilation-speed} is more important than \code{safety}, then - type checking is weakened (\pxlref{weakened-type-checks}). If - \code{safety} is \code{0}, then no run time error checking is done. - In addition to suppressing type checks, \code{0} also suppresses - argument count checking, unbound-symbol checking and array bounds - checks. - -\item[\code{extensions:inhibit-warnings}] \cindex{inhibit-warnings - optimization quality}This is a CMU extension that determines how - little (or how much) diagnostic output should be printed during - compilation. This quality is compared to other qualities to - determine whether to print style notes and warnings concerning those - qualities. If \code{speed} is greater than \code{inhibit-warnings}, - then notes about how to improve speed will be printed, etc.
The - default value is \code{1}, so raising the value for any standard - quality above its default enables notes for that quality. If - \code{inhibit-warnings} is \code{3}, then all notes and most - non-serious warnings are inhibited. This is useful with - \code{declare} to suppress warnings about unavoidable problems. -\end{Lentry} - -%%\node The Optimize-Interface Declaration, , The Optimize Declaration, Compiler Policy -\subsection{The Optimize-Interface Declaration} -\label{optimize-interface-declaration} -\cindex{optimize-interface declaration} -\cpsubindex{declarations}{\code{optimize-interface}} - -The \code{extensions:optimize-interface} declaration is identical in -syntax to the \code{optimize} declaration, but it specifies the policy -used during compilation of code the compiler automatically generates -to check the number and type of arguments supplied to a function. It -is useful to specify this policy separately, since even thoroughly -debugged functions are vulnerable to being passed the wrong arguments. -The \code{optimize-interface} declaration can specify that arguments -should be checked even when the general \code{optimize} policy is -unsafe. - -Note that this argument checking is the checking of user-supplied -arguments to any functions defined within the scope of the -declaration, \code{not} the checking of arguments to \llisp{} -primitives that appear in those definitions. - -The idea behind this declaration is that it allows the definition of -functions that appear fully safe to other callers, but that do no -internal error checking. Of course, it is possible that arguments may -be invalid in ways other than having incorrect type. Functions -compiled unsafely must still protect themselves against things like -user-supplied array indices that are out of bounds and improper lists. -See also the \kwd{context-declarations} option to -\macref{with-compilation-unit}. 
- -%% -%%\node Open Coding and Inline Expansion, , Compiler Policy, The Compiler -\section{Open Coding and Inline Expansion} -\label{open-coding} -\cindex{open-coding} -\cindex{inline expansion} -\cindex{static functions} - -Since \clisp{} forbids the redefinition of standard functions\footnote{See the -proposed X3J13 ``lisp-symbol-redefinition'' cleanup.}, the compiler can have -special knowledge of these standard functions embedded in it. This special -knowledge is used in various ways (open coding, inline expansion, source -transformation), but the implications to the user are basically the same: -\begin{itemize} - -\item Attempts to redefine standard functions may be frustrated, since - the function may never be called. Although it is technically - illegal to redefine standard functions, users sometimes want to - implicitly redefine these functions when they are debugging using - the \code{trace} macro. Special-casing of standard functions can be - inhibited using the \code{notinline} declaration. - -\item The compiler can have multiple alternate implementations of - standard functions that implement different trade-offs of speed, - space and safety. This selection is based on the compiler policy, - \pxlref{compiler-policy}. -\end{itemize} - - -When a function call is \i{open coded}, inline code whose effect is -equivalent to the function call is substituted for that function call. -When a function call is \i{closed coded}, it is usually left as is, -although it might be turned into a call to a different function with -different arguments. 
As an example, if \code{nthcdr} were to be open -coded, then -\begin{lisp} -(nthcdr 4 foobar) -\end{lisp} -might turn into -\begin{lisp} -(cdr (cdr (cdr (cdr foobar)))) -\end{lisp} -or even -\begin{lisp} -(do ((i 0 (1+ i)) - (list foobar (cdr foobar))) - ((= i 4) list)) -\end{lisp} - -If \code{nth} is closed coded, then -\begin{lisp} -(nth x l) -\end{lisp} -might stay the same, or turn into something like: -\begin{lisp} -(car (nthcdr x l)) -\end{lisp} - -In general, open coding sacrifices space for speed, but some functions (such as -\code{car}) are so simple that they are always open-coded. Even when not -open-coded, a call to a standard function may be transformed into a different -function call (as in the last example) or compiled as a \i{static call}. Static -function call uses a more efficient calling convention that forbids -redefinition. - -\hide{File:/afs/cs.cmu.edu/project/clisp/hackers/ram/docs/cmu-user/efficiency.ms} - - - -\hide{ -*- Dictionary: cmu-user -*- } -%%\node Advanced Compiler Use and Efficiency Hints, UNIX Interface, The Compiler, Top -\chapter{Advanced Compiler Use and Efficiency Hints} -\begin{center} -\b{By Robert MacLachlan} -\end{center} -\vspace{1 cm} -\label{advanced-compiler} - -\begin{comment} -* Advanced Compiler Introduction:: -* More About Types in Python:: -* Type Inference:: -* Source Optimization:: -* Tail Recursion:: -* Local Call:: -* Block Compilation:: -* Inline Expansion:: -* Byte Coded Compilation:: -* Object Representation:: -* Numbers:: -* General Efficiency Hints:: -* Efficiency Notes:: -* Profiling:: -\end{comment} - -%%\node Advanced Compiler Introduction, More About Types in Python, Advanced Compiler Use and Efficiency Hints, Advanced Compiler Use and Efficiency Hints -\section{Advanced Compiler Introduction} - -In \cmucl, as in any language on any computer, the path to efficient -code starts with good algorithms and sensible programming techniques, -but to avoid inefficiency pitfalls, you need to know some of this
-implementation's quirks and features. This chapter is mostly a fairly -long and detailed overview of what optimizations \python{} does. -Although there are the usual negative suggestions of inefficient -features to avoid, the main emphasis is on describing the things that -programmers can count on being efficient. - -The optimizations described here can have the effect of speeding up -existing programs written in conventional styles, but the potential -for new programming styles that are clearer and less error-prone is at -least as significant. For this reason, several sections end with a -discussion of the implications of these optimizations for programming -style. - -\begin{comment} -* Types:: -* Optimization:: -* Function Call:: -* Representation of Objects:: -* Writing Efficient Code:: -\end{comment} - -%%\node Types, Optimization, Advanced Compiler Introduction, Advanced Compiler Introduction -\subsection{Types} - -Python's support for types is unusual in three major ways: -\begin{itemize} - -\item Precise type checking encourages the specific use of type - declarations as a form of run-time consistency checking. This - speeds development by localizing type errors and giving more - meaningful error messages. \xlref{precise-type-checks}. \python{} - produces completely safe code; optimized type checking maintains - reasonable efficiency on conventional hardware - (\pxlref{type-check-optimization}.) - -\item Comprehensive support for the \clisp{} type system makes complex - type specifiers useful. Using type specifiers such as \code{or} and - \code{member} has both efficiency and robustness advantages. - \xlref{advanced-type-stuff}. - -\item Type inference eliminates the need for some declarations, and - also aids compile-time detection of type errors. Given detailed - type declarations, type inference can often eliminate type checks - and enable more efficient object representations and code sequences. - Checking all types results in fewer type checks. 
See sections - \ref{type-inference} and \ref{non-descriptor}. -\end{itemize} - - -%%\node Optimization, Function Call, Types, Advanced Compiler Introduction -\subsection{Optimization} - -The main barrier to efficient Lisp programs is not that there is no -efficient way to code the program in Lisp, but that it is difficult to -arrive at that efficient coding. Common Lisp is a highly complex -language, and usually has many semantically equivalent ``reasonable'' -ways to code a given problem. It is desirable to make all of these -equivalent solutions have comparable efficiency so that programmers -don't have to waste time discovering the most efficient solution. - -Source level optimization increases the number of efficient ways to -solve a problem. This effect is much larger than the increase in the -efficiency of the ``best'' solution. Source level optimization -transforms the original program into a more efficient (but equivalent) -program. Although the optimizer isn't doing anything the programmer -couldn't have done, this high-level optimization is important because: -\begin{itemize} - -\item The programmer can code simply and directly, rather than - obfuscating code to please the compiler. - -\item When presented with a choice of similar coding alternatives, the - programmer can choose whichever happens to be most convenient, - instead of worrying about which is most efficient. -\end{itemize} - -Source level optimization eliminates the need for macros to optimize -their expansion, and also increases the effectiveness of inline -expansion. See sections \ref{source-optimization} and -\ref{inline-expansion}. - -Efficient support for a safer programming style is the biggest -advantage of source level optimization. Existing tuned programs -typically won't benefit much from source optimization, since their -source has already been optimized by hand.
However, even tuned -programs tend to run faster under \python{} because: -\begin{itemize} - -\item Low level optimization and register allocation provide modest - speedups in any program. - -\item Block compilation and inline expansion can reduce function call - overhead, but may require some program restructuring. See sections - \ref{inline-expansion}, \ref{local-call} and - \ref{block-compilation}. - -\item Efficiency notes will point out important type declarations that - are often missed even in highly tuned programs. - \xlref{efficiency-notes}. - -\item Existing programs can be compiled safely without prohibitive - speed penalty, although they would be faster and safer with added - declarations. \xlref{type-check-optimization}. - -\item The context declaration mechanism allows both space and runtime - of large systems to be reduced without sacrificing robustness by - semi-automatically varying compilation policy without adding any - \code{optimize} declarations to the source. - \xlref{context-declarations}. - -\item Byte compilation can be used to dramatically reduce the size of - code that is not speed-critical. \xlref{byte-compile} -\end{itemize} - - -%%\node Function Call, Representation of Objects, Optimization, Advanced Compiler Introduction -\subsection{Function Call} - -The sort of symbolic programs generally written in \llisp{} often -favor recursion over iteration, or have inner loops so complex that -they involve multiple function calls. Such programs spend a larger -fraction of their time doing function calls than is the norm in other -languages; for this reason \llisp{} implementations strive to make the -general (or full) function call as inexpensive as possible. \python{} -goes beyond this by providing two good alternatives to full call: -\begin{itemize} - -\item Local call resolves function references at compile time, - allowing better calling sequences and optimization across function - calls. \xlref{local-call}.
- -\item Inline expansion totally eliminates call overhead and allows - many context dependent optimizations. This provides a safe and - efficient implementation of operations with function semantics, - eliminating the need for error-prone macro definitions or manual - case analysis. Although most \clisp{} implementations support - inline expansion, it becomes a more powerful tool with \python{}'s - source level optimization. See sections \ref{source-optimization} - and \ref{inline-expansion}. -\end{itemize} - - -Generally, \python{} provides simple implementations for simple uses -of function call, rather than having only a single calling convention. -These features allow a more natural programming style: -\begin{itemize} - -\item Proper tail recursion. \xlref{tail-recursion} - -\item Relatively efficient closures. - -\item A \code{funcall} that is as efficient as normal named call. - -\item Calls to local functions such as from \code{labels} are - optimized: -\begin{itemize} - -\item Control transfer is a direct jump. - -\item The closure environment is passed in registers rather than heap - allocated. - -\item Keyword arguments and multiple values are implemented more - efficiently. -\end{itemize} - -\xlref{local-call}. -\end{itemize} - -%%\node Representation of Objects, Writing Efficient Code, Function Call, Advanced Compiler Introduction -\subsection{Representation of Objects} - -Sometimes traditional \llisp{} implementation techniques compare so -poorly to the techniques used in other languages that \llisp{} can -become an impractical language choice. Terrible inefficiencies appear -in number-crunching programs, since \llisp{} numeric operations often -involve number-consing and generic arithmetic. \python{} supports -efficient natural representations for numbers (and some other types), -and allows these efficient representations to be used in more -contexts. \python{} also provides good efficiency notes that warn -when a crucial declaration is missing. 
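
As a rough illustration of the kind of declarations involved, the
sketch below declares both the array element type and the accumulator
type, which is the information a compiler needs before it can consider
unboxed float arithmetic in the loop. The function name
\code{sum-doubles} is hypothetical; how much number-consing is actually
avoided depends on the implementation and on the compilation policy.
\begin{lisp}
;; Hypothetical function, written in portable Common Lisp.
(defun sum-doubles (v)
  ;; The element type tells the compiler what AREF returns.
  (declare (type (simple-array double-float (*)) v))
  (let ((sum 0d0))
    ;; The accumulator never holds anything but a double-float.
    (declare (double-float sum))
    (dotimes (i (length v) sum)
      (incf sum (aref v i)))))
\end{lisp}
Without these declarations the same loop would have to fall back on
generic arithmetic for every addition.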
- -See section \ref{non-descriptor} for more about object representations and -numeric types. Also \pxlref{efficiency-notes} about efficiency notes. - -%%\node Writing Efficient Code, , Representation of Objects, Advanced Compiler Introduction -\subsection{Writing Efficient Code} -\label{efficiency-overview} - -Writing efficient code that works is a complex and prolonged process. -It is important not to get so involved in the pursuit of efficiency -that you lose sight of what the original problem demands. Remember -that: -\begin{itemize} - -\item The program should be correct\dash{}it doesn't matter how - quickly you get the wrong answer. - -\item Both the programmer and the user will make errors, so the - program must be robust\dash{}it must detect errors in a way that - allows easy correction. - -\item A small portion of the program will consume most of the - resources, with the bulk of the code being virtually irrelevant to - efficiency considerations. Even experienced programmers familiar - with the problem area cannot reliably predict where these ``hot - spots'' will be. -\end{itemize} - - - -The best way to get efficient code that is still worth using, is to separate -coding from tuning. During coding, you should: -\begin{itemize} - -\item Use a coding style that aids correctness and robustness without - being incompatible with efficiency. - -\item Choose appropriate data structures that allow efficient - algorithms and object representations - (\pxlref{object-representation}). Try to make interfaces abstract - enough so that you can change to a different representation if - profiling reveals a need. - -\item Whenever you make an assumption about a function argument or - global data structure, add consistency assertions, either with type - declarations or explicit uses of \code{assert}, \code{ecase}, etc. -\end{itemize} - -During tuning, you should: -\begin{itemize} - -\item Identify the hot spots in the program through profiling (section - \ref{profiling}.) 
- -\item Identify inefficient constructs in the hot spot with efficiency - notes, more profiling, or manual inspection of the source. See - sections \ref{general-efficiency} and \ref{efficiency-notes}. - -\item Add declarations and consider the application of optimizations. - See sections \ref{local-call}, \ref{inline-expansion} and - \ref{non-descriptor}. - -\item If all else fails, consider algorithm or data structure changes. - If you did a good job coding, changes will be easy to introduce. -\end{itemize} - - - -%% -%%\node More About Types in Python, Type Inference, Advanced Compiler Introduction, Advanced Compiler Use and Efficiency Hints -\section{More About Types in Python} -\label{advanced-type-stuff} -\cpsubindex{types}{in python} - -This section goes into more detail describing what types and declarations are -recognized by \python. The area where \python{} differs most radically from -previous \llisp{} compilers is in its support for types: -\begin{itemize} - -\item Precise type checking helps to find bugs at run time. - -\item Compile-time type checking helps to find bugs at compile time. - -\item Type inference minimizes the need for generic operations, and - also increases the efficiency of run time type checking and the - effectiveness of compile time type checking. - -\item Support for detailed types provides a wealth of opportunity for - operation-specific type inference and optimization. 
-\end{itemize} - - - -\begin{comment} -* More Types Meaningful:: -* Canonicalization:: -* Member Types:: -* Union Types:: -* The Empty Type:: -* Function Types:: -* The Values Declaration:: -* Structure Types:: -* The Freeze-Type Declaration:: -* Type Restrictions:: -* Type Style Recommendations:: -\end{comment} - -%%\node More Types Meaningful, Canonicalization, More About Types in Python, More About Types in Python -\subsection{More Types Meaningful} - -\clisp{} has a very powerful type system, but conventional \llisp{} -implementations typically only recognize the small set of types -special in that implementation. In these systems, there is an -unfortunate paradox: a declaration for a relatively general type like -\code{fixnum} will be recognized by the compiler, but a highly -specific declaration such as \code{\w{(integer 3 17)}} is totally -ignored. - -This is obviously a problem, since the user has to know how to specify -the type of an object in the way the compiler wants it. A very -minimal (but rarely satisfied) criterion for type system support is -that it be no worse to make a specific declaration than to make a -general one. \python{} goes beyond this by exploiting a number of -advantages obtained from detailed type information. - -Using more restrictive types in declarations allows the compiler to do -better type inference and more compile-time type checking. Also, when -type declarations are considered to be consistency assertions that -should be verified (conditional on policy), then complex types are -useful for making more detailed assertions. - -Python ``understands'' the list-style \code{or}, \code{member}, -\code{function}, array and number type specifiers. Understanding -means that: -\begin{itemize} - -\item If the type contains more information than is used in a - particular context, then the extra information is simply ignored, - rather than derailing type inference. 
- -\item In many contexts, the extra information from these type - specifiers is used to good effect. In particular, type checking in - \python{} is \var{precise}, so these complex types can be used - in declarations to make interesting assertions about functions and - data structures (\pxlref{precise-type-checks}.) More specific - declarations also aid type inference and reduce the cost for type - checking. -\end{itemize} - -For related information, \pxlref{numeric-types} for numeric types, and -section \ref{array-types} for array types. - - -%%\node Canonicalization, Member Types, More Types Meaningful, More About Types in Python -\subsection{Canonicalization} -\cpsubindex{types}{equivalence} -\cindex{canonicalization of types} -\cindex{equivalence of types} - -When given a type specifier, \python{} will often rewrite it into a -different (but equivalent) type. This is the mechanism that \python{} -uses for detecting type equivalence. For example, in \python{}'s -canonical representation, these types are equivalent: -\begin{example} -(or list (member :end)) \myequiv (or cons (member nil :end)) -\end{example} -This has two implications for the user: -\begin{itemize} - -\item The standard symbol type specifiers for \code{atom}, - \code{null}, \code{fixnum}, etc., are in no way magical. The - \tindexed{null} type is actually defined to be \code{\w{(member - nil)}}, \tindexed{list} is \code{\w{(or cons null)}}, and - \tindexed{fixnum} is \code{\w{(signed-byte 30)}}. - -\item When the compiler prints out a type, it may not look like the - type specifier that originally appeared in the program. This is - generally not a problem, but it must be taken into consideration - when reading compiler error messages.
-\end{itemize} - - -%%\node Member Types, Union Types, Canonicalization, More About Types in Python -\subsection{Member Types} -\cindex{member types} - -The \tindexed{member} type specifier can be used to represent -``symbolic'' values, analogous to the enumerated types of Pascal. For -example, the second value of \code{find-symbol} has this type: -\begin{lisp} -(member :internal :external :inherited nil) -\end{lisp} -Member types are very useful for expressing consistency constraints on data -structures, for example: -\begin{lisp} -(defstruct ice-cream - (flavor :vanilla :type (member :vanilla :chocolate :strawberry))) -\end{lisp} -Member types are also useful in type inference, as the number of members can -sometimes be pared down to one, in which case the value is a known constant. - -%%\node Union Types, The Empty Type, Member Types, More About Types in Python -\subsection{Union Types} -\cindex{union (\code{or}) types} -\cindex{or (union) types} - -The \tindexed{or} (union) type specifier is understood, and is -meaningfully applied in many contexts. The use of \code{or} allows -assertions to be made about types in dynamically typed programs. For -example: -\begin{lisp} -(defstruct box - (next nil :type (or box null)) - (top :removed :type (or box-top (member :removed)))) -\end{lisp} -The type assertion on the \code{top} slot ensures that an error will be signaled -when there is an attempt to store an illegal value (such as \kwd{rmoved}.) -Although somewhat weak, these union type assertions provide a useful input into -type inference, allowing the cost of type checking to be reduced. 
For example, -this loop is safely compiled with no type checks: -\begin{lisp} -(defun find-box-with-top (box) - (declare (type (or box null) box)) - (do ((current box (box-next current))) - ((null current)) - (unless (eq (box-top current) :removed) - (return current)))) -\end{lisp} - -Union types are also useful in type inference for representing types that are -partially constrained. For example, the result of this expression: -\begin{lisp} -(if foo - (logior x y) - (list x y)) -\end{lisp} -can be expressed as \code{\w{(or integer cons)}}. - -%%\node The Empty Type, Function Types, Union Types, More About Types in Python -\subsection{The Empty Type} -\label{empty-type} -\cindex{NIL type} -\cpsubindex{empty type}{the} -\cpsubindex{errors}{result type of} - -The type \false{} is also called the empty type, since no object is of -type \false{}. The union of no types, \code{(or)}, is also empty. -\python{}'s interpretation of an expression whose type is \false{} is -that the expression never yields any value, but rather fails to -terminate, or is thrown out of. For example, the type of a call to -\code{error} or a use of \code{return} is \false{}. When the type of -an expression is empty, compile-time type warnings about its value are -suppressed; presumably somebody else is signaling an error. If a -function is declared to have return type \false{}, but does in fact -return, then (in safe compilation policies) a ``\code{NIL Function - returned}'' error will be signaled. See also the function -\funref{required-argument}. - -%%\node Function Types, The Values Declaration, The Empty Type, More About Types in Python -\subsection{Function Types} -\label{function-types} -\cpsubindex{function}{types} -\cpsubindex{types}{function} - -\findexed{function} types are understood in the restrictive sense, specifying: -\begin{itemize} - -\item The argument syntax that the function must be called with. 
This - is information about what argument counts are acceptable, and which - keyword arguments are recognized. In \python, warnings about - argument syntax are a consequence of function type checking. - -\item The types of the argument values that the caller must pass. If - the compiler can prove that some argument to a call is of a type - disallowed by the called function's type, then it will give a - compile-time type warning. In addition to being used for - compile-time type checking, these type assertions are also used as - output type assertions in code generation. For example, if - \code{foo} is declared to have a \code{fixnum} argument, then the - \code{1+} in \w{\code{(foo (1+ x))}} is compiled with knowledge that - the result must be a fixnum. - -\item The types of the values that will be bound to argument variables in - the function's definition. Declaring a function's type with - \code{ftype} implicitly declares the types of the arguments in the - definition. \python{} checks for consistency between the definition - and the \code{ftype} declaration. Because of precise type checking, - an error will be signaled when a function is called with an - argument of the wrong type. - -\item The type of return value(s) that the caller can expect. This - information is a useful input to type inference. For example, if a - function is declared to return a \code{fixnum}, then when a call to - that function appears in an expression, the expression will be - compiled with knowledge that the call will return a \code{fixnum}. - -\item The type of return value(s) that the definition must return. - The result type in an \code{ftype} declaration is treated like an - implicit \code{the} wrapped around the body of the definition. If - the definition returns a value of the wrong type, an error will be - signaled. If the compiler can prove that the function returns the - wrong type, then it will give a compile-time warning.
-\end{itemize} - -This is consistent with the new interpretation of function types and -the \code{ftype} declaration in the proposed X3J13 -``function-type-argument-type-semantics'' cleanup. Note also, that if -you don't explicitly declare the type of a function using a global -\code{ftype} declaration, then \python{} will compute a function type -from the definition, providing a degree of inter-routine type -inference, \pxlref{function-type-inference}. - -%%\node The Values Declaration, Structure Types, Function Types, More About Types in Python -\subsection{The Values Declaration} -\cindex{values declaration} - -\cmucl{} supports the \code{values} declaration as an extension to -\clisp. The syntax is {\code{(values \var{type1} - \var{type2}$\ldots$\var{typen})}}. This declaration is -semantically equivalent to a \code{the} form wrapped around the body -of the special form in which the \code{values} declaration appears. -The advantage of \code{values} over \findexed{the} is purely -syntactic\dash{}it doesn't introduce more indentation. For example: -\begin{example} -(defun foo (x) - (declare (values single-float)) - (ecase x - (:this ...) - (:that ...) - (:the-other ...))) -\end{example} -is equivalent to: -\begin{example} -(defun foo (x) - (the single-float - (ecase x - (:this ...) - (:that ...) - (:the-other ...)))) -\end{example} -and -\begin{example} -(defun floor (number &optional (divisor 1)) - (declare (values integer real)) - ...) -\end{example} -is equivalent to: -\begin{example} -(defun floor (number &optional (divisor 1)) - (the (values integer real) - ...)) -\end{example} -In addition to being recognized by \code{lambda} (and hence by -\code{defun}), the \code{values} declaration is recognized by all the -other special forms with bodies and declarations: \code{let}, -\code{let*}, \code{labels} and \code{flet}. 
Macros with declarations -usually splice the declarations into one of the above forms, so they -will accept this declaration too, but the exact effect of a -\code{values} declaration will depend on the macro. - -If you declare the types of all arguments to a function, and also -declare the return value types with \code{values}, you have described -the type of the function. \python{} will use this argument and result -type information to derive a function type that will then be applied -to calls of the function (\pxlref{function-types}.) This provides a -way to declare the types of functions that is much less syntactically -awkward than using the \code{ftype} declaration with a \code{function} -type specifier. - -Although the \code{values} declaration is non-standard, it is -relatively harmless to use it in otherwise portable code, since any -warning in non-CMU implementations can be suppressed with the standard -\code{declaration} proclamation. - -%%\node Structure Types, The Freeze-Type Declaration, The Values Declaration, More About Types in Python -\subsection{Structure Types} -\label{structure-types} -\cindex{structure types} -\cindex{defstruct types} -\cpsubindex{types}{structure} - -Because of precise type checking, structure types are much better supported by -Python than by conventional compilers: -\begin{itemize} - -\item The structure argument to structure accessors is precisely - checked\dash{}if you call \code{foo-a} on a \code{bar}, an error - will be signaled. - -\item The types of slot values are precisely checked\dash{}if you pass - the wrong type argument to a constructor or a slot setter, then an - error will be signaled. -\end{itemize} -This error checking is tremendously useful for detecting bugs in -programs that manipulate complex data structures. - -An additional advantage of checking structure types and enforcing slot -types is that the compiler can safely believe slot type declarations. 
-\python{} effectively moves the type checking from the slot access to -the slot setter or constructor call. This is more efficient since the -caller of the setter or constructor often knows the type of the value, -entirely eliminating the need to check the value's type. Consider -this example: -\begin{lisp} -(defstruct coordinate - (x nil :type single-float) - (y nil :type single-float)) - -(defun make-it () - (make-coordinate :x 1.0 :y 1.0)) - -(defun use-it (it) - (declare (type coordinate it)) - (sqrt (+ (expt (coordinate-x it) 2) (expt (coordinate-y it) 2)))) -\end{lisp} -\code{make-it} and \code{use-it} are compiled with no checking on the -types of the float slots, yet \code{use-it} can use -\code{single-float} arithmetic with perfect safety. Note that -\code{make-coordinate} must still check the values of \code{x} and -\code{y} unless the call is block compiled or inline expanded -(\pxlref{local-call}.) But even without this advantage, it is almost -always more efficient to check slot values on structure -initialization, since slots are usually written once and read many -times. - -%%\node The Freeze-Type Declaration, Type Restrictions, Structure Types, More About Types in Python -\subsection{The Freeze-Type Declaration} -\cindex{freeze-type declaration} -\label{freeze-type} - -The \code{extensions:freeze-type} declaration is a CMU extension that -enables more efficient compilation of user-defined types by asserting -that the definition is not going to change. This declaration may only -be used globally (with \code{declaim} or \code{proclaim}). Currently -\code{freeze-type} only affects structure type testing done by -\code{typep}, \code{typecase}, etc. Here is an example: -\begin{lisp} -(declaim (freeze-type foo bar)) -\end{lisp} -This asserts that the types \code{foo} and \code{bar} and their -subtypes are not going to change.
This allows more efficient type -testing, since the compiler can open-code a test for all possible -subtypes, rather than having to examine the type hierarchy at -run-time. - -%%\node Type Restrictions, Type Style Recommendations, The Freeze-Type Declaration, More About Types in Python -\subsection{Type Restrictions} -\cpsubindex{types}{restrictions on} - -Avoid use of the \code{and}, \code{not} and \code{satisfies} types in -declarations, since type inference has problems with them. When these -types do appear in a declaration, they are still checked precisely, -but the type information is of limited use to the compiler. -\code{and} types are effective as long as the intersection can be -canonicalized to a type that doesn't use \code{and}. For example: -\begin{example} -(and fixnum unsigned-byte) -\end{example} -is fine, since it is the same as: -\begin{example} -(integer 0 \var{most-positive-fixnum}) -\end{example} -but this type: -\begin{example} -(and symbol (not (member :end))) -\end{example} -will not be fully understood by type inference since the \code{and} -can't be removed by canonicalization. - -Using any of these type specifiers in a type test with \code{typep} or -\code{typecase} is fine, since as tests, these types can be translated -into the \code{and} macro, the \code{not} function or a call to the -satisfies predicate. - -%%\node Type Style Recommendations, , Type Restrictions, More About Types in Python -\subsection{Type Style Recommendations} -\cindex{style recommendations} - -Python provides good support for some currently unconventional ways of -using the \clisp{} type system. With \python, it is desirable to make -declarations as precise as possible, but type inference also makes -some declarations unnecessary.
Here are some general guidelines for -maximum robustness and efficiency: -\begin{itemize} - -\item Declare the types of all function arguments and structure slots - as precisely as possible (while avoiding \code{not}, \code{and} and - \code{satisfies}). Put these declarations in during initial coding - so that type assertions can find bugs for you during debugging. - -\item Use the \tindexed{member} type specifier where there are a small - number of possible symbol values, for example: \w{\code{(member :red - :blue :green)}}. - -\item Use the \tindexed{or} type specifier in situations where the - type is not certain, but there are only a few possibilities, for - example: \w{\code{(or list vector)}}. - -\item Declare integer types with the tightest bounds that you can, - such as \code{\w{(integer 3 7)}}. - -\item Define \findexed{deftype} or \findexed{defstruct} types before - they are used. Definition after use is legal (producing no - ``undefined type'' warnings), but type tests and structure - operations will be compiled much less efficiently. - -\item Use the \code{extensions:freeze-type} declaration to speed up - type testing for structure types which won't have new subtypes added - later. \xlref{freeze-type} - -\item In addition to declaring the array element type and simpleness, - also declare the dimensions if they are fixed, for example: - \begin{example} - (simple-array single-float (1024 1024)) - \end{example} - This bounds information allows array indexing for multi-dimensional - arrays to be compiled much more efficiently, and may also allow - array bounds checking to be done at compile time. - \xlref{array-types}. - -\item Avoid use of the \findexed{the} declaration within expressions. - Not only does it clutter the code, but it is also almost worthless - under safe policies. 
If the need for an output type assertion is - revealed by efficiency notes during tuning, then you can consider - \code{the}, but it is preferable to constrain the argument types - more, allowing the compiler to prove the desired result type. - -\item Don't bother declaring the type of \findexed{let} or other - non-argument variables unless the type is non-obvious. If you - declare function return types and structure slot types, then the - type of a variable is often obvious both to the programmer and to - the compiler. An important case where the type isn't obvious, and a - declaration is appropriate, is when the value for a variable is - pulled out of untyped structure (e.g., the result of \code{car}), or - comes from some weakly typed function, such as \code{read}. - -\item Declarations are sometimes necessary for integer loop variables, - since the compiler can't always prove that the value is of a good - integer type. These declarations are best added during tuning, when - an efficiency note indicates the need. -\end{itemize} - - -%% -%%\node Type Inference, Source Optimization, More About Types in Python, Advanced Compiler Use and Efficiency Hints -\section{Type Inference} -\label{type-inference} -\cindex{type inference} -\cindex{inference of types} -\cindex{derivation of types} - -Type inference is the process by which the compiler tries to figure -out the types of expressions and variables, given an inevitable lack -of complete type information. Although \python{} does much more type -inference than most \llisp{} compilers, remember that the more precise -and comprehensive type declarations are, the more type inference will -be able to do. 
- -\begin{comment} -* Variable Type Inference:: -* Local Function Type Inference:: -* Global Function Type Inference:: -* Operation Specific Type Inference:: -* Dynamic Type Inference:: -* Type Check Optimization:: -\end{comment} - -%%\node Variable Type Inference, Local Function Type Inference, Type Inference, Type Inference -\subsection{Variable Type Inference} -\label{variable-type-inference} - -The type of a variable is the union of the types of all the -definitions. In the degenerate case of a let, the type of the -variable is the type of the initial value. This inferred type is -intersected with any declared type, and is then propagated to all the -variable's references. The types of \findexed{multiple-value-bind} -variables are similarly inferred from the types of the individual -values of the values form. - -If multiple type declarations apply to a single variable, then all the -declarations must be correct; it is as though all the types were intersected -producing a single \tindexed{and} type specifier. In this example: -\begin{example} -(defmacro my-dotimes ((var count) &body body) - `(do ((,var 0 (1+ ,var))) - ((>= ,var ,count)) - (declare (type (integer 0 *) ,var)) - ,@body)) - -(my-dotimes (i ...) - (declare (fixnum i)) - ...) -\end{example} -the two declarations for \code{i} are intersected, so \code{i} is -known to be a non-negative fixnum. - -In practice, this type inference is limited to lets and local -functions, since the compiler can't analyze all the calls to a global -function. But type inference works well enough on local variables so -that it is often unnecessary to declare the type of local variables. -This is especially likely when function result types and structure -slot types are declared. 
The main areas where type inference breaks -down are: -\begin{itemize} - -\item When the initial value of a variable is an untyped expression, - such as \code{\w{(car x)}}, and - -\item When the type of one of the variable's definitions is a function - of the variable's current value, as in: \code{(setq x (1+ x))} -\end{itemize} - - -%%\node Local Function Type Inference, Global Function Type Inference, Variable Type Inference, Type Inference -\subsection{Local Function Type Inference} -\cpsubindex{local call}{type inference} - -The types of arguments to local functions are inferred in the same way -as any other local variable; the type is the union of the argument -types across all the calls to the function, intersected with the -declared type. If there are any assignments to the argument -variables, the type of the assigned value is unioned in as well. - -The result type of a local function is computed in a special way that -takes tail recursion (\pxlref{tail-recursion}) into consideration. -The result type is the union of all possible return values that aren't -tail-recursive calls. For example, \python{} will infer that the -result type of this function is \code{integer}: -\begin{lisp} -(defun ! (n res) - (declare (integer n res)) - (if (zerop n) - res - (! (1- n) (* n res)))) -\end{lisp} -Although this is a rather obvious result, it becomes somewhat less -trivial in the presence of mutual tail recursion of multiple -functions. Local function result type inference interacts with the -mechanisms for ensuring proper tail recursion mentioned in section -\ref{local-call-return}.
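-
-As a minimal sketch of the mutual tail recursion case (the names
-\code{parity}, \code{even-step} and \code{odd-step} are hypothetical,
-and the derived type is what the rule above predicts, not verified
-compiler output), consider:
-\begin{lisp}
-(defun parity (n)
-  (declare (type (integer 0 *) n))
-  (labels ((even-step (k)
-             ;; The call to ODD-STEP is a tail call, so it adds
-             ;; nothing of its own to the result type.
-             (if (zerop k) :even (odd-step (1- k))))
-           (odd-step (k)
-             (if (zerop k) :odd (even-step (1- k)))))
-    (even-step n)))
-\end{lisp}
-The only return values that are not tail-recursive calls are
-\code{:even} and \code{:odd}, so both local functions would be
-expected to get the result type \w{\code{(member :even :odd)}}.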
- -%%\node Global Function Type Inference, Operation Specific Type Inference, Local Function Type Inference, Type Inference -\subsection{Global Function Type Inference} -\label{function-type-inference} -\cpsubindex{function}{type inference} - -As described in section \ref{function-types}, a global function type -(\tindexed{ftype}) declaration places implicit type assertions on the -call arguments, and also guarantees the type of the return value. So -wherever a call to a declared function appears, there is no doubt as -to the types of the arguments and return value. Furthermore, -\python{} will infer a function type from the function's definition if -there is no \code{ftype} declaration. Any type declarations on the -argument variables are used as the argument types in the derived -function type, and the compiler's best guess for the result type of -the function is used as the result type in the derived function type. - -This method of deriving function types from the definition implicitly assumes -that functions won't be redefined at run-time. Consider this example: -\begin{lisp} -(defun foo-p (x) - (let ((res (and (consp x) (eq (car x) 'foo)))) - (format t "It is ~:[not ~;~]foo." res))) - -(defun frob (it) - (if (foo-p it) - (setf (cadr it) 'yow!) - (1+ it))) -\end{lisp} - -Presumably, the programmer really meant to return \code{res} from -\code{foo-p}, but he seems to have forgotten. When he tries to -evaluate \code{\w{(frob (list 'foo nil))}}, \code{frob} will flame out when -it tries to add to a \code{cons}. Realizing his error, he fixes -\code{foo-p} and recompiles it. But when he retries his test case, he -is baffled because the error is still there. What happened in this -example is that \python{} proved that the result of \code{foo-p} is -\code{null}, and then proceeded to optimize away the \code{setf} in -\code{frob}. - -Fortunately, in this example, the error is detected at compile time -due to notes about unreachable code (\pxlref{dead-code-notes}.)
-Still, some users may not want to worry about this sort of problem -during incremental development, so there is a variable to control -deriving function types. - -\begin{defvar}{extensions:}{derive-function-types} - - If true (the default), argument and result type information derived - from compilation of \code{defun}s is used when compiling calls to - that function. If false, only information from \code{ftype} - proclamations will be used. -\end{defvar} - -%%\node Operation Specific Type Inference, Dynamic Type Inference, Global Function Type Inference, Type Inference -\subsection{Operation Specific Type Inference} -\label{operation-type-inference} -\cindex{operation specific type inference} -\cindex{arithmetic type inference} -\cpsubindex{numeric}{type inference} - -Many of the standard \clisp{} functions have special type inference -procedures that determine the result type as a function of the -argument types. For example, the result type of \code{aref} is the -array element type. Here are some other examples of type inferences: -\begin{lisp} -(logand x #xFF) \result{} (unsigned-byte 8) - -(+ (the (integer 0 12) x) (the (integer 0 1) y)) \result{} (integer 0 13) - -(ash (the (unsigned-byte 16) x) -8) \result{} (unsigned-byte 8) -\end{lisp} - -%%\node Dynamic Type Inference, Type Check Optimization, Operation Specific Type Inference, Type Inference -\subsection{Dynamic Type Inference} -\label{constraint-propagation} -\cindex{dynamic type inference} -\cindex{conditional type inference} -\cpsubindex{type inference}{dynamic} - -Python uses flow analysis to infer types in dynamically typed -programs. For example: -\begin{example} -(etypecase x - (list (length x)) - ...) -\end{example} -Here, the compiler knows the argument to \code{length} is a list, -because the call to \code{length} is only done when \code{x} is a -list. The most significant efficiency effect of inference from -assertions is usually in type check optimization.
- - -Dynamic type inference has two inputs: explicit conditionals and -implicit or explicit type assertions. Flow analysis propagates these -constraints on variable type to any code that can be executed only -after passing through the constraint. Explicit type constraints come -from \findexed{if}s where the test is either a lexical variable or a -function of lexical variables and constants, where the function is -either a type predicate, a numeric comparison or \code{eq}. - -If there is an \code{eq} (or \code{eql}) test, then the compiler will -actually substitute one argument for the other in the true branch. -For example: -\begin{lisp} -(when (eq x :yow!) (return x)) -\end{lisp} -becomes: -\begin{lisp} -(when (eq x :yow!) (return :yow!)) -\end{lisp} -This substitution is done when one argument is a constant, or one -argument has better type information than the other. This -transformation reveals opportunities for constant folding or -type-specific optimizations. If the test is against a constant, then -the compiler can prove that the variable is not that constant value in -the false branch, or \w{\code{(not (member :yow!))}} in the example -above. This can eliminate redundant tests, for example: -\begin{example} -(if (eq x nil) - ... - (if x a b)) -\end{example} -is transformed to this: -\begin{example} -(if (eq x nil) - ... - a) -\end{example} -Variables appearing as \code{if} tests are interpreted as -\code{\w{(not (eq \var{var} nil))}} tests. The compiler also converts -\code{=} into \code{eql} where possible. It is difficult to do -inference directly on \code{=} since it does implicit coercions. - -When there is an explicit \code{$<$} or \code{$>$} test on -\begin{changebar} - numeric -\end{changebar} -variables, the compiler makes inferences about the ranges the -variables can assume in the true and false branches. This is mainly -useful when it proves that the values are small enough in magnitude to -allow open-coding of arithmetic operations.
For example, in many uses -of \code{dotimes} with a \code{fixnum} repeat count, the compiler -proves that fixnum arithmetic can be used. - -Implicit type assertions are quite common, especially if you declare -function argument types. Dynamic inference from implicit type -assertions sometimes helps to disambiguate programs to a useful -degree, but is most noticeable when it detects a dynamic type error. -For example: -\begin{lisp} -(defun foo (x) - (+ (car x) x)) -\end{lisp} -results in this warning: -\begin{example} -In: DEFUN FOO - (+ (CAR X) X) -==> - X -Warning: Result is a LIST, not a NUMBER. -\end{example} - -Note that \llisp{}'s dynamic type checking semantics make dynamic type -inference useful even in programs that aren't really dynamically -typed, for example: -\begin{lisp} -(+ (car x) (length x)) -\end{lisp} -Here, \code{x} presumably always holds a list, but in the absence of a -declaration the compiler cannot assume \code{x} is a list simply -because list-specific operations are sometimes done on it. The -compiler must consider the program to be dynamically typed until it -proves otherwise. Dynamic type inference proves that the argument to -\code{length} is always a list because the call to \code{length} is -only done after the list-specific \code{car} operation. - - -%%\node Type Check Optimization, , Dynamic Type Inference, Type Inference -\subsection{Type Check Optimization} -\label{type-check-optimization} -\cpsubindex{type checking}{optimization} -\cpsubindex{optimization}{type check} - -Python backs up its support for precise type checking by minimizing -the cost of run-time type checking. This is done both through type -inference and through optimizations of type checking itself. - -Type inference often allows the compiler to prove that a value is of -the correct type, and thus no type check is necessary.
For example: -\begin{lisp} -(defstruct foo a b c) -(defstruct link - (foo (required-argument) :type foo) - (next nil :type (or link null))) - -(foo-a (link-foo x)) -\end{lisp} -Here, there is no need to check that the result of \code{link-foo} is -a \code{foo}, since it always is. Even when some type checks are -necessary, type inference can often reduce the number: -\begin{example} -(defun test (x) - (let ((a (foo-a x)) - (b (foo-b x)) - (c (foo-c x))) - ...)) -\end{example} -In this example, only one \w{\code{(foo-p x)}} check is needed. This -applies to a lesser degree in list operations, such as: -\begin{lisp} -(if (eql (car x) 3) (cdr x) y) -\end{lisp} -Here, we only have to check that \code{x} is a list once. - -Since \python{} recognizes explicit type tests, code that explicitly -protects itself against type errors has little introduced overhead due -to implicit type checking. For example, this loop compiles with no -implicit checks for \code{car} and \code{cdr}: -\begin{lisp} -(defun memq (e l) - (do ((current l (cdr current))) - ((atom current) nil) - (when (eq (car current) e) (return current)))) -\end{lisp} - -\cindex{complemented type checks} -Python reduces the cost of checks that must be done through an -optimization called \var{complementing}. A complemented check for -\var{type} is simply a check that the value is not of the type -\w{\code{(not \var{type})}}. This is only interesting when something -is known about the actual type, in which case we can test for the -complement of \w{\code{(and \var{known-type} (not \var{type}))}}, or -the difference between the known type and the assertion. An example: -\begin{lisp} -(link-foo (link-next x)) -\end{lisp} -Here, we change the type check for \code{link-foo} from a test for -\code{link} to a test for: -\begin{lisp} -(not (and (or link null) (not link))) -\end{lisp} -or more simply \w{\code{(not null)}}.
This is probably the most -important use of complementing, since the situation is fairly common, -and a \code{null} test is much cheaper than a structure type test. - -Here is a more complicated example that illustrates the combination of -complementing with dynamic type inference: -\begin{lisp} -(defun find-a (a x) - (declare (type (or link null) x)) - (do ((current x (link-next current))) - ((null current) nil) - (let ((foo (link-foo current))) - (when (eq (foo-a foo) a) (return foo))))) -\end{lisp} -This loop can be compiled with no type checks. The \code{link} test -for \code{link-foo} and \code{link-next} is complemented to -\w{\code{(not null)}}, and then deleted because of the explicit -\code{null} test. As before, no check is necessary for \code{foo-a}, -since the \code{link-foo} is always a \code{foo}. This sort of -situation shows how precise type checking combined with precise -declarations can actually result in reduced type checking. - -%% -%%\node Source Optimization, Tail Recursion, Type Inference, Advanced Compiler Use and Efficiency Hints -\section{Source Optimization} -\label{source-optimization} -\cindex{optimization} - -This section describes source-level transformations that \python{} does on -programs in an attempt to make them more efficient. Although source-level -optimizations can make existing programs more efficient, the biggest advantage -of this sort of optimization is that it makes it easier to write efficient -programs. If a clean, straightforward implementation can be transformed -into an efficient one, then there is no need for tricky and dangerous hand -optimization.
- -\begin{comment} -* Let Optimization:: -* Constant Folding:: -* Unused Expression Elimination:: -* Control Optimization:: -* Unreachable Code Deletion:: -* Multiple Values Optimization:: -* Source to Source Transformation:: -* Style Recommendations:: -\end{comment} - -%%\node Let Optimization, Constant Folding, Source Optimization, Source Optimization -\subsection{Let Optimization} -\label{let-optimization} - -\cindex{let optimization} \cpsubindex{optimization}{let} - -The primary optimization of let variables is to delete them when they -are unnecessary. Whenever the value of a let variable is a constant, -a constant variable or a constant (local or non-notinline) function, -the variable is deleted, and references to the variable are replaced -with references to the constant expression. This is useful primarily -in the expansion of macros or inline functions, where argument values -are often constant in any given call, but are in general non-constant -expressions that must be bound to preserve order of evaluation. Let -variable optimization eliminates the need for macros to carefully -avoid spurious bindings, and also makes inline functions just as -efficient as macros. - -A particularly interesting class of constant is a local function. -Substituting for lexical variables that are bound to a function can -substantially improve the efficiency of functional programming styles, -for example: -\begin{lisp} -(let ((a #'(lambda (x) (zow x)))) - (funcall a 3)) -\end{lisp} -effectively transforms to: -\begin{lisp} -(zow 3) -\end{lisp} -This transformation is done even when the function is a closure, as in: -\begin{lisp} -(let ((a (let ((y (zug))) - #'(lambda (x) (zow x y))))) - (funcall a 3)) -\end{lisp} -becoming: -\begin{lisp} -(zow 3 (zug)) -\end{lisp} - -A constant variable is a lexical variable that is never assigned to, -always keeping its initial value. Whenever possible, avoid setting -lexical variables\dash{}instead bind a new variable to the new value. 
-Except for loop variables, it is almost always possible to avoid -setting lexical variables. This form: -\begin{example} -(let ((x (f x))) - ...) -\end{example} -is \var{more} efficient than this form: -\begin{example} -(setq x (f x)) -... -\end{example} -Setting variables makes the program more difficult to understand, both -to the compiler and to the programmer. \python{} compiles assignments -at least as efficiently as any other \llisp{} compiler, but most let -optimizations are only done on constant variables. - -Constant variables with only a single use are also optimized away, -even when the initial value is not constant.\footnote{The source - transformation in this example doesn't represent the preservation of - evaluation order implicit in the compiler's internal representation. - Where necessary, the back end will reintroduce temporaries to - preserve the semantics.} For example, this expansion of -\code{incf}: -\begin{lisp} -(let ((#:g3 (+ x 1))) - (setq x #:G3)) -\end{lisp} -becomes: -\begin{lisp} -(setq x (+ x 1)) -\end{lisp} -The type semantics of this transformation are more important than the -elimination of the variable itself. Consider what happens when -\code{x} is declared to be a \code{fixnum}; after the transformation, -the compiler can compile the addition knowing that the result is a -\code{fixnum}, whereas before the transformation the addition would -have to allow for fixnum overflow. - -Another variable optimization deletes any variable that is never read. -This causes the initial value and any assigned values to be unused, -allowing those expressions to be deleted if they have no side-effects. - -Note that a let is actually a degenerate case of local call -(\pxlref{let-calls}), and that let optimization can be done on calls -that weren't created by a let. Also, local call allows an applicative -style of iteration that is totally assignment free. 
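-
-The advice above about binding new variables instead of assigning to
-existing ones can be sketched as follows. The two functions are
-hypothetical illustrations (\code{clip-setq} and \code{clip-let} do not
-appear elsewhere in this manual) and compute the same result; in the
-second one every variable keeps its initial value, so each is a
-constant variable in the sense used above:
-\begin{lisp}
-;; Assignment style: X is set after it is bound, so it is not a
-;; constant variable.
-(defun clip-setq (x)
-  (setq x (abs x))
-  (setq x (min x 100))
-  x)
-
-;; Binding style: no variable is ever assigned.
-(defun clip-let (x)
-  (let* ((magnitude (abs x))
-         (clipped (min magnitude 100)))
-    clipped))
-\end{lisp}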
- -%%\node Constant Folding, Unused Expression Elimination, Let Optimization, Source Optimization -\subsection{Constant Folding} -\cindex{constant folding} -\cpsubindex{folding}{constant} - -Constant folding is an optimization that replaces a call with constant -arguments with the constant result of that call. Constant folding is -done on all standard functions for which it is legal. Inline -expansion allows folding of any constant parts of the definition, and -can be done even on functions that have side-effects. - -It is convenient to rely on constant folding when programming, as in this -example: -\begin{example} -(defconstant limit 42) - -(defun foo () - (... (1- limit) ...)) -\end{example} -Constant folding is also helpful when writing macros or inline -functions, since it usually eliminates the need to write a macro that -special-cases constant arguments. - -\cindex{constant-function declaration} Constant folding of a user -defined function is enabled by the \code{extensions:constant-function} -proclamation. In this example: -\begin{example} -(declaim (ext:constant-function myexp)) -(defun myexp (x y) - (declare (single-float x y)) - (exp (* (log x) y))) - - ... (myexp 3.0 1.3) ... -\end{example} -The call to \code{myexp} is constant-folded to \code{4.1711674}. - - -%%\node Unused Expression Elimination, Control Optimization, Constant Folding, Source Optimization -\subsection{Unused Expression Elimination} -\cindex{unused expression elimination} -\cindex{dead code elimination} - -If the value of any expression is not used, and the expression has no -side-effects, then it is deleted. As with constant folding, this -optimization applies most often when cleaning up after inline -expansion and other optimizations. Any function declared an -\code{extensions:constant-function} is also subject to unused -expression elimination. - -Note that \python{} will eliminate parts of unused expressions known -to be side-effect free, even if there are other unknown parts. 
For -example: -\begin{lisp} -(let ((a (list (foo) (bar)))) - (if t - (zow) - (raz a))) -\end{lisp} -becomes: -\begin{lisp} -(progn (foo) (bar)) -(zow) -\end{lisp} - - -%%\node Control Optimization, Unreachable Code Deletion, Unused Expression Elimination, Source Optimization -\subsection{Control Optimization} -\cindex{control optimization} -\cpsubindex{optimization}{control} - -The most important optimization of control is recognizing when an -\findexed{if} test is known at compile time, then deleting the -\code{if}, the test expression, and the unreachable branch of the -\code{if}. This can be considered a special case of constant folding, -although the test doesn't have to be truly constant as long as it is -definitely not \false. Note also, that type inference propagates the -result of an \code{if} test to the true and false branches, -\pxlref{constraint-propagation}. - -A related \code{if} optimization is this transformation:\footnote{Note - that the code for \code{x} and \code{y} isn't actually replicated.} -\begin{lisp} -(if (if a b c) x y) -\end{lisp} -into: -\begin{lisp} -(if a - (if b x y) - (if c x y)) -\end{lisp} -The opportunity for this sort of optimization usually results from a -conditional macro. For example: -\begin{lisp} -(if (not a) x y) -\end{lisp} -is actually implemented as this: -\begin{lisp} -(if (if a nil t) x y) -\end{lisp} -which is transformed to this: -\begin{lisp} -(if a - (if nil x y) - (if t x y)) -\end{lisp} -which is then optimized to this: -\begin{lisp} -(if a y x) -\end{lisp} -Note that due to \python{}'s internal representations, the -\code{if}\dash{}\code{if} situation will be recognized even if other -forms are wrapped around the inner \code{if}, like: -\begin{example} -(if (let ((g ...)) - (loop - ... 
- (return (not g)) - ...)) - x y) -\end{example} - -In \python, all the \clisp{} macros really are macros, written in -terms of \code{if}, \code{block} and \code{tagbody}, so user-defined -control macros can be just as efficient as the standard ones. -\python{} emits basic blocks using a heuristic that minimizes the -number of unconditional branches. The code in a \code{tagbody} will -not be emitted in the order it appeared in the source, so there is no -point in arranging the code to make control drop through to the -target. - -%%\node Unreachable Code Deletion, Multiple Values Optimization, Control Optimization, Source Optimization -\subsection{Unreachable Code Deletion} -\label{dead-code-notes} -\cindex{unreachable code deletion} -\cindex{dead code elimination} - -Python will delete code whenever it can prove that the code can never be -executed. Code becomes unreachable when: -\begin{itemize} - -\item -An \code{if} is optimized away, or - -\item -There is an explicit unconditional control transfer such as \code{go} or -\code{return-from}, or - -\item -The last reference to a local function is deleted (or there never was any -reference.) -\end{itemize} - - -When code that appeared in the original source is deleted, the compiler prints -a note to indicate a possible problem (or at least unnecessary code.) For -example: -\begin{lisp} -(defun foo () - (if t - (write-line "True.") - (write-line "False."))) -\end{lisp} -will result in this note: -\begin{example} -In: DEFUN FOO - (WRITE-LINE "False.") -Note: Deleting unreachable code. -\end{example} - -It is important to pay attention to unreachable code notes, since they often -indicate a subtle type error. For example: -\begin{example} -(defstruct foo a b) - -(defun lose (x) - (let ((a (foo-a x)) - (b (if x (foo-b x) :none))) - ...)) -\end{example} -results in this note: -\begin{example} -In: DEFUN LOSE - (IF X (FOO-B X) :NONE) -==> - :NONE -Note: Deleting unreachable code. 
-\end{example} -The \kwd{none} is unreachable, because type inference knows that the argument -to \code{foo-a} must be a \code{foo}, and thus can't be \false. Presumably the -programmer forgot that \code{x} could be \false{} when he wrote the binding for -\code{a}. - -Here is an example with an incorrect declaration: -\begin{lisp} -(defun count-a (string) - (do ((pos 0 (position #\back{a} string :start (1+ pos))) - (count 0 (1+ count))) - ((null pos) count) - (declare (fixnum pos)))) -\end{lisp} -This time our note is: -\begin{example} -In: DEFUN COUNT-A - (DO ((POS 0 #) (COUNT 0 #)) - ((NULL POS) COUNT) - (DECLARE (FIXNUM POS))) ---> BLOCK LET TAGBODY RETURN-FROM PROGN -==> - COUNT -Note: Deleting unreachable code. -\end{example} -The problem here is that \code{pos} can never be null since it is declared a -\code{fixnum}. - -It takes some experience with unreachable code notes to be able to -tell what they are trying to say. In non-obvious cases, the best -thing to do is to call the function in a way that should cause the -unreachable code to be executed. Either you will get a type error, or -you will find that there truly is no way for the code to be executed. - -Not all unreachable code results in a note: -\begin{itemize} - -\item A note is only given when the unreachable code textually appears - in the original source. This prevents spurious notes due to the - optimization of macros and inline functions, but sometimes also - foregoes a note that would have been useful. - -\item Since accurate source information is not available for non-list - forms, there is an element of heuristic in determining whether or - not to give a note about an atom. Spurious notes may be given when - a macro or inline function defines a variable that is also present - in the calling function. Notes about \false{} and \true{} are never - given, since it is too easy to confuse these constants in expanded - code with ones in the original source. 
- -\item Notes are only given about code unreachable due to control flow. - There is no note when an expression is deleted because its value is - unused, since this is a common consequence of other optimizations. -\end{itemize} - - -Somewhat spurious unreachable code notes can also result when a macro -inserts multiple copies of its arguments in different contexts, for -example: -\begin{lisp} -(defmacro t-and-f (var form) - `(if ,var ,form ,form)) - -(defun foo (x) - (t-and-f x (if x "True." "False."))) -\end{lisp} -results in these notes: -\begin{example} -In: DEFUN FOO - (IF X "True." "False.") -==> - "False." -Note: Deleting unreachable code. - -==> - "True." -Note: Deleting unreachable code. -\end{example} -It seems like it has deleted both branches of the \code{if}, but it has really -deleted one branch in one copy, and the other branch in the other copy. Note -that these messages are only spurious in not satisfying the intent of the rule -that notes are only given when the deleted code appears in the original source; -there is always \var{some} code being deleted when an unreachable code note is -printed. - - -%%\node Multiple Values Optimization, Source to Source Transformation, Unreachable Code Deletion, Source Optimization -\subsection{Multiple Values Optimization} -\cindex{multiple value optimization} -\cpsubindex{optimization}{multiple value} - -Within a function, \python{} implements uses of multiple values -particularly efficiently. Multiple values can be kept in arbitrary -registers, so using multiple values doesn't imply stack manipulation -and representation conversion. For example, this code: -\begin{example} -(let ((a (if x (foo x) u)) - (b (if x (bar x) v))) - ...) -\end{example} -is actually more efficient written this way: -\begin{example} -(multiple-value-bind - (a b) - (if x - (values (foo x) (bar x)) - (values u v)) - ...) 
-\end{example} - -Also, \pxlref{local-call-return} for information on how local call -provides efficient support for multiple function return values. - - -%%\node Source to Source Transformation, Style Recommendations, Multiple Values Optimization, Source Optimization -\subsection{Source to Source Transformation} -\cindex{source-to-source transformation} -\cpsubindex{transformation}{source-to-source} - -The compiler implements a number of operation-specific optimizations as -source-to-source transformations. You will often see unfamiliar code in error -messages, for example: -\begin{lisp} -(defun my-zerop () (zerop x)) -\end{lisp} -gives this warning: -\begin{example} -In: DEFUN MY-ZEROP - (ZEROP X) -==> - (= X 0) -Warning: Undefined variable: X -\end{example} -The original \code{zerop} has been transformed into a call to -\code{=}. This transformation is indicated with the same \code{==$>$} -used to mark macro and function inline expansion. Although it can be -confusing, display of the transformed source is important, since -warnings are given with respect to the transformed source. This is a -more obscure example: -\begin{lisp} -(defun foo (x) (logand 1 x)) -\end{lisp} -gives this efficiency note: -\begin{example} -In: DEFUN FOO - (LOGAND 1 X) -==> - (LOGAND C::Y C::X) -Note: Forced to do static-function Two-arg-and (cost 53). - Unable to do inline fixnum arithmetic (cost 1) because: - The first argument is a INTEGER, not a FIXNUM. - etc. -\end{example} -Here, the compiler commuted the call to \code{logand}, introducing -temporaries. The note complains that the \var{first} argument is not -a \code{fixnum}, when in the original call, it was the second -argument. To make things more confusing, the compiler introduced -temporaries called \code{c::x} and \code{c::y} that are bound to -\code{1} and \code{x}, respectively. - -You will also notice source-to-source optimizations when efficiency -notes are enabled (\pxlref{efficiency-notes}.) 
When the compiler is -unable to do a transformation that might be possible if there were more -information, then an efficiency note is printed. For example, -\code{my-zerop} above will also give this efficiency note: -\begin{example} -In: DEFUN MY-ZEROP - (ZEROP X) -==> - (= X 0) -Note: Unable to optimize because: - Operands might not be the same type, so can't open code. -\end{example} - -%%\node Style Recommendations, , Source to Source Transformation, Source Optimization -\subsection{Style Recommendations} -\cindex{style recommendations} - -Source level optimization makes possible a clearer and more relaxed programming -style: -\begin{itemize} - -\item Don't use macros purely to avoid a function call. If you want an - inline function, write it as a function and declare it inline. It's - clearer, less error-prone, and works just as well. - -\item Don't write macros that try to ``optimize'' their expansion in - trivial ways such as avoiding binding variables for simple - expressions. The compiler does these optimizations too, and is less - likely to make a mistake. - -\item Make use of local functions (i.e., \code{labels} or \code{flet}) - and tail-recursion in places where it is clearer. Local function - call is faster than full call. - -\item Avoid setting local variables when possible. Binding a new - \code{let} variable is at least as efficient as setting an existing - variable, and is easier to understand, both for the compiler and the - programmer. - -\item Instead of writing similar code over and over again so that it - can be hand customized for each use, define a macro or inline - function, and let the compiler do the work. -\end{itemize} - - -%% -%%\node Tail Recursion, Local Call, Source Optimization, Advanced Compiler Use and Efficiency Hints -\section{Tail Recursion} -\label{tail-recursion} -\cindex{tail recursion} -\cindex{recursion} - -A call is tail-recursive if nothing has to be done after the call -returns, i.e. 
when the call returns, the returned value is immediately -returned from the calling function. In this example, the recursive -call to \code{myfun} is tail-recursive: -\begin{lisp} -(defun myfun (x) - (if (oddp (random x)) - (isqrt x) - (myfun (1- x)))) -\end{lisp} - -Tail recursion is interesting because it is a form of recursion that can be -implemented much more efficiently than general recursion. In general, a -recursive call requires the compiler to allocate storage on the stack at -run-time for every call that has not yet returned. This memory consumption -makes recursion unacceptably inefficient for representing repetitive algorithms -having large or unbounded size. Tail recursion is the special case of -recursion that is semantically equivalent to the iteration constructs normally -used to represent repetition in programs. Because tail recursion is equivalent -to iteration, tail-recursive programs can be compiled as efficiently as -iterative programs. - -So why would you want to write a program recursively when you can write it -using a loop? Well, the main answer is that recursion is a more general -mechanism, so it can express some solutions simply that are awkward to write as -a loop. Some programmers also feel that recursion is a stylistically -preferable way to write loops because it avoids assigning variables. -For example, instead of writing: -\begin{lisp} -(defun fun1 (x) - something-that-uses-x) - -(defun fun2 (y) - something-that-uses-y) - -(do ((x something (fun2 (fun1 x)))) - (nil)) -\end{lisp} -You can write: -\begin{lisp} -(defun fun1 (x) - (fun2 something-that-uses-x)) - -(defun fun2 (y) - (fun1 something-that-uses-y)) - -(fun1 something) -\end{lisp} -The tail-recursive definition is actually more efficient, in addition to being -(arguably) clearer. As the number of functions and the complexity of their -call graph increase, the simplicity of using recursion becomes compelling. 
-Consider the advantages of writing a large finite-state machine with separate -tail-recursive functions instead of using a single huge \code{prog}. - -It helps to understand how to use tail recursion if you think of a -tail-recursive call as a \code{psetq} that assigns the argument values to the -called function's variables, followed by a \code{go} to the start of the called -function. This makes clear an inherent efficiency advantage of tail-recursive -call: in addition to not having to allocate a stack frame, there is no need to -prepare for the call to return (e.g., by computing a return PC.) - -Is there any disadvantage to tail recursion? Other than an increase -in efficiency, the only way you can tell that a call has been compiled -tail-recursively is if you use the debugger. Since a tail-recursive -call has no stack frame, there is no way the debugger can print out -the stack frame representing the call. The effect is that backtrace -will not show some calls that would have been displayed in a -non-tail-recursive implementation. In practice, this is not as bad as -it sounds\dash{}in fact it isn't really clearly worse, just different. -\xlref{debug-tail-recursion} for information about the debugger -implications of tail recursion. - -In order to ensure that tail-recursion is preserved in arbitrarily -complex calling patterns across separately compiled functions, the -compiler must compile any call in a tail-recursive position as a -tail-recursive call. This is done regardless of whether the program -actually exhibits any sort of recursive calling pattern. In this -example, the call to \code{fun2} will always be compiled as a -tail-recursive call: -\begin{lisp} -(defun fun1 (x) - (fun2 x)) -\end{lisp} -So tail recursion doesn't necessarily have anything to do with recursion -as it is normally thought of. \xlref{local-tail-recursion} for more -discussion of using tail recursion to implement loops. 
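-
-As a small sketch of the finite-state machine style mentioned above,
-here are two hypothetical functions (the names \code{even-state} and
-\code{odd-state} are invented for this illustration) that decide
-whether a list of bits contains an even number of ones. Each state is
-a function, and each transition is a call in tail position:
-\begin{lisp}
-;; State: an even number of ones has been seen so far.
-(defun even-state (bits)
-  (cond ((null bits) t)
-        ((= (car bits) 1) (odd-state (cdr bits)))
-        (t (even-state (cdr bits)))))
-
-;; State: an odd number of ones has been seen so far.
-(defun odd-state (bits)
-  (cond ((null bits) nil)
-        ((= (car bits) 1) (even-state (cdr bits)))
-        (t (odd-state (cdr bits)))))
-
-;; Two ones in the input, so the machine ends in EVEN-STATE.
-(even-state '(1 0 1))
-\end{lisp}
-Since every transition is a tail call, the stack should not grow with
-the length of the input when the calls are compiled tail-recursively
-as described above.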
- -\begin{comment} -* Tail Recursion Exceptions:: -\end{comment} - -%%\node Tail Recursion Exceptions, , Tail Recursion, Tail Recursion -\subsection{Tail Recursion Exceptions} - -Although \python{} is claimed to be ``properly'' tail-recursive, some -might dispute this, since there are situations where tail recursion is -inhibited: -\begin{itemize} - -\item When the call is enclosed by a special binding, or - -\item When the call is enclosed by a \code{catch} or - \code{unwind-protect}, or - -\item When the call is enclosed by a \code{block} or \code{tagbody} - and the block name or \code{go} tag has been closed over. -\end{itemize} -These dynamic extent binding forms inhibit tail recursion because they -allocate stack space to represent the binding. Shallow-binding -implementations of dynamic scoping also require cleanup code to be -evaluated when the scope is exited. - -%% -%%\node Local Call, Block Compilation, Tail Recursion, Advanced Compiler Use and Efficiency Hints -\section{Local Call} -\label{local-call} -\cindex{local call} -\cpsubindex{call}{local} -\cpsubindex{function call}{local} - -Python supports two kinds of function call: full call and local call. -Full call is the standard calling convention; its late binding and -generality make \llisp{} what it is, but create unavoidable overheads. -When the compiler can compile the calling function and the called -function simultaneously, it can use local call to avoid some of the -overhead of full call. Local call is really a collection of -compilation strategies. If some aspect of call overhead is not needed -in a particular local call, then it can be omitted. In some cases, -local call can be totally free. Local call provides two main -advantages to the user: -\begin{itemize} - -\item Local call makes the use of the lexical function binding forms - \findexed{flet} and \findexed{labels} much more efficient. A local - call is always faster than a full call, and in many cases is much - faster. 
- -\item Local call is a natural approach to \i{block compilation}, a - compilation technique that resolves function references at compile - time. Block compilation speeds function call, but increases - compilation times and prevents function redefinition. -\end{itemize} - - -\begin{comment} -* Self-Recursive Calls:: -* Let Calls:: -* Closures:: -* Local Tail Recursion:: -* Return Values:: -\end{comment} - -%%\node Self-Recursive Calls, Let Calls, Local Call, Local Call -\subsection{Self-Recursive Calls} -\cpsubindex{recursion}{self} - -Local call is used when a function defined by \code{defun} calls itself. For -example: -\begin{lisp} -(defun fact (n) - (if (zerop n) - 1 - (* n (fact (1- n))))) -\end{lisp} -This use of local call speeds recursion, but can also complicate -debugging, since \findexed{trace} will only show the first call to -\code{fact}, and not the recursive calls. This is because the -recursive calls directly jump to the start of the function, and don't -indirect through the \code{symbol-function}. Self-recursive local -call is inhibited when the \kwd{block-compile} argument to -\code{compile-file} is \false{} (\pxlref{compile-file-block}.) - -%%\node Let Calls, Closures, Self-Recursive Calls, Local Call -\subsection{Let Calls} -\label{let-calls} -Because local call avoids unnecessary call overheads, the compiler -internally uses local call to implement some macros and special forms -that are not normally thought of as involving a function call. For -example, this \code{let}: -\begin{example} -(let ((a (foo)) - (b (bar))) - ...) -\end{example} -is internally represented as though it was macroexpanded into: -\begin{example} -(funcall #'(lambda (a b) - ...) - (foo) - (bar)) -\end{example} -This implementation is acceptable because the simple cases of local -call (equivalent to a \code{let}) result in good code. 
This doesn't -make \code{let} any more efficient, but does make local calls that are -semantically the same as \code{let} much more efficient than full -calls. For example, these definitions are all the same as far as the -compiler is concerned: -\begin{example} -(defun foo () - ...some other stuff... - (let ((a something)) - ...some stuff...)) - -(defun foo () - (flet ((localfun (a) - ...some stuff...)) - ...some other stuff... - (localfun something))) - -(defun foo () - (let ((funvar #'(lambda (a) - ...some stuff...))) - ...some other stuff... - (funcall funvar something))) -\end{example} - -Although local call is most efficient when the function is called only -once, a call doesn't have to be equivalent to a \code{let} to be more -efficient than full call. All local calls avoid the overhead of -argument count checking and keyword argument parsing, and there are a -number of other advantages that apply in many common situations. -\xlref{let-optimization} for a discussion of the optimizations done on -let calls. - -%%\node Closures, Local Tail Recursion, Let Calls, Local Call -\subsection{Closures} -\cindex{closures} - -Local call allows for much more efficient use of closures, since the -closure environment doesn't need to be allocated on the heap, or even -stored in memory at all. In this example, there is no penalty for -\code{localfun} referencing \code{a} and \code{b}: -\begin{lisp} -(defun foo (a b x) - (flet ((localfun (x) - (1+ (* a b x)))) - (if (= a b) - (localfun (- x)) - (localfun x)))) -\end{lisp} -In local call, the compiler effectively passes closed-over values as -extra arguments, so there is no need for you to ``optimize'' local -function use by explicitly passing in lexically visible values. -Closures may also be subject to let optimization -(\pxlref{let-optimization}.) 
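-
-To illustrate the last point, here is a hypothetical pair of
-definitions (the names are invented for this sketch). According to
-the description above, the first, which simply closes over
-\code{low} and \code{high}, should need no hand ``optimization''
-into the second, which passes them explicitly:
-\begin{lisp}
-;; CLAMP refers to LOW and HIGH directly from the enclosing function.
-(defun clamp-both (a b low high)
-  (flet ((clamp (x)
-           (max low (min high x))))
-    (list (clamp a) (clamp b))))
-
-;; The same computation with the closed-over values passed by hand.
-(defun clamp-both-explicit (a b low high)
-  (flet ((clamp (x low high)
-           (max low (min high x))))
-    (list (clamp a low high) (clamp b low high))))
-\end{lisp}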
- -Note: indirect value cells are currently always allocated on the heap -when a variable is both assigned to (with \code{setq} or \code{setf}) -and closed over, regardless of whether the closure is a local function -or not. This is another reason to avoid setting variables when you -don't have to. - -%%\node Local Tail Recursion, Return Values, Closures, Local Call -\subsection{Local Tail Recursion} -\label{local-tail-recursion} -\cindex{tail recursion} -\cpsubindex{recursion}{tail} - -Tail-recursive local calls are particularly efficient, since they are -in effect an assignment plus a control transfer. Scheme programmers -write loops with tail-recursive local calls, instead of using the -imperative \code{go} and \code{setq}. This has not caught on in the -\clisp{} community, since conventional \llisp{} compilers don't -implement local call. In \python, users can choose to write loops -such as: -\begin{lisp} -(defun ! (n) - (labels ((loop (n total) - (if (zerop n) - total - (loop (1- n) (* n total))))) - (loop n 1))) -\end{lisp} - -\begin{defmac}{extensions:}{iterate}{% - \args{\var{name} (\mstar{(\var{var} \var{initial-value})}) - \mstar{\var{declaration}} \mstar{\var{form}}}} - - This macro provides syntactic sugar for using \findexed{labels} to - do iteration. It creates a local function \var{name} with the - specified \var{var}s as its arguments and the \var{declaration}s and - \var{form}s as its body. This function is then called with the - \var{initial-values}, and the result of the call is returned from the - macro. - - Here is our factorial example rewritten using \code{iterate}: - - \begin{lisp} - (defun ! (n) - (iterate loop - ((n n) - (total 1)) - (if (zerop n) - total - (loop (1- n) (* n total))))) - \end{lisp} - - The main advantage of using \code{iterate} over \code{do} is that - \code{iterate} naturally allows stepping to be done differently - depending on conditionals in the body of the loop. 
\code{iterate} - can also be used to implement algorithms that aren't really - iterative by simply doing a non-tail call. For example, the - standard recursive definition of factorial can be written like this: -\begin{lisp} -(iterate fact - ((n n)) - (if (zerop n) - 1 - (* n (fact (1- n))))) -\end{lisp} -\end{defmac} - -%%\node Return Values, , Local Tail Recursion, Local Call -\subsection{Return Values} -\label{local-call-return} -\cpsubindex{return values}{local call} -\cpsubindex{local call}{return values} - -One of the more subtle costs of full call comes from allowing -arbitrary numbers of return values. This overhead can be avoided in -local calls to functions that always return the same number of values. -For efficiency reasons (as well as stylistic ones), you should write -functions so that they always return the same number of values. This -may require passing extra \false{} arguments to \code{values} in some -cases, but the result is more efficient, not less so. - -When efficiency notes are enabled (\pxlref{efficiency-notes}), and the -compiler wants to use known values return, but can't prove that the -function always returns the same number of values, then it will print -a note like this: -\begin{example} -In: DEFUN GRUE - (DEFUN GRUE (X) (DECLARE (FIXNUM X)) (COND (# #) (# NIL) (T #))) -Note: Return type not fixed values, so can't use known return convention: - (VALUES (OR (INTEGER -536870912 -1) NULL) &REST T) -\end{example} - -In order to implement proper tail recursion in the presence of known -values return (\pxlref{tail-recursion}), the compiler sometimes must -prove that multiple functions all return the same number of values. 
-When this can't be proven, the compiler will print a note like this: -\begin{example} -In: DEFUN BLUE - (DEFUN BLUE (X) (DECLARE (FIXNUM X)) (COND (# #) (# #) (# #) (T #))) -Note: Return value count mismatch prevents known return from - these functions: - BLUE - SNOO -\end{example} -\xlref{number-local-call} for the interaction between local call -and the representation of numeric types. - -%% -%%\node Block Compilation, Inline Expansion, Local Call, Advanced Compiler Use and Efficiency Hints -\section{Block Compilation} -\label{block-compilation} -\cindex{block compilation} -\cpsubindex{compilation}{block} - -Block compilation allows calls to global functions defined by -\findexed{defun} to be compiled as local calls. The function call -can be in a different top-level form than the \code{defun}, or even in a -different file. - -In addition, block compilation allows the declaration of the \i{entry points} -to the block compiled portion. An entry point is any function that may be -called from outside of the block compilation. If a function is not an entry -point, then it can be compiled more efficiently, since all calls are known at -compile time. In particular, if a function is only called in one place, then -it will be let converted. This effectively inline expands the function, but -without the code duplication that results from defining the function normally -and then declaring it inline. - -The main advantage of block compilation is that it preserves efficiency in -programs even when (for readability and syntactic convenience) they are broken -up into many small functions. There is absolutely no overhead for calling a -non-entry point function that is defined purely for modularity (i.e. called -only in one place.) - -Block compilation also allows the use of non-descriptor arguments and return -values in non-trivial programs (\pxlref{number-local-call}). 
- -\begin{comment} -* Block Compilation Semantics:: -* Block Compilation Declarations:: -* Compiler Arguments:: -* Practical Difficulties:: -* Context Declarations:: -* Context Declaration Example:: -\end{comment} - -%%\node Block Compilation Semantics, Block Compilation Declarations, Block Compilation, Block Compilation -\subsection{Block Compilation Semantics} - -The effect of block compilation can be envisioned as the compiler turning all -the \code{defun}s in the block compilation into a single \code{labels} form: -\begin{example} -(declaim (start-block fun1 fun3)) - -(defun fun1 () - ...) - -(defun fun2 () - ... - (fun1) - ...) - -(defun fun3 (x) - (if x - (fun1) - (fun2))) - -(declaim (end-block)) -\end{example} -becomes: -\begin{example} -(labels ((fun1 () - ...) - (fun2 () - ... - (fun1) - ...) - (fun3 (x) - (if x - (fun1) - (fun2)))) - (setf (fdefinition 'fun1) #'fun1) - (setf (fdefinition 'fun3) #'fun3)) -\end{example} -Calls between the block compiled functions are local calls, so changing the -global definition of \code{fun1} will have no effect on what \code{fun2} does; -\code{fun2} will keep calling the old \code{fun1}. - -The entry points \code{fun1} and \code{fun3} are still installed in -the \code{symbol-function} as the global definitions of the functions, -so a full call to an entry point works just as before. However, -\code{fun2} is not an entry point, so it is not globally defined. In -addition, \code{fun2} is only called in one place, so it will be let -converted. - - -%%\node Block Compilation Declarations, Compiler Arguments, Block Compilation Semantics, Block Compilation -\subsection{Block Compilation Declarations} -\cpsubindex{declarations}{block compilation} -\cindex{start-block declaration} -\cindex{end-block declaration} - -The \code{extensions:start-block} and \code{extensions:end-block} -declarations allow fine-grained control of block compilation. 
These -declarations are only legal as global declarations (\code{declaim} -or \code{proclaim}). - -\noindent -\vspace{1 em} -The \code{start-block} declaration has this syntax: -\begin{example} -(start-block \mstar{\var{entry-point-name}}) -\end{example} -When processed by the compiler, this declaration marks the start of -block compilation, and specifies the entry points to that block. If -no entry points are specified, then \var{all} functions are made into -entry points. If already block compiling, then the compiler ends the -current block and starts a new one. - -\noindent -\vspace{1 em} -The \code{end-block} declaration has no arguments: -\begin{lisp} -(end-block) -\end{lisp} -The \code{end-block} declaration ends a block compilation unit without -starting a new one. This is useful mainly when only a portion of a file -is worth block compiling. - -%%\node Compiler Arguments, Practical Difficulties, Block Compilation Declarations, Block Compilation -\subsection{Compiler Arguments} -\label{compile-file-block} -\cpsubindex{compile-file}{block compilation arguments} - -The \kwd{block-compile} and \kwd{entry-points} arguments to -\code{extensions:compile-from-stream} and \funref{compile-file} provide overall -control of block compilation, and allow block compilation without requiring -modification of the program source. - -There are three possible values of the \kwd{block-compile} argument: -\begin{Lentry} - -\item[\false{}] Do no compile-time resolution of global function - names, not even for self-recursive calls. This inhibits any - \code{start-block} declarations appearing in the file, allowing all - functions to be incrementally redefined. - -\item[\true{}] Start compiling in block compilation mode. This is - mainly useful for block compiling small files that contain no - \code{start-block} declarations. See also the \kwd{entry-points} - argument. 
- -\item[\kwd{specified}] Start compiling in form-at-a-time mode, but - exploit \code{start-block} declarations and compile self-recursive - calls as local calls. Normally \kwd{specified} is the default for - this argument (see \varref{block-compile-default}.) -\end{Lentry} - -The \kwd{entry-points} argument can be used in conjunction with -\w{\kwd{block-compile} \true{}} to specify the entry-points to a -block-compiled file. If not specified or \nil, all global functions -will be compiled as entry points. When \kwd{block-compile} is not -\true, this argument is ignored. - -\begin{defvar}{}{block-compile-default} - - This variable determines the default value for the - \kwd{block-compile} argument to \code{compile-file} and - \code{compile-from-stream}. The initial value of this variable is - \kwd{specified}, but \false{} is sometimes useful for totally - inhibiting block compilation. -\end{defvar} - -%%\node Practical Difficulties, Context Declarations, Compiler Arguments, Block Compilation -\subsection{Practical Difficulties} - -The main problem with block compilation is that the compiler uses -large amounts of memory when it is block compiling. This places an -upper limit on the amount of code that can be block compiled as a -unit. To make best use of block compilation, it is necessary to -locate the parts of the program containing many internal calls, and -then add the appropriate \code{start-block} declarations. When writing -new code, it is a good idea to put in block compilation declarations -from the very beginning, since writing block declarations correctly -requires accurate knowledge of the program's function call structure. -If you want to initially develop code with full incremental -redefinition, you can compile with \varref{block-compile-default} set to -\false. - -Note if a \code{defun} appears in a non-null lexical environment, then -calls to it cannot be block compiled. 
- -Unless files are very small, it is probably impractical to block compile -multiple files as a unit by specifying a list of files to \code{compile-file}. -Semi-inline expansion (\pxlref{semi-inline}) provides another way to -extend block compilation across file boundaries. -%% -%%\node Context Declarations, Context Declaration Example, Practical Difficulties, Block Compilation -\subsection{Context Declarations} -\label{context-declarations} -\cindex{context sensitive declarations} -\cpsubindex{declarations}{context-sensitive} - -\cmucl{} has a context-sensitive declaration mechanism which is useful -because it allows flexible control of the compilation policy in large -systems without requiring changes to the source files. The primary -use of this feature is to allow the exported interfaces of a system to -be compiled more safely than the system internals. The context used -is the name being defined and the kind of definition (function, macro, -etc.) - -The \kwd{context-declarations} option to \macref{with-compilation-unit} has -dynamic scope, affecting all compilation done during the evaluation of the -body. The argument to this option should evaluate to a list of lists of the -form: -\begin{example} -(\var{context-spec} \mplus{\var{declare-form}}) -\end{example} -In the indicated context, the specified declare forms are inserted at -the head of each definition. The declare forms for all contexts that -match are appended together, with earlier declarations getting -precedence over later ones. A simple example: -\begin{example} - :context-declarations - '((:external (declare (optimize (safety 2))))) -\end{example} -This will cause all functions that are named by external symbols to be -compiled with \code{safety 2}. - -The full syntax of context specs is: -\begin{Lentry} - -\item[\kwd{internal}, \kwd{external}] True if the symbol is internal - (external) in its home package. - -\item[\kwd{uninterned}] True if the symbol has no home package. 
- -\item[\code{\w{(:package \mstar{\var{package-name}})}}] True if the - symbol's home package is in any of the named packages (false if - uninterned.) - -\item[\kwd{anonymous}] True if the function doesn't have any - interesting name (not \code{defmacro}, \code{defun}, \code{labels} - or \code{flet}). - -\item[\kwd{macro}, \kwd{function}] \kwd{macro} is a global - (\code{defmacro}) macro. \kwd{function} is anything else. - -\item[\kwd{local}, \kwd{global}] \kwd{local} is a \code{labels} or - \code{flet}. \kwd{global} is anything else. - -\item[\code{\w{(:or \mstar{\var{context-spec}})}}] True when any - supplied \var{context-spec} is true. - -\item[\code{\w{(:and \mstar{\var{context-spec}})}}] True only when all - supplied \var{context-spec}s are true. - -\item[\code{\w{(:not \mstar{\var{context-spec}})}}] True when - \var{context-spec} is false. - -\item[\code{\w{(:member \mstar{\var{name}})}}] True when the defined - name is one of these names (\code{equal} test.) - -\item[\code{\w{(:match \mstar{\var{pattern}})}}] True when any of the - patterns is a substring of the name. The name is wrapped with - \code{\$}'s, so ``\code{\$FOO}'' matches names beginning with - ``\code{FOO}'', etc. -\end{Lentry} - -%%\node Context Declaration Example, , Context Declarations, Block Compilation -\subsection{Context Declaration Example} - -Here is a more complex example of \code{with-compilation-unit} options: -\begin{example} -:optimize '(optimize (speed 2) (space 2) (inhibit-warnings 2) - (debug 1) (safety 0)) -:optimize-interface '(optimize-interface (safety 1) (debug 1)) -:context-declarations -'(((:or :external (:and (:match "\%") (:match "SET"))) - (declare (optimize-interface (safety 2)))) - ((:or (:and :external :macro) - (:match "\$PARSE-")) - (declare (optimize (safety 2))))) -\end{example} -The \code{optimize} and \code{extensions:optimize-interface} -declarations (\pxlref{optimize-declaration}) set up the global -compilation policy. 
The bodies of functions are to be compiled -completely unsafe (\code{safety 0}), but argument count and weakened -argument type checking is to be done when a function is called -(\code{speed 2 safety 1}). - -The first declaration specifies that all functions that are external -or whose names contain both ``\code{\%}'' and ``\code{SET}'' are to be -compiled with completely safe interfaces (\code{safety 2}). -The reason for this particular \kwd{match} rule is that \code{setf} -inverse functions in this system tend to have both strings in their -name somewhere. We want \code{setf} inverses to be safe because they -are implicitly called by users even though their name is not exported. - -The second declaration makes external macros or functions whose names -start with ``\code{PARSE-}'' have safe bodies (as well as interfaces). -This is desirable because a syntax error in a macro may cause a type -error inside the body. The \kwd{match} rule is used because macros -often have auxiliary functions whose names begin with this string. - -This particular example is used to build part of the standard \cmucl{} -system. Note however, that context declarations must be set up -according to the needs and coding conventions of a particular system; -different parts of \cmucl{} are compiled with different context -declarations, and your system will probably need its own declarations. -In particular, any use of the \kwd{match} option depends on naming -conventions used in coding. - -%% -%%\node Inline Expansion, Byte Coded Compilation, Block Compilation, Advanced Compiler Use and Efficiency Hints -\section{Inline Expansion} -\label{inline-expansion} -\cindex{inline expansion} -\cpsubindex{expansion}{inline} -\cpsubindex{call}{inline} -\cpsubindex{function call}{inline} -\cpsubindex{optimization}{function call} - -Python can expand almost any function inline, including functions -with keyword arguments. 
The only restrictions are that keyword -argument keywords in the call must be constant, and that global -function definitions (\code{defun}) must be done in a null lexical -environment (not nested in a \code{let} or other binding form.) Local -functions (\code{flet}) can be inline expanded in any environment. -Combined with \python{}'s source-level optimization, inline expansion -can be used for things that formerly required macros for efficient -implementation. In \python, macros don't have any efficiency -advantage, so they need only be used where a macro's syntactic -flexibility is required. - -Inline expansion is a compiler optimization technique that reduces -the overhead of a function call by simply not doing the call: -instead, the compiler effectively rewrites the program to appear as -though the definition of the called function was inserted at each -call site. In \llisp, this is straightforwardly expressed by -inserting the \code{lambda} corresponding to the original definition: -\begin{lisp} -(proclaim '(inline my-1+)) -(defun my-1+ (x) (+ x 1)) - -(my-1+ someval) \result{} ((lambda (x) (+ x 1)) someval) -\end{lisp} - -When the function expanded inline is large, the program after inline -expansion may be substantially larger than the original program. If -the program becomes too large, inline expansion hurts speed rather -than helping it, since hardware resources such as physical memory and -cache will be exhausted. Inline expansion is called for: -\begin{itemize} - -\item When profiling has shown that a relatively simple function is - called so often that a large amount of time is being wasted in the - calling of that function (as opposed to running in that function.) - If a function is complex, it will take a long time to run relative - to the time spent in call, so the speed advantage of inline expansion - is diminished at the same time the space cost of inline expansion is - increased. 
Of course, if a function is rarely called, then the - overhead of calling it is also insignificant. - -\item With functions so simple that they take less space to inline - expand than would be taken to call the function (such as - \code{my-1+} above.) It would require intimate knowledge of the - compiler to be certain when inline expansion would reduce space, but - it is generally safe to inline expand functions whose definition is - a single function call, or a few calls to simple \clisp{} functions. -\end{itemize} - - -In addition to this speed/space tradeoff from inline expansion's -avoidance of the call, inline expansion can also reveal opportunities -for optimization. \python{}'s extensive source-level optimization can -make use of context information from the caller to tremendously -simplify the code resulting from the inline expansion of a function. - -The main form of caller context is local information about the actual -argument values: what the argument types are and whether the arguments -are constant. Knowledge about argument types can eliminate run-time -type tests (e.g., for generic arithmetic.) Constant arguments in a -call provide opportunities for constant folding optimization after -inline expansion. - -A hidden way that constant arguments are often supplied to functions -is through the defaulting of unsupplied optional or keyword arguments. 
-There can be a huge efficiency advantage to inline expanding functions -that have complex keyword-based interfaces, such as this definition of -the \code{member} function: -\begin{lisp} -(proclaim '(inline member)) -(defun member (item list &key - (key #'identity) - (test #'eql testp) - (test-not nil notp)) - (do ((list list (cdr list))) - ((null list) nil) - (let ((car (car list))) - (if (cond (testp - (funcall test item (funcall key car))) - (notp - (not (funcall test-not item (funcall key car)))) - (t - (funcall test item (funcall key car)))) - (return list))))) - -\end{lisp} -After inline expansion, this call is simplified to the obvious code: -\begin{lisp} -(member a l :key #'foo-a :test #'char=) \result{} - -(do ((list list (cdr list))) - ((null list) nil) - (let ((car (car list))) - (if (char= item (foo-a car)) - (return list)))) -\end{lisp} -In this example, there could easily be more than an order of magnitude -improvement in speed. In addition to eliminating the original call to -\code{member}, inline expansion also allows the calls to \code{char=} -and \code{foo-a} to be open-coded. We go from a loop with three tests -and two calls to a loop with one test and no calls. - -\xlref{source-optimization} for more discussion of source level -optimization. - -\begin{comment} -* Inline Expansion Recording:: -* Semi-Inline Expansion:: -* The Maybe-Inline Declaration:: -\end{comment} - -%%\node Inline Expansion Recording, Semi-Inline Expansion, Inline Expansion, Inline Expansion -\subsection{Inline Expansion Recording} -\cindex{recording of inline expansions} - -Inline expansion requires the source for the inline expanded function to -be available when calls to the function are compiled. The compiler doesn't -remember the inline expansion for every function, since that would take an -excessive amount of space. Instead, the programmer must tell the compiler to -record the inline expansion before the definition of the inline expanded -function is compiled. 
This is done by globally declaring the function inline -before the function is defined, by using the \code{inline} and -\code{extensions:maybe-inline} (\pxlref{maybe-inline-declaration}) -declarations. - -In addition to recording the inline expansion of inline functions at the time -the function is compiled, \code{compile-file} also puts the inline expansion in -the output file. When the output file is loaded, the inline expansion is made -available for subsequent compilations; there is no need to compile the -definition again to record the inline expansion. - -If a function is declared inline, but no expansion is recorded, then the -compiler will give an efficiency note like: -\begin{example} -Note: MYFUN is declared inline, but has no expansion. -\end{example} -When you get this note, check that the \code{inline} declaration and the -definition appear before the calls that are to be inline expanded. This note -will also be given if the inline expansion for a \code{defun} could not be -recorded because the \code{defun} was in a non-null lexical environment. - -%%\node Semi-Inline Expansion, The Maybe-Inline Declaration, Inline Expansion Recording, Inline Expansion -\subsection{Semi-Inline Expansion} -\label{semi-inline} - -Python supports \var{semi-inline} functions. Semi-inline expansion -shares a single copy of a function across all the calls in a component -by converting the inline expansion into a local function -(\pxlref{local-call}.) This takes up less space when there are -multiple calls, but also provides less opportunity for context -dependent optimization. When there is only one call, the result is -identical to normal inline expansion. Semi-inline expansion is done -when the \code{space} optimization quality is \code{0}, and the -function has been declared \code{extensions:maybe-inline}. - -This mechanism of inline expansion combined with local call also -allows recursive functions to be inline expanded. 
If a recursive -function is declared \code{inline}, calls will actually be compiled -semi-inline. Although recursive functions are often so complex that -there is little advantage to semi-inline expansion, it can still be -useful in the same sort of cases where normal inline expansion is -especially advantageous, i.e. functions where the calling context can -help a lot. - -%%\node The Maybe-Inline Declaration, , Semi-Inline Expansion, Inline Expansion -\subsection{The Maybe-Inline Declaration} -\label{maybe-inline-declaration} -\cindex{maybe-inline declaration} - -The \code{extensions:maybe-inline} declaration is a \cmucl{} -extension. It is similar to \code{inline}, but indicates that inline -expansion may sometimes be desirable, rather than saying that inline -expansion should almost always be done. When used in a global -declaration, \code{extensions:maybe-inline} causes the expansion for -the named functions to be recorded, but the functions aren't actually -inline expanded unless \code{space} is \code{0} or the function is -eventually (perhaps locally) declared \code{inline}. - -Use of the \code{extensions:maybe-inline} declaration followed by the -\code{defun} is preferable to the standard idiom of: -\begin{lisp} -(proclaim '(inline myfun)) -(defun myfun () ...) -(proclaim '(notinline myfun)) - -;;; \i{Any calls to \code{myfun} here are not inline expanded.} - -(defun somefun () - (declare (inline myfun)) - ;; - ;; \i{Calls to \code{myfun} here are inline expanded.} - ...) -\end{lisp} -The problem with using \code{notinline} in this way is that in -\clisp{} it does more than just suppress inline expansion, it also -forbids the compiler to use any knowledge of \code{myfun} until a -later \code{inline} declaration overrides the \code{notinline}. This -prevents compiler warnings about incorrect calls to the function, and -also prevents block compilation. 
- -The \code{extensions:maybe-inline} declaration is used like this: -\begin{lisp} -(proclaim '(extensions:maybe-inline myfun)) -(defun myfun () ...) - -;;; \i{Any calls to \code{myfun} here are not inline expanded.} - -(defun somefun () - (declare (inline myfun)) - ;; - ;; \i{Calls to \code{myfun} here are inline expanded.} - ...) - -(defun someotherfun () - (declare (optimize (space 0))) - ;; - ;; \i{Calls to \code{myfun} here are expanded semi-inline.} - ...) -\end{lisp} -In this example, the use of \code{extensions:maybe-inline} causes the -expansion to be recorded when the \code{defun} for \code{myfun} is -compiled, and doesn't waste space through doing inline expansion by -default. Unlike \code{notinline}, this declaration still allows the -compiler to assume that the known definition really is the one that -will be called when giving compiler warnings, and also allows the -compiler to do semi-inline expansion when the policy is appropriate. - -When the goal is merely to control whether inline expansion is done by -default, it is preferable to use \code{extensions:maybe-inline} rather -than \code{notinline}. The \code{notinline} declaration should be -reserved for those special occasions when a function may be redefined -at run-time, so the compiler must be told that the obvious definition -of a function is not necessarily the one that will be in effect at the -time of the call. - -%% -%%\node Byte Coded Compilation, Object Representation, Inline Expansion, Advanced Compiler Use and Efficiency Hints -\section{Byte Coded Compilation} -\label{byte-compile} -\cindex{byte coded compilation} -\cindex{space optimization} - -\Python{} supports byte compilation to reduce the size of Lisp -programs by allowing functions to be compiled more compactly. Byte -compilation provides an extreme speed/space tradeoff: byte code is -typically six times more compact than native code, but runs fifty -times (or more) slower. 
This is about ten times faster than the -standard interpreter, which is itself considered fast in comparison to -other \clisp{} interpreters. - -Large Lisp systems (such as \cmucl{} itself) often have large amounts -of user-interface code, compile-time (macro) code, debugging code, or -rarely executed special-case code. This code is a good target for -byte compilation: very little time is spent running in it, but it can -take up quite a bit of space. Straight-line code with many function -calls is much more suitable than inner loops. - -When byte-compiling, the compiler compiles about twice as fast, and -can produce a hardware independent object file (\file{.bytef} type.) -This file can be loaded like a normal fasl file on any implementation -of CMU CL with the same byte-ordering (DEC PMAX has \file{.lbytef} -type.) - -The decision to byte compile or native compile can be done on a -per-file or per-code-object basis. The \kwd{byte-compile} argument to -\funref{compile-file} has these possible values: -\begin{Lentry} -\item[\false{}] Don't byte compile anything in this file. - -\item[\true{}] Byte compile everything in this file and produce a - processor-independent \file{.bytef} file. - -\item[\kwd{maybe}] Produce a normal fasl file, but byte compile any - functions for which the \code{speed} optimization quality is - \code{0} and the \code{debug} quality is not greater than \code{1}. -\end{Lentry} - -\begin{defvar}{extensions:}{byte-compile-top-level} - - If this variable is true (the default) and the \kwd{byte-compile} - argument to \code{compile-file} is \kwd{maybe}, then byte compile - top-level code (code outside of any \code{defun}, \code{defmethod}, - etc.) -\end{defvar} - -\begin{defvar}{extensions:}{byte-compile-default} - - This variable determines the default value for the - \kwd{byte-compile} argument to \code{compile-file}, initially - \kwd{maybe}. 
-\end{defvar} - -%% -%%\node Object Representation, Numbers, Byte Coded Compilation, Advanced Compiler Use and Efficiency Hints -\section{Object Representation} -\label{object-representation} -\cindex{object representation} -\cpsubindex{representation}{object} -\cpsubindex{efficiency}{of objects} - -A somewhat subtle aspect of writing efficient \clisp{} programs is -choosing the correct data structures so that the underlying objects -can be implemented efficiently. This is partly because of the need -for multiple representations for a given value -(\pxlref{non-descriptor}), but is also due to the sheer number of -object types that \clisp{} has built in. The number of possible -representations complicates the choice of a good representation -because semantically similar objects may vary in their efficiency -depending on how the program operates on them. - -\begin{comment} -* Think Before You Use a List:: -* Structure Representation:: -* Arrays:: -* Vectors:: -* Bit-Vectors:: -* Hashtables:: -\end{comment} - -%%\node Think Before You Use a List, Structure Representation, Object Representation, Object Representation -\subsection{Think Before You Use a List} -\cpsubindex{lists}{efficiency of} - -Although Lisp's creator seemed to think that it was for LISt Processing, the -astute observer may have noticed that the chapter on list manipulation makes up -less than three percent of \i{Common Lisp: the Language II}. The language has -grown since Lisp 1.5\dash{}new data types supersede lists for many purposes. - -%%\node Structure Representation, Arrays, Think Before You Use a List, Object Representation -\subsection{Structure Representation} -\cpsubindex{structure types}{efficiency of} One of the best ways of -building complex data structures is to define appropriate structure -types using \findexed{defstruct}. In \python, access of structure -slots is always at least as fast as list or vector access, and is -usually faster. 
In comparison to a list representation of a tuple, -structures also have a space advantage. - -Even if structures weren't more efficient than other representations, structure -use would still be attractive because programs that use structures in -appropriate ways are much more maintainable and robust than programs written -using only lists. For example: -\begin{lisp} -(rplaca (caddr (cadddr x)) (caddr y)) -\end{lisp} -could have been written using structures in this way: -\begin{lisp} -(setf (beverage-flavor (astronaut-beverage x)) (beverage-flavor y)) -\end{lisp} -The second version is more maintainable because it is easier to -understand what it is doing. It is more robust because structure -accesses are type checked. An \code{astronaut} will never be confused -with a \code{beverage}, and the result of \code{beverage-flavor} is -always a flavor. See sections \ref{structure-types} and -\ref{freeze-type} for more information about structure types. -\xlref{type-inference} for a number of examples that make clear the -advantages of structure typing. - -Note that the structure definition should be compiled before any uses -of its accessors or type predicate so that these function calls can be -efficiently open-coded. - -%%\node Arrays, Vectors, Structure Representation, Object Representation -\subsection{Arrays} -\label{array-types} -\cpsubindex{arrays}{efficiency of} - -Arrays are often the most efficient representation for collections of objects -because: -\begin{itemize} - -\item Array representations are often the most compact. An array is - always more compact than a list containing the same number of - elements. - -\item Arrays allow fast constant-time access. - -\item Arrays are easily destructively modified, which can reduce - consing. - -\item Array element types can be specialized, which reduces both - overall size and consing (\pxlref{specialized-array-types}.) 
-\end{itemize} - - -Access of arrays that are not of type \code{simple-array} is less -efficient, so declarations are appropriate when an array is of a -simple type like \code{simple-string} or \code{simple-bit-vector}. -Arrays are almost always simple, but the compiler may not be able to -prove simpleness at every use. The only way to get a non-simple array -is to use the \kwd{displaced-to}, \kwd{fill-pointer} or -\kwd{adjustable} arguments to \code{make-array}. If you don't use -these hairy options, then arrays can always be declared to be simple. - -Because of the many specialized array types and the possibility of -non-simple arrays, array access is much like generic arithmetic -(\pxlref{generic-arithmetic}). In order for array accesses to be -efficiently compiled, the element type and simpleness of the array -must be known at compile time. If there is inadequate information, -the compiler is forced to call a generic array access routine. You -can detect inefficient array accesses by enabling efficiency notes, -\pxlref{efficiency-notes}. - -%%\node Vectors, Bit-Vectors, Arrays, Object Representation -\subsection{Vectors} -\cpsubindex{vectors}{efficiency of} - -Vectors (one dimensional arrays) are particularly useful, since in -addition to their obvious array-like applications, they are also well -suited to representing sequences. In comparison to a list -representation, vectors are faster to access and take up between two -and sixty-four times less space (depending on the element type.) As -with arbitrary arrays, the compiler needs to know that vectors are not -complex, so you should use \code{simple-string} in preference to -\code{string}, etc. - -The only advantage that lists have over vectors for representing -sequences is that it is easy to change the length of a list, add to it -and remove items from it. Likely signs of archaic, slow lisp code are -\code{nth} and \code{nthcdr}. If you are using these functions you -should probably be using a vector. 
- -%%\node Bit-Vectors, Hashtables, Vectors, Object Representation -\subsection{Bit-Vectors} -\cpsubindex{bit-vectors}{efficiency of} - -Another thing that lists have been used for is set manipulation. In -applications where there is a known, reasonably small universe of -items bit-vectors can be used to improve performance. This is much -less convenient than using lists, because instead of symbols, each -element in the universe must be assigned a numeric index into the bit -vector. Using a bit-vector will nearly always be faster, and can be -tremendously faster if the number of elements in the set is not small. -The logical operations on \code{simple-bit-vector}s are efficient, -since they operate on a word at a time. - - -%%\node Hashtables, , Bit-Vectors, Object Representation -\subsection{Hashtables} -\cpsubindex{hash-tables}{efficiency of} - -Hashtables are an efficient and general mechanism for maintaining associations -such as the association between an object and its name. Although hashtables -are usually the best way to maintain associations, efficiency and style -considerations sometimes favor the use of an association list (a-list). - -\code{assoc} is fairly fast when the \var{test} argument is \code{eq} -or \code{eql} and there are only a few elements, but the time goes up -in proportion with the number of elements. In contrast, the -hash-table lookup has a somewhat higher overhead, but the speed is -largely unaffected by the number of entries in the table. For an -\code{equal} hash-table or alist, hash-tables have an even greater -advantage, since the test is more expensive. Whatever you do, be sure -to use the most restrictive test function possible. - -The style argument observes that although hash-tables and alists -overlap in function, they do not do all things equally well. -\begin{itemize} - -\item Alists are good for maintaining scoped environments. 
They were - originally invented to implement scoping in the Lisp interpreter, - and are still used for this in \python. With an alist one can - non-destructively change an association simply by consing a new - element on the front. This is something that cannot be done with - hash-tables. - -\item Hashtables are good for maintaining a global association. The - value associated with an entry can easily be changed with - \code{setf}. With an alist, one has to go through contortions, - either \code{rplacd}'ing the cons if the entry exists, or pushing a - new one if it doesn't. The side-effecting nature of hash-table - operations is an advantage here. -\end{itemize} - - -Historically, symbol property lists were often used for global name -associations. Property lists provide an awkward and error-prone -combination of name association and record structure. If you must use -the property list, please store all the related values in a single -structure under a single property, rather than using many properties. -This makes access more efficient, and also adds a modicum of typing -and abstraction. \xlref{advanced-type-stuff} for information on types -in \cmucl. - -%% -%%\node Numbers, General Efficiency Hints, Object Representation, Advanced Compiler Use and Efficiency Hints -\section{Numbers} -\label{numeric-types} -\cpsubindex{numeric}{types} -\cpsubindex{types}{numeric} - -Numbers are interesting because numbers are one of the few \llisp{} data types -that have direct support in conventional hardware. If a number can be -represented in the way that the hardware expects it, then there is a big -efficiency advantage. - -Using hardware representations is problematical in \llisp{} due to -dynamic typing (where the type of a value may be unknown at compile -time.) 
It is possible to compile code for statically typed portions -of a \llisp{} program with efficiency comparable to that obtained in -statically typed languages such as C, but not all \llisp{} -implementations succeed. There are two main barriers to efficient -numerical code in \llisp{}: -\begin{itemize} - -\item The compiler must prove that the numerical expression is in fact - statically typed, and - -\item The compiler must be able to somehow reconcile the conflicting - demands of the hardware mandated number representation with the - \llisp{} requirements of dynamic typing and garbage-collecting - dynamic storage allocation. -\end{itemize} - -Because of its type inference (\pxlref{type-inference}) and efficiency -notes (\pxlref{efficiency-notes}), \python{} is better than -conventional \llisp{} compilers at ensuring that numerical expressions -are statically typed. Python also goes somewhat farther than existing -compilers in the area of allowing native machine number -representations in the presence of garbage collection. - -\begin{comment} -* Descriptors:: -* Non-Descriptor Representations:: -* Variables:: -* Generic Arithmetic:: -* Fixnums:: -* Word Integers:: -* Floating Point Efficiency:: -* Specialized Arrays:: -* Specialized Structure Slots:: -* Interactions With Local Call:: -* Representation of Characters:: -\end{comment} - -%%\node Descriptors, Non-Descriptor Representations, Numbers, Numbers -\subsection{Descriptors} -\cpsubindex{descriptors}{object} -\cindex{object representation} -\cpsubindex{representation}{object} -\cpsubindex{consing}{overhead of} - -\llisp{}'s dynamic typing requires that it be possible to represent -any value with a fixed length object, known as a \var{descriptor}. -This fixed-length requirement is implicit in features such as: -\begin{itemize} - -\item Data types (like \code{simple-vector}) that can contain any type - of object, and that can be destructively modified to contain - different objects (of possibly different types.) 
- -\item Functions that can be called with any type of argument, and that - can be redefined at run time. -\end{itemize} - -In order to save space, a descriptor is invariably represented as a -single word. Objects that can be directly represented in the -descriptor itself are said to be \var{immediate}. Descriptors for -objects larger than one word are in reality pointers to the memory -actually containing the object. - -Representing objects using pointers has two major disadvantages: -\begin{itemize} - -\item The memory pointed to must be allocated on the heap, so it must - eventually be freed by the garbage collector. Excessive heap - allocation of objects (or ``consing'') is inefficient in several - ways. \xlref{consing}. - -\item Representing an object in memory requires the compiler to emit - additional instructions to read the actual value in from memory, and - then to write the value back after operating on it. -\end{itemize} - -The introduction of garbage collection makes things even worse, since -the garbage collector must be able to determine whether a descriptor -is an immediate object or a pointer. This requires that a few bits in -each descriptor be dedicated to the garbage collector. The loss of a -few bits doesn't seem like much, but it has a major efficiency -implication\dash{}objects whose natural machine representation is a -full word (integers and single-floats) cannot have an immediate -representation. So the compiler is forced to use an unnatural -immediate representation (such as \code{fixnum}) or a natural pointer -representation (with the attendant consing overhead.) - - -%%\node Non-Descriptor Representations, Variables, Descriptors, Numbers -\subsection{Non-Descriptor Representations} -\label{non-descriptor} -\cindex{non-descriptor representations} -\cindex{stack numbers} - -From the discussion above, we can see that the standard descriptor -representation has many problems, the worst being number consing. 
-\llisp{} compilers try to avoid these descriptor efficiency problems by using -\var{non-descriptor} representations. A compiler that uses non-descriptor -representations can compile this function so that it does no number consing: -\begin{lisp} -(defun multby (vec n) - (declare (type (simple-array single-float (*)) vec) - (single-float n)) - (dotimes (i (length vec)) - (setf (aref vec i) - (* n (aref vec i))))) -\end{lisp} -If a descriptor representation were used, each iteration of the loop might -cons two floats and do three times as many memory references. - -As its negative definition suggests, the range of possible non-descriptor -representations is large. The performance improvement from non-descriptor -representation depends upon both the number of types that have non-descriptor -representations and the number of contexts in which the compiler is forced to -use a descriptor representation. - -Many \llisp{} compilers support non-descriptor representations for -float types such as \code{single-float} and \code{double-float} -(section \ref{float-efficiency}.) \python{} adds support for full -word integers (\pxlref{word-integers}), characters -(\pxlref{characters}) and system-area pointers (unconstrained -pointers, \pxlref{system-area-pointers}.) Many \llisp{} compilers -support non-descriptor representations for variables (section -\ref{ND-variables}) and array elements (section -\ref{specialized-array-types}.) \python{} adds support for -non-descriptor arguments and return values in local call -(\pxlref{number-local-call}) and structure slots (\pxlref{raw-slots}). 
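-
-As a usage sketch (the vector contents and the factor are made up for
-illustration), \code{multby} might be called on a specialized float
-vector like this:
-\begin{lisp}
-;; The definition of multby is repeated so that the example stands alone.
-(defun multby (vec n)
-  (declare (type (simple-array single-float (*)) vec)
-           (single-float n))
-  (dotimes (i (length vec))
-    (setf (aref vec i)
-          (* n (aref vec i)))))
-
-;; The :element-type argument is what makes the array specialized.
-(let ((v (make-array 4 :element-type 'single-float
-                       :initial-element 1.5f0)))
-  (multby v 2.0f0)
-  v)
-\end{lisp}
-Whether the loop is really compiled without number consing depends on
-the compiler; the declarations only make that outcome possible.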
- -%%\node Variables, Generic Arithmetic, Non-Descriptor Representations, Numbers -\subsection{Variables} -\label{ND-variables} -\cpsubindex{variables}{non-descriptor} -\cpsubindex{type declarations}{variable} -\cpsubindex{efficiency}{of numeric variables} - -In order to use a non-descriptor representation for a variable or -expression intermediate value, the compiler must be able to prove that -the value is always of a particular type having a non-descriptor -representation. Type inference (\pxlref{type-inference}) often needs -some help from user-supplied declarations. The best kind of type -declaration is a variable type declaration placed at the binding -point: -\begin{lisp} -(let ((x (car l))) - (declare (single-float x)) - ...) -\end{lisp} -Use of \code{the}, or of variable declarations not at the binding form -is insufficient to allow non-descriptor representation of the -variable\dash{}with these declarations it is not certain that all -values of the variable are of the right type. It is sometimes useful -to introduce a gratuitous binding that allows the compiler to change -to a non-descriptor representation, like: -\begin{lisp} -(etypecase x - ((signed-byte 32) - (let ((x x)) - (declare (type (signed-byte 32) x)) - ...)) - ...) -\end{lisp} -The declaration on the inner \code{x} is necessary here due to a phase -ordering problem. Although the compiler will eventually prove that -the outer \code{x} is a \w{\code{(signed-byte 32)}} within that -\code{etypecase} branch, the inner \code{x} would have been optimized -away by that time. Declaring the type makes let optimization more -cautious. - -Note that storing a value into a global (or \code{special}) variable -always forces a descriptor representation. Wherever possible, you -should operate only on local variables, binding any referenced globals -to local variables at the beginning of the function, and doing any -global assignments at the end. 
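-
-The advice about globals can be sketched as follows.  The names
-\code{*total*} and \code{accumulate} are hypothetical, chosen only for
-this illustration:
-\begin{lisp}
-;; Hypothetical global accumulator, for illustration only.
-(defvar *total* 0.0f0)
-(declaim (type single-float *total*))
-
-(defun accumulate (vec)
-  (declare (type (simple-array single-float (*)) vec))
-  (let ((total *total*))            ; read the global once, at the start
-    (declare (single-float total))  ; declaration at the binding point
-    (dotimes (i (length vec))
-      (setq total (+ total (aref vec i))))
-    (setq *total* total)))          ; one global assignment, at the end
-\end{lisp}
-The loop touches only the local \code{total}, so the compiler is free
-to keep it in a non-descriptor representation; a descriptor is needed
-only for the single store back into \code{*total*}.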
- -Efficiency notes signal use of inefficient representations, so -programmers needn't continuously worry about the details of -representation selection (\pxlref{representation-eff-note}.) - -%%\node Generic Arithmetic, Fixnums, Variables, Numbers -\subsection{Generic Arithmetic} -\label{generic-arithmetic} -\cindex{generic arithmetic} -\cpsubindex{arithmetic}{generic} -\cpsubindex{numeric}{operation efficiency} - -In \clisp, arithmetic operations are \var{generic}.\footnote{As Steele - notes in CLTL II, this is a generic conception of generic, and is - not to be confused with the CLOS concept of a generic function.} -The \code{+} function can be passed \code{fixnum}s, \code{bignum}s, -\code{ratio}s, and various kinds of \code{float}s and -\code{complex}es, in any combination. In addition to the inherent -complexity of \code{bignum} and \code{ratio} operations, there is also -a lot of overhead in just figuring out which operation to do and what -contagion and canonicalization rules apply. The complexity of generic -arithmetic is so great that it is inconceivable to open code it. -Instead, the compiler does a function call to a generic arithmetic -routine, consuming many instructions before the actual computation -even starts. - -This is ridiculous, since even \llisp{} programs do a lot of -arithmetic, and the hardware is capable of doing operations on small -integers and floats with a single instruction. To get acceptable -efficiency, the compiler special-cases uses of generic arithmetic that -are directly implemented in the hardware. In order to open code -arithmetic, several constraints must be met: -\begin{itemize} - -\item All the arguments must be known to be a good type of number. - -\item The result must be known to be a good type of number. - -\item Any intermediate values such as the result of \w{\code{(+ a b)}} - in the call \w{\code{(+ a b c)}} must be known to be a good type of - number.
- -\item All the above numbers with good types must be of the \var{same} - good type. Don't try to mix integers and floats or different float - formats. -\end{itemize} - -The ``good types'' are \w{\code{(signed-byte 32)}}, -\w{\code{(unsigned-byte 32)}}, \code{single-float} and -\code{double-float}. See sections \ref{fixnums}, \ref{word-integers} -and \ref{float-efficiency} for more discussion of good numeric types. - -\code{float} is not a good type, since it might mean either -\code{single-float} or \code{double-float}. \code{integer} is not a -good type, since it might mean \code{bignum}. \code{rational} is not -a good type, since it might mean \code{ratio}. Note however that -these types are still useful in declarations, since type inference may -be able to strengthen a weak declaration into a good one, when it -would be at a loss if there was no declaration at all -(\pxlref{type-inference}). The \code{integer} and -\code{unsigned-byte} (or non-negative integer) types are especially -useful in this regard, since they can often be strengthened to a good -integer type. - -Arithmetic with \code{complex} numbers is inefficient in comparison to -float and integer arithmetic. Complex numbers are always represented -with a pointer descriptor (causing consing overhead), and complex -arithmetic is always closed coded using the general generic arithmetic -functions. But arithmetic with complex types such as: -\begin{lisp} -(complex float) -(complex fixnum) -\end{lisp} -is still faster than \code{bignum} or \code{ratio} arithmetic, since the -implementation is much simpler. - -Note: don't use \code{/} to divide integers unless you want the -overhead of rational arithmetic. Use \code{truncate} even when you -know that the arguments divide evenly. - -You don't need to remember all the rules for how to get open-coded -arithmetic, since efficiency notes will tell you when and where there -is a problem\dash{}\pxlref{efficiency-notes}. 
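-
-A minimal sketch of these rules, assuming a hypothetical function
-\code{midpoint} whose arguments are declared narrowly enough that the
-sum and the quotient are provably small integers:
-\begin{lisp}
-;; Hypothetical example; the name and the argument types are assumptions.
-(defun midpoint (lo hi)
-  (declare (type (unsigned-byte 16) lo hi))
-  ;; (+ lo hi) is at most 131070, so the intermediate value and the
-  ;; result are both subtypes of (unsigned-byte 32).
-  (values (truncate (+ lo hi) 2)))
-
-(midpoint 10 21)   ; => 15, truncate discards the remainder
-(/ 31 2)           ; => 31/2, a ratio produced by rational arithmetic
-\end{lisp}
-\code{truncate} returns the quotient and the remainder; wrapping it in
-\code{values} keeps only the quotient.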
- - -%%\node Fixnums, Word Integers, Generic Arithmetic, Numbers -\subsection{Fixnums} -\label{fixnums} -\cindex{fixnums} -\cindex{bignums} - -A fixnum is a ``FIXed precision NUMber''. In modern \llisp{} -implementations, fixnums can be represented with an immediate -descriptor, so operating on fixnums requires no consing or memory -references. Clever choice of representations also allows some -arithmetic operations to be done on fixnums using hardware supported -word-integer instructions, somewhat reducing the speed penalty for -using an unnatural integer representation. - -It is useful to distinguish the \code{fixnum} type from the fixnum -representation of integers. In \python, there is absolutely nothing -magical about the \code{fixnum} type in comparison to other finite -integer types. \code{fixnum} is equivalent to (is defined with -\code{deftype} to be) \w{\code{(signed-byte 30)}}. \code{fixnum} is -simply the largest subset of integers that \i{can be represented} -using an immediate fixnum descriptor. - -Unlike in other \clisp{} compilers, it is in no way desirable to use -the \code{fixnum} type in declarations in preference to more -restrictive integer types such as \code{bit}, \w{\code{(integer -43 - 7)}} and \w{\code{(unsigned-byte 8)}}. Since Python does -understand these integer types, it is preferable to use the more -restrictive type, as it allows better type inference -(\pxlref{operation-type-inference}.) - -The small, efficient fixnum is contrasted with bignum, or ``BIG -NUMber''. This is another descriptor representation for integers, but -this time a pointer representation that allows for arbitrarily large -integers. Bignum operations are less efficient than fixnum -operations, both because of the consing and memory reference overheads -of a pointer descriptor, and also because of the inherent complexity -of extended precision arithmetic. 
While fixnum operations can often -be done with a single instruction, bignum operations are so complex -that they are always done using generic arithmetic. - -A crucial point is that the compiler will use generic arithmetic if it -can't \var{prove} that all the arguments, intermediate values, and -results are fixnums. With bounded integer types such as -\code{fixnum}, the result type proves to be especially problematical, -since these types are not closed under common arithmetic operations -such as \code{+}, \code{-}, \code{*} and \code{/}. For example, -\w{\code{(1+ (the fixnum x))}} does not necessarily evaluate to a -\code{fixnum}. Bignums were added to \llisp{} to get around this -problem, but they really just transform the correctness problem ``if -this add overflows, you will get the wrong answer'' to the efficiency -problem ``if this add \var{might} overflow then your program will run -slowly (because of generic arithmetic.)'' - -There is just no getting around the fact that the hardware only -directly supports short integers. To get the most efficient open -coding, the compiler must be able to prove that the result is a good -integer type. This is an argument in favor of using more restrictive -integer types: \w{\code{(1+ (the fixnum x))}} may not always be a -\code{fixnum}, but \w{\code{(1+ (the (unsigned-byte 8) x))}} always -is. Of course, you can also assert the result type by putting in lots -of \code{the} declarations and then compiling with \code{safety} -\code{0}. - -%%\node Word Integers, Floating Point Efficiency, Fixnums, Numbers -\subsection{Word Integers} -\label{word-integers} -\cindex{word integers} - -Python is unique in its efficient implementation of arithmetic -on full-word integers through non-descriptor representations and open coding. -Arithmetic on any subtype of these types: -\begin{lisp} -(signed-byte 32) -(unsigned-byte 32) -\end{lisp} -is reasonably efficient, although subtypes of \code{fixnum} remain -somewhat more efficient. 
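-
-As an illustration (a sketch with a hypothetical function name, not a
-statement about what any particular compiler version emits), the
-following arguments are declared so that every intermediate value is a
-subtype of \w{\code{(unsigned-byte 32)}}, even though the sum can
-exceed the range of a 30-bit \code{fixnum}:
-\begin{lisp}
-;; Hypothetical example: each argument is below 2^15.
-(defun mean-of-products (a b c d)
-  (declare (type (unsigned-byte 15) a b c d))
-  ;; Each product is at most 32767 * 32767 = 1073676289, below 2^30.
-  ;; The sum of two products is therefore below 2^31: a word integer,
-  ;; but possibly too large for a (signed-byte 30) fixnum.
-  (values (truncate (+ (* a b) (* c d)) 2)))
-
-(mean-of-products 2 3 4 5)   ; => 13
-\end{lisp}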
- -If a word integer must be represented as a descriptor, then the -\code{bignum} representation is used, with its associated consing -overhead. The support for word integers in no way changes the -language semantics; it just makes arithmetic on small bignums vastly -more efficient. It is fine to do arithmetic operations with mixed -\code{fixnum} and word integer operands; just declare the most -specific integer type you can, and let the compiler decide what -representation to use. - -In fact, to most users, the greatest advantage of word integer -arithmetic is that it effectively provides a few guard bits on the -fixnum representation. If there are missing assertions on -intermediate values in a fixnum expression, the intermediate results -can usually be proved to fit in a word. After the whole expression is -evaluated, there will often be a fixnum assertion on the final result, -allowing creation of a fixnum result without even checking for -overflow. - -The remarks in section \ref{fixnums} about fixnum result type also -apply to word integers; you must be careful to give the compiler -enough information to prove that the result is still a word integer. -This time, though, when we blow out of word integers we land in -generic bignum arithmetic, which is much worse than sleazing from -\code{fixnum}s to word integers. Note that mixing -\w{\code{(unsigned-byte 32)}} arguments with arguments of any signed -type (such as \code{fixnum}) is a no-no, since the result might not be -unsigned. - -%%\node Floating Point Efficiency, Specialized Arrays, Word Integers, Numbers -\subsection{Floating Point Efficiency} -\label{float-efficiency} -\cindex{floating point efficiency} - -Arithmetic on objects of type \code{single-float} and \code{double-float} is -efficiently implemented using non-descriptor representations and open coding. -As for integer arithmetic, the arguments must be known to be of the same float -type.
Unlike for integer arithmetic, the results and intermediate values -usually take care of themselves due to the rules of float contagion, i.e. -\w{\code{(1+ (the single-float x))}} is always a \code{single-float}. - -Although they are not specially implemented, \code{short-float} and -\code{long-float} are also acceptable in declarations, since they are -synonyms for the \code{single-float} and \code{double-float} types, -respectively. - -\begin{changebar} - Some versions of CMU Common Lisp include extra support for floating - point arithmetic. In particular, if \code{*features*} includes - \kwd{propagate-float-type}, list-style float type specifiers such as - \w{\code{(single-float 0.0 1.0)}} will be used to good effect. - - For example, in this function, - \begin{example} - (defun square (x) - (declare (type (single-float 0f0 10f0) x)) - (* x x)) - \end{example} - \Python{} can deduce that the - return type of the function \code{square} is \w{\code{(single-float - 0f0 100f0)}}. - - Many union types are also supported so that - \begin{example} - (+ (the (or (integer 1 1) (integer 5 5)) x) - (the (or (integer 10 10) (integer 20 20)) y)) - \end{example} - has the inferred type \code{(or (integer 11 11) (integer 15 15) - (integer 21 21) (integer 25 25))}. This also works for - floating-point numbers. Member types, however, are not supported, because in - general the member elements do not have to be numbers. Thus, - instead of \code{(member 1 4)}, you should write \code{(or (integer - 1 1) (integer 4 4))}. - - In addition, if \kwd{propagate-fun-type} is in \code{*features*}, - \Python{} knows how to infer types for many mathematical functions - including square root, exponential and logarithmic functions, - trigonometric functions and their inverses, and hyperbolic functions - and their inverses. For numeric code, this can greatly enhance - efficiency by allowing the compiler to use specialized versions of - the functions instead of the generic versions.
The greatest benefit - of this type inference is determining that the result of the - function is a real-valued number instead of possibly being - a complex-valued number. - - For example, consider the function - \begin{example} - (defun fun (x) - (declare (type (single-float 0f0 100f0) x)) - (values (sqrt x) (log x 10f0))) - \end{example} - With this declaration, the compiler can determine that the argument - to \code{sqrt} and \code{log} is always non-negative so that the result - is always a \code{single-float}. In fact, the return type for this - function is derived to be \code{(values (single-float 0f0 10f0) - (single-float * 2f0))}. - - If the declaration were reduced to just \w{\code{(declare - (single-float x))}}, the argument to \code{sqrt} and \code{log} - could be negative. This forces the use of the generic versions of - these functions because the result could be a complex number. - - Union types are not yet supported for functions. - - We note, however, that proper interval arithmetic is not fully - implemented in the compiler so the inferred types may be slightly in - error due to round-off errors. This round-off error could - accumulate to cause the compiler to erroneously deduce the result - type and cause code to be removed as being - unreachable.\footnote{This, however, has not actually happened, but - it is a possibility.}% - Thus, the declarations should only be precise enough for the - compiler to deduce that a real-valued argument to a function would - produce a real-valued result. The efficiency notes - (\pxlref{representation-eff-note}) from the compiler will guide you - on what declarations might be useful. -\end{changebar} - -When a float must be represented as a descriptor, a pointer representation is -used, creating consing overhead. For this reason, you should try to avoid -situations (such as full call and non-specialized data structures) that force a -descriptor representation.
See sections \ref{specialized-array-types}, -\ref{raw-slots} and \ref{number-local-call}. - -\xlref{ieee-float} for information on the extensions to support IEEE -floating point. - -%%\node Specialized Arrays, Specialized Structure Slots, Floating Point Efficiency, Numbers -\subsection{Specialized Arrays} -\label{specialized-array-types} -\cindex{specialized array types} -\cpsubindex{array types}{specialized} -\cpsubindex{types}{specialized array} - -\clisp{} supports specialized array element types through the -\kwd{element-type} argument to \code{make-array}. When an array has a -specialized element type, only elements of that type can be stored in -the array. From this restriction comes two major efficiency -advantages: -\begin{itemize} - -\item A specialized array can save space by packing multiple elements - into a single word. For example, a \code{base-char} array can have - 4 elements per word, and a \code{bit} array can have 32. This - space-efficient representation is possible because it is not - necessary to separately indicate the type of each element. - -\item The elements in a specialized array can be given the same - non-descriptor representation as the one used in registers and on - the stack, eliminating the need for representation conversions when - reading and writing array elements. For objects with pointer - descriptor representations (such as floats and word integers) there - is also a substantial consing reduction because it is not necessary - to allocate a new object every time an array element is modified. 
-\end{itemize} - - -These are the specialized element types currently supported: -\begin{lisp} -bit -(unsigned-byte 2) -(unsigned-byte 4) -(unsigned-byte 8) -(unsigned-byte 16) -(unsigned-byte 32) -base-character -single-float -double-float -\end{lisp} -\begin{changebar} -%% New stuff -Some versions of \cmucl{}\footnote{Currently, this includes the X86 - and Sparc versions which are compiled with the \kwd{signed-array} - feature.} also support the following specialized element types: -\begin{lisp} -(signed-byte 8) -(signed-byte 16) -(signed-byte 30) -(signed-byte 32) -\end{lisp} -\end{changebar} -Although a \code{simple-vector} can hold any type of object, \true{} -should still be considered a specialized array type, since arrays with -element type \true{} are specialized to hold descriptors. - - - -When using non-descriptor representations, it is particularly -important to make sure that array accesses are open-coded, since in -addition to the generic operation overhead, efficiency is lost when -the array element is converted to a descriptor so that it can be -passed to (or from) the generic access routine. You can detect -inefficient array accesses by enabling efficiency notes, -\pxlref{efficiency-notes}. \xlref{array-types}. - -%%\node Specialized Structure Slots, Interactions With Local Call, Specialized Arrays, Numbers -\subsection{Specialized Structure Slots} -\label{raw-slots} -\cpsubindex{structure types}{numeric slots} -\cindex{specialized structure slots} - -Structure slots declared by the \kwd{type} \code{defstruct} slot option -to have certain known numeric types are also given non-descriptor -representations. These types (and subtypes of these types) are supported: -\begin{lisp} -(unsigned-byte 32) -single-float -double-float -\end{lisp} - -The primary advantage of specialized slot representations is a large -reduction in the spurious memory allocation and access overhead of programs -that intensively use these types.
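-
-For instance, a structure with numeric slots might be declared as
-below.  The structure \code{particle} and its slot names are
-hypothetical, used only to show the \kwd{type} slot option:
-\begin{lisp}
-;; Hypothetical structure; each slot uses one of the types listed above.
-(defstruct particle
-  (mass   1.0f0 :type single-float)
-  (charge 0.0d0 :type double-float)
-  (id     0     :type (unsigned-byte 32)))
-
-(let ((p (make-particle :mass 2.0f0)))
-  ;; Reading and writing the slot goes through the generated accessor.
-  (setf (particle-mass p) (* 2.0f0 (particle-mass p)))
-  (particle-mass p))               ; => 4.0
-\end{lisp}
-The slot initial values must themselves be of the declared types,
-which is why \code{charge} is initialized with a double-float literal.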
- -%%\node Interactions With Local Call, Representation of Characters, Specialized Structure Slots, Numbers -\subsection{Interactions With Local Call} -\label{number-local-call} -\cpsubindex{local call}{numeric operands} -\cpsubindex{call}{numeric operands} -\cindex{numbers in local call} - -Local call has many advantages (\pxlref{local-call}); one relevant to -our discussion here is that local call extends the usefulness of -non-descriptor representations. If the compiler knows from the -argument type that an argument has a non-descriptor representation, -then the argument will be passed in that representation. The easiest -way to ensure that the argument type is known at compile time is to -always declare the argument type in the called function, like: -\begin{lisp} -(defun 2+f (x) - (declare (single-float x)) - (+ x 2.0)) -\end{lisp} -The advantages of passing arguments and return values in a non-descriptor -representation are the same as for non-descriptor representations in general: -reduced consing and memory access (\pxlref{non-descriptor}.) This -extends the applicative programming styles discussed in section -\ref{local-call} to numeric code. Also, if source files are kept reasonably -small, block compilation can be used to reduce number consing to a minimum. - -Note that non-descriptor return values can only be used with the known return -convention (section \ref{local-call-return}.) If the compiler can't prove that -a function always returns the same number of values, then it must use the -unknown values return convention, which requires a descriptor representation. -Pay attention to the known return efficiency notes to avoid number consing. - -%%\node Representation of Characters, , Interactions With Local Call, Numbers -\subsection{Representation of Characters} -\label{characters} -\cindex{characters} -\cindex{strings} - -Python also uses a non-descriptor representation for characters when -convenient. 
This improves the efficiency of string manipulation, but is -otherwise pretty invisible; characters have an immediate descriptor -representation, so there is not a great penalty for converting a character to a -descriptor. Nonetheless, it may sometimes be helpful to declare -character-valued variables as \code{base-character}. - -%% -%%\node General Efficiency Hints, Efficiency Notes, Numbers, Advanced Compiler Use and Efficiency Hints -\section{General Efficiency Hints} -\label{general-efficiency} -\cpsubindex{efficiency}{general hints} - -This section is a summary of various implementation costs and ways to get -around them. These hints are relatively unrelated to the use of the \python{} -compiler, and probably also apply to most other \llisp{} implementations. In -each section, there are references to related in-depth discussion. - -\begin{comment} -* Compile Your Code:: -* Avoid Unnecessary Consing:: -* Complex Argument Syntax:: -* Mapping and Iteration:: -* Trace Files and Disassembly:: -\end{comment} - -%%\node Compile Your Code, Avoid Unnecessary Consing, General Efficiency Hints, General Efficiency Hints -\subsection{Compile Your Code} -\cpsubindex{compilation}{why to} - -At this point, the advantages of compiling code relative to running it -interpreted probably need not be emphasized too much, but remember that -in \cmucl, compiled code typically runs hundreds of times faster than -interpreted code. Also, compiled (\code{fasl}) files load significantly faster -than source files, so it is worthwhile compiling files which are loaded many -times, even if the speed of the functions in the file is unimportant. - -Even disregarding the efficiency advantages, compiled code is as good or better -than interpreted code. Compiled code can be debugged at the source level (see -chapter \ref{debugger}), and compiled code does more error checking. 
For these -reasons, the interpreter should be regarded mainly as an interactive command -interpreter, rather than as a programming language implementation. - -\b{Do not} be concerned about the performance of your program until you -see its speed compiled. Some techniques that make compiled code run -faster make interpreted code run slower. - -%%\node Avoid Unnecessary Consing, Complex Argument Syntax, Compile Your Code, General Efficiency Hints -\subsection{Avoid Unnecessary Consing} -\label{consing} -\cindex{consing} -\cindex{garbage collection} -\cindex{memory allocation} -\cpsubindex{efficiency}{of memory use} - - -Consing is another name for allocation of storage, as done by the -\code{cons} function (hence its name.) \code{cons} is by no means the -only function which conses\dash{}so does \code{make-array} and many -other functions. Arithmetic and function call can also have hidden -consing overheads. Consing hurts performance in the following ways: -\begin{itemize} - -\item Consing reduces memory access locality, increasing paging - activity. - -\item Consing takes time just like anything else. - -\item Any space allocated eventually needs to be reclaimed, either by - garbage collection or by starting a new \code{lisp} process. -\end{itemize} - - -Consing is not undiluted evil, since programs do things other than -consing, and appropriate consing can speed up the real work. It would -certainly save time to allocate a vector of intermediate results that -are reused hundreds of times. Also, if it is necessary to copy a -large data structure many times, it may be more efficient to update -the data structure non-destructively; this somewhat increases update -overhead, but makes copying trivial. - -Note that the remarks in section \ref{efficiency-overview} about the -importance of separating tuning from coding also apply to consing -overhead. The majority of consing will be done by a small portion of -the program. 
The consing hot spots are even less predictable than the -CPU hot spots, so don't waste time and create bugs by doing -unnecessary consing optimization. During initial coding, avoid -unnecessary side-effects and cons where it is convenient. If -profiling reveals a consing problem, \var{then} go back and fix the -hot spots. - -\xlref{non-descriptor} for a discussion of how to avoid number consing -in \python. - - -%%\node Complex Argument Syntax, Mapping and Iteration, Avoid Unnecessary Consing, General Efficiency Hints -\subsection{Complex Argument Syntax} -\cpsubindex{argument syntax}{efficiency} -\cpsubindex{efficiency}{of argument syntax} -\cindex{keyword argument efficiency} -\cindex{rest argument efficiency} - -Common Lisp has very powerful argument passing mechanisms. Unfortunately, two -of the most powerful mechanisms, rest arguments and keyword arguments, have a -significant performance penalty: -\begin{itemize} - -\item -With keyword arguments, the called function has to parse the supplied keywords -by iterating over them and checking them against the desired keywords. - -\item -With rest arguments, the function must cons a list to hold the arguments. If a -function is called many times or with many arguments, large amounts of memory -will be allocated. -\end{itemize} - -Although rest argument consing is worse than keyword parsing, neither problem -is serious unless thousands of calls are made to such a function. The use of -keyword arguments is strongly encouraged in functions with many arguments or -with interfaces that are likely to be extended, and rest arguments are often -natural in user interface functions. 
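-
-One way to keep a keyword interface without paying for keyword
-parsing in an inner loop is sketched below.  The function names
-\code{scale} and \code{\%scale} are hypothetical, and the split into
-two functions is only one possible arrangement:
-\begin{lisp}
-;; Hypothetical worker with required arguments only; callable from
-;; inner loops without any keyword parsing.
-(defun %scale (x factor offset)
-  (+ (* x factor) offset))
-
-;; Hypothetical interface function: keywords are parsed once per call
-;; here, then the work is handed to the required-argument worker.
-(defun scale (x &key (factor 1) (offset 0))
-  (%scale x factor offset))
-
-(scale 3 :factor 2 :offset 1)   ; => 7
-(scale 3)                       ; => 3
-\end{lisp}
-Code that calls the function only occasionally can use \code{scale};
-a loop that makes thousands of calls can use \code{\%scale} directly.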
- -Optional arguments have some efficiency advantage over keyword -arguments, but their syntactic clumsiness and lack of extensibility -has caused many \clisp{} programmers to abandon use of optionals -except in functions that have obviously simple and immutable -interfaces (such as \code{subseq}), or in functions that are only -called in a few places. When defining an interface function to be -used by other programmers or users, use of only required and keyword -arguments is recommended. - -Parsing of \code{defmacro} keyword and rest arguments is done at -compile time, so a macro can be used to provide a convenient syntax -with an efficient implementation. If the macro-expanded form contains -no keyword or rest arguments, then it is perfectly acceptable in inner -loops. - -Keyword argument parsing overhead can also be avoided by use of inline -expansion (\pxlref{inline-expansion}) and block compilation (section -\ref{block-compilation}.) - -Note: the compiler open-codes most heavily used system functions which have -keyword or rest arguments, so that no run-time overhead is involved. - -%%\node Mapping and Iteration, Trace Files and Disassembly, Complex Argument Syntax, General Efficiency Hints -\subsection{Mapping and Iteration} -\cpsubindex{mapping}{efficiency of} - -One of the traditional \llisp{} programming styles is a highly applicative one, -involving the use of mapping functions and many lists to store intermediate -results. To compute the sum of the square-roots of a list of numbers, one -might say: -\begin{lisp} -(apply #'+ (mapcar #'sqrt list-of-numbers)) -\end{lisp} - -This programming style is clear and elegant, but unfortunately results -in slow code. There are two reasons why: -\begin{itemize} - -\item The creation of lists of intermediate results causes much - consing (see \ref{consing}). - -\item Each level of application requires another scan down the list. 
- Thus, disregarding other effects, the above code would probably take - twice as long as a straightforward iterative version. -\end{itemize} - - -An example of an iterative version of the same code: -\begin{lisp} -(do ((num list-of-numbers (cdr num)) - (sum 0 (+ (sqrt (car num)) sum))) - ((null num) sum)) -\end{lisp} - -See sections \ref{variable-type-inference} and \ref{let-optimization} -for a discussion of the interactions of iteration constructs with type -inference and variable optimization. Also, section -\ref{local-tail-recursion} discusses an applicative style of -iteration. - -%%\node Trace Files and Disassembly, , Mapping and Iteration, General Efficiency Hints -\subsection{Trace Files and Disassembly} -\label{trace-files} -\cindex{trace files} -\cindex{assembly listing} -\cpsubindex{listing files}{trace} -\cindex{Virtual Machine (VM, or IR2) representation} -\cindex{implicit continuation representation (IR1)} -\cpsubindex{continuations}{implicit representation} - -In order to write efficient code, you need to know the relative costs -of different operations. The main reason why writing efficient -\llisp{} code is difficult is that there are so many operations, and -the costs of these operations vary in obscure context-dependent ways. -Although efficiency notes point out some problem areas, the only way -to ensure generation of the best code is to look at the assembly code -output. - -The \code{disassemble} function is a convenient way to get the assembly code for a -function, but it can be very difficult to interpret, since the correspondence -with the original source code is weak. A better (but more awkward) option is -to use the \kwd{trace-file} argument to \code{compile-file} to generate a trace -file. - -A trace file is a dump of the compiler's internal representations, -including annotated assembly code. 
Each component in the program gets -four pages in the trace file (separated by ``\code{$\hat{ }L$}''): -\begin{itemize} - -\item The implicit-continuation (or IR1) representation of the - optimized source. This is a dump of the flow graph representation - used for ``source level'' optimizations. As you will quickly - notice, it is not really very close to the source. This - representation is not very useful to even sophisticated users. - -\item The Virtual Machine (VM, or IR2) representation of the program. - This dump represents the generated code as sequences of ``Virtual - OPerations'' (VOPs.) This representation is intermediate between - the source and the assembly code\dash{}each VOP corresponds fairly - directly to some primitive function or construct, but a given VOP - also has a fairly predictable instruction sequence. An operation - (such as \code{+}) may have multiple implementations with different - cost and applicability. The choice of a particular VOP such as - \code{+/fixnum} or \code{+/single-float} represents this choice of - implementation. Once you are familiar with it, the VM - representation is probably the most useful for determining what - implementation has been used. - -\item An assembly listing, annotated with the VOP responsible for - generating the instructions. This listing is useful for figuring - out what a VOP does and how it is implemented in a particular - context, but its large size makes it more difficult to read. - -\item A disassembly of the generated code, which has all - pseudo-operations expanded out, but is not annotated with VOPs. -\end{itemize} - - -Note that trace file generation takes much space and time, since the trace file -is tens of times larger than the source file. To avoid huge confusing trace -files and much wasted time, it is best to separate the critical program portion -into its own file and then generate the trace file from this small file. 
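As a rough sketch of the workflow suggested above, the critical
function can be kept in a small file of its own and inspected from
there. The function name and file name below are hypothetical. The
\code{disassemble} call is standard \clisp{}; the \kwd{trace-file}
argument is the extension described in this section, and the value
\code{t} shown in the comment is an assumption about how a trace file
is requested:
\begin{lisp}
;; Hypothetical hot-spot function, kept in its own small file.
(defun dot3 (a b c x y z)
  (declare (single-float a b c x y z))
  (+ (* a x) (* b y) (* c z)))

;; Standard: print the assembly code for the function.
(disassemble 'dot3)

;; Extension described above (assumed usage): also write a trace file.
;; (compile-file "dot3.lisp" :trace-file t)
\end{lisp}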
- -%% -%%\node Efficiency Notes, Profiling, General Efficiency Hints, Advanced Compiler Use and Efficiency Hints -\section{Efficiency Notes} -\label{efficiency-notes} -\cindex{efficiency notes} -\cpsubindex{notes}{efficiency} -\cindex{tuning} - -Efficiency notes are messages that warn the user that the compiler has -chosen a relatively inefficient implementation for some operation. -Usually an efficiency note reflects the compiler's desire for more -type information. If the type of the values concerned is known to the -programmer, then additional declarations can be used to get a more -efficient implementation. - -Efficiency notes are controlled by the -\code{extensions:inhibit-warnings} (\pxlref{optimize-declaration}) -optimization quality. When \code{speed} is greater than -\code{extensions:inhibit-warnings}, efficiency notes are enabled. -Note that this implicitly enables efficiency notes whenever -\code{speed} is increased from its default of \code{1}. - -Consider this program with an obscure missing declaration: -\begin{lisp} -(defun eff-note (x y z) - (declare (fixnum x y z)) - (the fixnum (+ x y z))) -\end{lisp} -If compiled with \code{\w{(speed 3) (safety 0)}}, this note is given: -\begin{example} -In: DEFUN EFF-NOTE - (+ X Y Z) -==> - (+ (+ X Y) Z) -Note: Forced to do inline (signed-byte 32) arithmetic (cost 3). - Unable to do inline fixnum arithmetic (cost 2) because: - The first argument is a (INTEGER -1073741824 1073741822), - not a FIXNUM. -\end{example} -This efficiency note tells us that the result of the intermediate -computation \code{\w{(+ x y)}} is not known to be a \code{fixnum}, so -the addition of the intermediate sum to \code{z} must be done less -efficiently. 
This can be fixed by changing the definition of -\code{eff-note}: -\begin{lisp} -(defun eff-note (x y z) - (declare (fixnum x y z)) - (the fixnum (+ (the fixnum (+ x y)) z))) -\end{lisp} - -\begin{comment} -* Type Uncertainty:: -* Efficiency Notes and Type Checking:: -* Representation Efficiency Notes:: -* Verbosity Control:: -\end{comment} - -%%\node Type Uncertainty, Efficiency Notes and Type Checking, Efficiency Notes, Efficiency Notes -\subsection{Type Uncertainty} -\cpsubindex{types}{uncertainty} -\cindex{uncertainty of types} - -The main cause of inefficiency is the compiler's lack of adequate -information about the types of function argument and result values. -Many important operations (such as arithmetic) have an inefficient -general (generic) case, but have efficient implementations that can -usually be used if there is sufficient argument type information. - -Type efficiency notes are given when a value's type is uncertain. -There is an important distinction between values that are \i{not - known} to be of a good type (uncertain) and values that are \i{known - not} to be of a good type. Efficiency notes are given mainly for -the first case (uncertain types.) If it is clear to the compiler -that there is not an efficient implementation for a particular -function call, then an efficiency note will only be given if the -\code{extensions:inhibit-warnings} optimization quality is \code{0} -(\pxlref{optimize-declaration}.) - -In other words, the default efficiency notes only suggest that you add -declarations, not that you change the semantics of your program so -that an efficient implementation will apply. For example, compilation -of this form will not give an efficiency note: -\begin{lisp} -(elt (the list l) i) -\end{lisp} -even though a vector access is more efficient than indexing a list.
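To make the distinction concrete, here is a small sketch with
hypothetical function names. In the first function the type of
\code{v} is merely uncertain, which is the case where a declaration
can help; in the second, the declaration removes the uncertainty.
Whether a note is actually printed for the first one depends on the
optimization settings described above:
\begin{lisp}
(defun nth-elt (v i)
  ;; V is not known to be a vector, so a generic ELT is needed.
  (elt v i))

(defun nth-elt-declared (v i)
  (declare (simple-vector v) (fixnum i))
  ;; V is now known to be a simple vector, so SVREF applies.
  (svref v i))
\end{lisp}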
- -%%\node Efficiency Notes and Type Checking, Representation Efficiency Notes, Type Uncertainty, Efficiency Notes -\subsection{Efficiency Notes and Type Checking} -\cpsubindex{type checking}{efficiency of} -\cpsubindex{efficiency}{of type checking} -\cpsubindex{optimization}{type check} - -It is important that the \code{eff-note} example above used -\w{\code{(safety 0)}}. When type checking is enabled, you may get apparently -spurious efficiency notes. With \w{\code{(safety 1)}}, the note has this extra -line on the end: -\begin{example} -The result is a (INTEGER -1610612736 1610612733), not a FIXNUM. -\end{example} -This seems strange, since there is a \code{the} declaration on the result of that -second addition. - -In fact, the inefficiency is real, and is a consequence of \python{}'s -treating declarations as assertions to be verified. The compiler -can't assume that the result type declaration is true\dash{}it must -generate the result and then test whether it is of the appropriate -type. - -In practice, this means that when you are tuning a program to run -without type checks, you should work from the efficiency notes -generated by unsafe compilation. If you want code to run efficiently -with type checking, then you should pay attention to all the -efficiency notes that you get during safe compilation. Since user -supplied output type assertions (e.g., from \code{the}) are -disregarded when selecting operation implementations for safe code, -you must somehow give the compiler information that allows it to prove -that the result truly must be of a good type. In our example, it -could be done by constraining the argument types more: -\begin{lisp} -(defun eff-note (x y z) - (declare (type (unsigned-byte 18) x y z)) - (+ x y z)) -\end{lisp} -Of course, this declaration is acceptable only if the arguments to \code{eff-note} -always \var{are} \w{\code{(unsigned-byte 18)}} integers. 
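The reason this declaration is sufficient can be checked with a little
arithmetic. The sketch below assumes 30-bit fixnums, which is what the
bounds printed in the notes above imply (the sum of two fixnums is
reported as \code{(INTEGER -1073741824 1073741822)}); other
implementations have different fixnum ranges:
\begin{lisp}
(let* ((fixnum-max (1- (expt 2 29)))  ; assumed limit: 536870911
       (arg-max    (1- (expt 2 18)))  ; largest (unsigned-byte 18): 262143
       (sum-max    (* 3 arg-max)))    ; largest possible x+y+z: 786429
  ;; True, so the compiler can prove that the sum is always a fixnum.
  (<= sum-max fixnum-max))
\end{lisp}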
- -%%\node Representation Efficiency Notes, Verbosity Control, Efficiency Notes and Type Checking, Efficiency Notes -\subsection{Representation Efficiency Notes} -\label{representation-eff-note} -\cindex{representation efficiency notes} -\cpsubindex{efficiency notes}{for representation} -\cindex{object representation efficiency notes} -\cindex{stack numbers} -\cindex{non-descriptor representations} -\cpsubindex{descriptor representations}{forcing of} - -When operating on values that have non-descriptor representations -(\pxlref{non-descriptor}), there can be a substantial time and consing -penalty for converting to and from descriptor representations. For -this reason, the compiler gives an efficiency note whenever it is -forced to do a representation coercion more expensive than -\varref{efficiency-note-cost-threshold}. - -Inefficient representation coercions may be due to type uncertainty, -as in this example: -\begin{lisp} -(defun set-flo (x) - (declare (single-float x)) - (prog ((var 0.0)) - (setq var (gorp)) - (setq var x) - (return var))) -\end{lisp} -which produces this efficiency note: -\begin{example} -In: DEFUN SET-FLO - (SETQ VAR X) -Note: Doing float to pointer coercion (cost 13) from X to VAR. -\end{example} -The variable \code{var} is not known to always hold values of type -\code{single-float}, so a descriptor representation must be used for its value. -In this sort of situation, adding a declaration will eliminate the inefficiency.
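A sketch of the kind of declaration meant here. The function
\code{gorp} is not defined in this manual, so the stand-in below is
hypothetical and is simply assumed to return a \code{single-float}.
The declaration is only valid if that assumption holds:
\begin{lisp}
;; Hypothetical stand-in for GORP; assumed to return a single-float.
;; The f0 suffix makes the literals single-floats regardless of
;; reader settings.
(defun gorp () 1.0f0)

(defun set-flo (x)
  (declare (single-float x))
  (prog ((var 0.0f0))
    (declare (single-float var))   ; VAR can now be kept unboxed
    (setq var (gorp))
    (setq var x)
    (return var)))
\end{lisp}
This addresses the coercion at the assignment to \code{var}. The value
returned from \code{set-flo} may still need a descriptor
representation at the function boundary.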
- -Often inefficient representation conversions are not due to type -uncertainty\dash{}instead, they result from evaluating a -non-descriptor expression in a context that requires a descriptor -result: -\begin{itemize} - -\item Assignment to or initialization of any data structure other than - a specialized array (\pxlref{specialized-array-types}), or - -\item Assignment to a \code{special} variable, or - -\item Passing as an argument or returning as a value in any function - call that is not a local call (\pxlref{number-local-call}.) -\end{itemize} - -If such inefficient coercions appear in a ``hot spot'' in the program, data -structure redesign or program reorganization may be necessary to improve -efficiency. See sections \ref{block-compilation}, \ref{numeric-types} and -\ref{profiling}. - -Because representation selection is done rather late in compilation, -the source context in these efficiency notes is somewhat vague, making -interpretation more difficult. This is a fairly straightforward -example: -\begin{lisp} -(defun cf+ (x y) - (declare (single-float x y)) - (cons (+ x y) t)) -\end{lisp} -which gives this efficiency note: -\begin{example} -In: DEFUN CF+ - (CONS (+ X Y) T) -Note: Doing float to pointer coercion (cost 13), for: - The first argument of CONS. -\end{example} -The source context form is almost always the form that receives the value being -coerced (as it is in the preceding example), but can also be the source form -which generates the coerced value. Compiling this example: -\begin{lisp} -(defun if-cf+ (x y) - (declare (single-float x y)) - (cons (if (grue) (+ x y) (snoc)) t)) -\end{lisp} -produces this note: -\begin{example} -In: DEFUN IF-CF+ - (+ X Y) -Note: Doing float to pointer coercion (cost 13). -\end{example} - -In either case, the note's text explanation attempts to include -additional information about what locations are the source and -destination of the coercion.
Here are some example notes: -\begin{example} - (IF (GRUE) X (SNOC)) -Note: Doing float to pointer coercion (cost 13) from X. - - (SETQ VAR X) -Note: Doing float to pointer coercion (cost 13) from X to VAR. -\end{example} -Note that the return value of a function is also a place to which coercions may -have to be done: -\begin{example} - (DEFUN F+ (X Y) (DECLARE (SINGLE-FLOAT X Y)) (+ X Y)) -Note: Doing float to pointer coercion (cost 13) to "". -\end{example} -Sometimes the compiler is unable to determine a name for the source or -destination, in which case the source context is the only clue. - - -%%\node Verbosity Control, , Representation Efficiency Notes, Efficiency Notes -\subsection{Verbosity Control} -\cpsubindex{verbosity}{of efficiency notes} -\cpsubindex{efficiency notes}{verbosity} - -These variables control the verbosity of efficiency notes: - -\begin{defvar}{}{efficiency-note-cost-threshold} - - Before printing some efficiency notes, the compiler compares the - value of this variable to the difference in cost between the chosen - implementation and the best potential implementation. If the - difference is not greater than this limit, then no note is printed. - The units are implementation dependent; the initial value suppresses - notes about ``trivial'' inefficiencies. A value of \code{1} will - note any inefficiency. -\end{defvar} - -\begin{defvar}{}{efficiency-note-limit} - - When printing some efficiency notes, the compiler reports possible - efficient implementations. The initial value of \code{2} prevents - excessively long efficiency notes in the common case where there is - no type information, so all implementations are possible. 
-\end{defvar} - -%% -%%\node Profiling, , Efficiency Notes, Advanced Compiler Use and Efficiency Hints -\section{Profiling} - -\cindex{profiling} -\cindex{timing} -\cindex{consing} -\cindex{tuning} -\label{profiling} - -The first step in improving a program's performance is to profile the -activity of the program to find where it spends its time. The best -way to do this is to use the profiling utility found in the -\code{profile} package. This package provides a macro \code{profile} -that encapsulates functions with statistics gathering code. - -\begin{comment} -* Profile Interface:: -* Profiling Techniques:: -* Nested or Recursive Calls:: -* Clock resolution:: -* Profiling overhead:: -* Additional Timing Utilities:: -* A Note on Timing:: -* Benchmarking Techniques:: -\end{comment} - -%%\node Profile Interface, Profiling Techniques, Profiling, Profiling -\subsection{Profile Interface} - -\begin{defvar}{profile:}{timed-functions} - - This variable holds a list of all functions that are currently being - profiled. -\end{defvar} - -\begin{defmac}{profile:}{profile}{% - \args{\mstar{\var{name} \mor \kwd{callers} \code{t}}}} - - This macro wraps profiling code around the named functions. As in - \code{trace}, the \var{name}s are not evaluated. If a function is - already profiled, then the function is unprofiled and reprofiled - (useful to notice function redefinition.) A warning is printed for - each name that is not a defined function. - - If \kwd{callers \var{t}} is specified, then each function that calls - this function is recorded along with the number of calls made. -\end{defmac} - -\begin{defmac}{profile:}{unprofile}{% - \args{\mstar{\var{name}}}} - - This macro removes profiling code from the named functions. If no - \var{name}s are supplied, all currently profiled functions are - unprofiled. 
-\end{defmac} - -\begin{changebar} - \begin{defmac}{profile:}{profile-all}{% - \args{\keys{\kwd{package} \kwd{callers-p}}}} - - This macro in effect calls \code{profile:profile} for each - function in the specified package which defaults to - \code{*package*}. \kwd{callers-p} has the same meaning as in - \code{profile:profile}. - \end{defmac} -\end{changebar} - -\begin{defmac}{profile:}{report-time}{\args{\mstar{\var{name}}}} - - This macro prints a report for each \var{name}d function of the - following information: - \begin{itemize} - \item The total CPU time used in that function for all calls, - - \item the total number of bytes consed in that function for all - calls, - - \item the total number of calls, - - \item the average amount of CPU time per call. - \end{itemize} - Summary totals of the CPU time, consing and calls columns are - printed. An estimate of the profiling overhead is also printed (see - below). If no \var{name}s are supplied, then the times for all - currently profiled functions are printed. -\end{defmac} - -\begin{defmac}{}{reset-time}{\args{\mstar{\var{name}}}} - - This macro resets the profiling counters associated with the - \var{name}d functions. If no \var{name}s are supplied, then all - currently profiled functions are reset. -\end{defmac} - - -%%\node Profiling Techniques, Nested or Recursive Calls, Profile Interface, Profiling -\subsection{Profiling Techniques} - -Start by profiling big pieces of a program, then carefully choose which -functions close to, but not in, the inner loop are to be profiled next. -Avoid profiling functions that are called by other profiled functions, since -this opens the possibility of profiling overhead being included in the reported -times. - -If the per-call time reported is less than 1/10 second, then consider the clock -resolution and profiling overhead before you believe the time. It may be that -you will need to run your program many times in order to average out to a -higher resolution. 
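One portable way to average over many runs, as suggested above, is to
time a loop and divide by the number of iterations. This sketch uses
only standard \clisp{} functions; the helper name
\code{average-run-time} is hypothetical, and the result includes the
small cost of the loop and the \code{funcall} itself:
\begin{lisp}
(defun average-run-time (thunk &optional (runs 1000))
  "Return the average run time of one call to THUNK, in seconds."
  (let ((start (get-internal-run-time)))
    (dotimes (i runs)
      (funcall thunk))
    (float (/ (- (get-internal-run-time) start)
              (* runs internal-time-units-per-second)))))

;; Example: average cost of summing a small list.
(average-run-time (lambda () (reduce #'+ '(1 2 3 4 5))))
\end{lisp}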
- - -%%\node Nested or Recursive Calls, Clock resolution, Profiling Techniques, Profiling -\subsection{Nested or Recursive Calls} - -The profiler attempts to compensate for nested or recursive calls. Time and -consing overhead will be charged to the dynamically innermost (most recent) -call to a profiled function. So profiling a subfunction of a profiled function -will cause the reported time for the outer function to decrease. However, if an -inner function has a large number of calls, some of the profiling overhead may -``leak'' into the reported time for the outer function. In general, be wary of -profiling short functions that are called many times. - -%%\node Clock resolution, Profiling overhead, Nested or Recursive Calls, Profiling -\subsection{Clock resolution} - -Unless you are very lucky, the length of your machine's clock ``tick'' is -probably much longer than the time it takes a simple function to run. For -example, on the IBM RT, the clock resolution is 1/50 second. This means that -if a function is only called a few times, then only the first couple of decimal -places are really meaningful. - -Note however, that if a function is called many times, then the statistical -averaging across all calls should result in increased resolution. For example, -on the IBM RT, if a function is called a thousand times, then a resolution of -tens of microseconds can be expected. - -%%\node Profiling overhead, Additional Timing Utilities, Clock resolution, Profiling -\subsection{Profiling overhead} - -The added profiling code takes time to run every time that the profiled -function is called, which can disrupt the attempt to collect timing -information. In order to avoid serious inflation of the times for functions -that take little time to run, an estimate of the overhead due to profiling is -subtracted from the times reported for each function.
- -Although this correction works fairly well, it is not totally accurate, -resulting in times that become increasingly meaningless for functions with -short runtimes. This is only a concern when the estimated profiling overhead -is many times larger than reported total CPU time. - -The estimated profiling overhead is not represented in the reported total CPU -time. The sum of total CPU time and the estimated profiling overhead should be -close to the total CPU time for the entire profiling run (as determined by the -\code{time} macro.) Time unaccounted for is probably being used by functions that -you forgot to profile. - -%%\node Additional Timing Utilities, A Note on Timing, Profiling overhead, Profiling -\subsection{Additional Timing Utilities} - -\begin{defmac}{}{time}{ \args{\var{form}}} - - This macro evaluates \var{form}, prints some timing and memory - allocation information to \code{*trace-output*}, and returns any - values that \var{form} returns. The timing information includes - real time, user run time, and system run time. This macro executes - a form and reports the time and consing overhead. If the - \code{time} form is not compiled (e.g. it was typed at top-level), - then \code{compile} will be called on the form to give more accurate - timing information. If you really want to time interpreted speed, - you can say: -\begin{lisp} -(time (eval '\var{form})) -\end{lisp} -Things that execute fairly quickly should be timed more than once, -since there may be more paging overhead in the first timing. To -increase the accuracy of very short times, you can time multiple -evaluations: -\begin{lisp} -(time (dotimes (i 100) \var{form})) -\end{lisp} -\end{defmac} - -\begin{defun}{extensions:}{get-bytes-consed}{} - - This function returns the number of bytes allocated since the first - time you called it. The first time it is called it returns zero. - The above profiling routines use this to report consing information. 
-\end{defun} - -\begin{defvar}{extensions:}{gc-run-time} - - This variable accumulates the run-time consumed by garbage - collection, in the units returned by - \findexed{get-internal-run-time}. -\end{defvar} - -\begin{defconst}{}{internal-time-units-per-second} -The value of internal-time-units-per-second is 100. -\end{defconst} - -%%\node A Note on Timing, Benchmarking Techniques, Additional Timing Utilities, Profiling -\subsection{A Note on Timing} -\cpsubindex{CPU time}{interpretation of} -\cpsubindex{run time}{interpretation of} -\cindex{interpretation of run time} - -There are two general kinds of timing information provided by the -\code{time} macro and other profiling utilities: real time and run -time. Real time is elapsed, wall clock time. It will be affected in -a fairly obvious way by any other activity on the machine. The more -other processes contending for CPU and memory, the more real time will -increase. This means that real time measurements are difficult to -replicate, though this is less true on a dedicated workstation. The -advantage of real time is that it is real. It tells you really how -long the program took to run under the benchmarking conditions. The -problem is that you don't know exactly what those conditions were. - -Run time is the amount of time that the processor supposedly spent -running the program, as opposed to waiting for I/O or running other -processes. ``User run time'' and ``system run time'' are numbers -reported by the Unix kernel. They are supposed to be a measure of how -much time the processor spent running your ``user'' program (which -will include GC overhead, etc.), and the amount of time that the -kernel spent running ``on your behalf.'' - -Ideally, user time should be totally unaffected by benchmarking -conditions; in reality user time does depend on other system activity, -though in rather non-obvious ways. - -System time will clearly depend on benchmarking conditions. 
In Lisp -benchmarking, paging activity increases system run time (but not by as much -as it increases real time, since the kernel spends some time waiting for -the disk, and this is not run time, kernel or otherwise.) - -In my experience, the biggest trap in interpreting kernel/user run time is -to look only at user time. In reality, it seems that the \var{sum} of kernel -and user time is more reproducible. The problem is that as system activity -increases, there is a spurious \var{decrease} in user run time. In effect, as -paging, etc., increases, user time leaks into system time. - -So, in practice, the only way to get truly reproducible results is to run -with the same competing activity on the system. Try to run on a machine -with nobody else logged in, and check with ``ps aux'' to see if there are any -system processes munching large amounts of CPU or memory. If the ratio -between real time and the sum of user and system time varies much between -runs, then you have a problem. - -%%\node Benchmarking Techniques, , A Note on Timing, Profiling -\subsection{Benchmarking Techniques} -\cindex{benchmarking techniques} - -Given these imperfect timing tools, how should you do benchmarking? The -answer depends on whether you are trying to measure improvements in the -performance of a single program on the same hardware, or if you are trying to -compare the performance of different programs and/or different hardware. - -For the first use (measuring the effect of program modifications with -constant hardware), you should look at \var{both} system+user and real time to -understand what effect the change had on CPU use, and on I/O (including -paging.) If you are working on a CPU intensive program, the change in -system+user time will give you a moderately reproducible measure of -performance across a fairly wide range of system conditions.
For a CPU -intensive program, you can think of system+user as ``how long it would have -taken to run if I had my own machine.'' So in the case of comparing CPU -intensive programs, system+user time is relatively real, and reasonable to -use. - -For programs that spend a substantial amount of their time paging, you -really can't predict elapsed time under a given operating condition without -benchmarking in that condition. User or system+user time may be fairly -reproducible, but it is also relatively meaningless, since in a paging or -I/O intensive program, the program is spending its time waiting, not -running, and system time and user time are both measures of run time. -A change that reduces run time might increase real time by increasing -paging. - -Another common use for benchmarking is comparing the performance of -the same program on different hardware. You want to know which -machine to run your program on. For comparing different machines -(operating systems, etc.), the only way to compare that makes sense is -to set up the machines in \var{exactly} the way that they will -\var{normally} be run, and then measure \var{real} time. If the -program will normally be run along with X, then run X. If the program -will normally be run on a dedicated workstation, then be sure nobody -else is on the benchmarking machine. If the program will normally be -run on a machine with three other Lisp jobs, then run three other Lisp -jobs. If the program will normally be run on a machine with 8meg of -memory, then run with 8meg. Here, ``normal'' means ``normal for that -machine''. If you have the choice of an unloaded RT or a heavily loaded -PMAX, do your benchmarking on an unloaded RT and a heavily loaded -PMAX. - -If you have a program you believe to be CPU intensive, then you might be -tempted to compare ``run'' times across systems, hoping to get a meaningful -result even if the benchmarking isn't done under the expected running -condition.
Don't do this, for two reasons: -\begin{itemize} - -\item The operating systems might not compute run time in the same - way. - -\item Under the real running condition, the program might not be CPU - intensive after all. -\end{itemize} - - -In the end, only real time means anything\dash{}it is the amount of time you -have to wait for the result. The only valid uses for run time are: -\begin{itemize} - -\item To develop insight into the program. For example, if run time - is much less than elapsed time, then you are probably spending lots - of time paging. - -\item To evaluate the relative performance of CPU intensive programs - in the same environment. -\end{itemize} - - -\hide{File:/afs/cs.cmu.edu/project/clisp/hackers/ram/docs/cmu-user/Unix.ms} - - - -%%\node UNIX Interface, Event Dispatching with SERVE-EVENT, Advanced Compiler Use and Efficiency Hints, Top -\chapter{UNIX Interface} -\label{unix-interface} -\begin{center} -\b{By Robert MacLachlan, Skef Wholey,} -\end{center} -\begin{center} -\b{Bill Chiles, and William Lott} -\end{center} - -CMU Common Lisp attempts to make the full power of the underlying -environment available to the Lisp programmer. This is done using a -combination of hand-coded interfaces and foreign function calls to C -libraries. Although the techniques differ, the style of interface is -similar. This chapter provides an overview of the facilities -available and general rules for using them, as well as describing -specific features in detail. It is assumed that the reader has a -working familiarity with Mach, Unix and X, as well as access to the -standard system documentation.
- -\begin{comment} -* Reading the Command Line:: -* Lisp Equivalents for C Routines:: -* Type Translations:: -* System Area Pointers:: -* Unix System Calls:: -* File Descriptor Streams:: -* Making Sense of Mach Return Codes:: -* Unix Interrupts:: -\end{comment} - - -%%\node Reading the Command Line, Useful Variables, UNIX Interface, UNIX Interface -\section{Reading the Command Line} - -The shell parses the command line with which Lisp is invoked, and -passes a data structure containing the parsed information to Lisp. -This information is then extracted from that data structure and put -into a set of Lisp data structures. - -\begin{defvar}{extensions:}{command-line-strings} - \defvarx[extensions:]{command-line-utility-name} - \defvarx[extensions:]{command-line-words} - \defvarx[extensions:]{command-line-switches} - - The value of \code{*command-line-words*} is a list of strings that - make up the command line, one word per string. The first word on - the command line, i.e. the name of the program invoked (usually - \code{lisp}) is stored in \code{*command-line-utility-name*}. The - value of \code{*command-line-switches*} is a list of - \code{command-line-switch} structures, with a structure for each - word on the command line starting with a hyphen. All the command - line words between the program name and the first switch are stored - in \code{*command-line-words*}. -\end{defvar} - -The following functions may be used to examine \code{command-line-switch} -structures. -\begin{defun}{extensions:}{cmd-switch-name}{\args{\var{switch}}} - - Returns the name of the switch, less the preceding hyphen and - trailing equal sign (if any). -\end{defun} -\begin{defun}{extensions:}{cmd-switch-value}{\args{\var{switch}}} - - Returns the value designated using an embedded equal sign, if any. - If the switch has no equal sign, then this is null. 
-\end{defun} -\begin{defun}{extensions:}{cmd-switch-words}{\args{\var{switch}}} - - Returns a list of the words between this switch and the next switch - or the end of the command line. -\end{defun} -\begin{defun}{extensions:}{cmd-switch-arg}{\args{\var{switch}}} - - Returns the first non-null value from \code{cmd-switch-value}, the - first element in \code{cmd-switch-words}, or the first word in - \var{command-line-words}. -\end{defun} - -\begin{defun}{extensions:}{get-command-line-switch}{\args{\var{sname}}} - - This function takes the name of a switch as a string and returns the - value of the switch given on the command line. If no value was - specified, then any following words are returned. If there are no - following words, then \true{} is returned. If the switch was not - specified, then \false{} is returned. -\end{defun} - -\begin{defmac}{extensions:}{defswitch}{% - \args{\var{name} \ampoptional{} \var{function}}} - - This macro causes \var{function} to be called when the switch - \var{name} appears in the command line. Name is a simple-string - that does not begin with a hyphen (unless the switch name really - does begin with one.) - - If \var{function} is not supplied, then the switch is parsed into - \var{command-line-switches}, but otherwise ignored. This suppresses - the undefined switch warning which would otherwise take place. The - warning can also be globally suppressed by - \var{complain-about-illegal-switches}. -\end{defmac} - -%%\node Useful Variables, Lisp Equivalents for C Routines, Reading the Command Line, UNIX Interface - -\section{Useful Variables} - -\begin{defvar}{system:}{stdin} - \defvarx[system:]{stdout} \defvarx[system:]{stderr} - - Streams connected to the standard input, output and error file - descriptors. -\end{defvar} - -\begin{defvar}{system:}{tty} - - A stream connected to \file{/dev/tty}. 
-\end{defvar} - -%%\node Lisp Equivalents for C Routines, Type Translations, Useful Variables, UNIX Interface -\section{Lisp Equivalents for C Routines} - -The UNIX documentation describes the system interface in terms of C -procedure headers. The corresponding Lisp function will have a somewhat -different interface, since Lisp argument passing conventions and -datatypes are different. - -The main difference in the argument passing conventions is that Lisp does not -support passing values by reference. In Lisp, all arguments and results are -passed by value. Interface functions take some fixed number of arguments and -return some fixed number of values. A given ``parameter'' in the C -specification will appear as an argument, return value, or both, depending on -whether it is an In parameter, Out parameter, or In/Out parameter. The basic -transformation one makes to come up with the Lisp equivalent of a C routine is -to remove the Out parameters from the call, and treat them as extra return -values. In/Out parameters appear both as arguments and return values. Since -Out and In/Out parameters are only conventions in C, you must determine the -usage from the documentation. - - -Thus, the C routine declared as -\begin{example} -kern_return_t lookup(servport, portsname, portsid) - port servport; - char *portsname; - int *portsid; /* out */ - { - ... - *portsid = - return(KERN_SUCCESS); - } -\end{example} -has as its Lisp equivalent something like -\begin{lisp} -(defun lookup (ServPort PortsName) - ... - (values - success - )) -\end{lisp} -If there are multiple out or in-out arguments, then there are multiple -additional return values. 
- -Fortunately, CMU Common Lisp programmers rarely have to worry about the -nuances of this translation process, since the names of the arguments and -return values are documented in a way so that the \code{describe} function -(and the \Hemlock{} \code{Describe Function Call} command, invoked with -\b{C-M-Shift-A}) will list this information. Since the names of arguments -and return values are usually descriptive, the information that -\code{describe} prints is usually all one needs to write a -call. Most programmers use this on-line documentation nearly -all of the time, and thereby avoid the need to handle bulky -manuals and perform the translation from barbarous tongues. - -%%\node Type Translations, System Area Pointers, Lisp Equivalents for C Routines, UNIX Interface -\section{Type Translations} -\cindex{aliens} -\cpsubindex{types}{alien} -\cpsubindex{types}{foreign language} - -Lisp data types have very different representations from those used by -conventional languages such as C. Since the system interfaces are -designed for conventional languages, Lisp must translate objects to and -from the Lisp representations. Many simple objects have a direct -translation: integers, characters, strings and floating point numbers -are translated to the corresponding Lisp object. A number of types, -however, are implemented differently in Lisp for reasons of clarity and -efficiency. - -Instances of enumerated types are expressed as keywords in Lisp. -Records, arrays, and pointer types are implemented with the \Alien{} -facility (see page \pageref{aliens}.) Access functions are defined -for these types which convert fields of records, elements of arrays, -or data referenced by pointers into Lisp objects (possibly another -object to be referenced with another access function). 
- -One should dispose of \Alien{} objects created by constructor -functions or returned from remote procedure calls when they are no -longer of any use, freeing the virtual memory associated with that -object. Since \alien{}s contain pointers to non-Lisp data, the -garbage collector cannot do this itself. If the memory -was obtained from \funref{make-alien} or from a foreign function call -to a routine that used \code{malloc}, then \funref{free-alien} should -be used. If the \alien{} was created -using MACH memory allocation (e.g. \code{vm\_allocate}), then the -storage should be freed using \code{vm\_deallocate}. - -%%\node System Area Pointers, Unix System Calls, Type Translations, UNIX Interface -\section{System Area Pointers} -\label{system-area-pointers} - -\cindex{pointers}\cpsubindex{malloc}{C function}\cpsubindex{free}{C function} -Note that in some cases an address is represented by a Lisp integer, and in -other cases it is represented by a real pointer. Pointers are usually used -when an object in the current address space is being referred to. The MACH -virtual memory manipulation calls must use integers, since in principle the -address could be in any process, and Lisp cannot abide random pointers. -Because these types are represented differently in Lisp, one must explicitly -coerce between these representations. - -System Area Pointers (SAPs) provide a mechanism that bypasses the -\Alien{} type system and accesses virtual memory directly. A SAP is a -raw byte pointer into the \code{lisp} process address space. SAPs are -represented with a pointer descriptor, so SAP creation can cause -consing. However, the compiler uses a non-descriptor representation -for SAPs when possible, so the consing overhead is generally minimal. -\xlref{non-descriptor}. 
- -\begin{defun}{system:}{sap-int}{\args{\var{sap}}} - \defunx[system:]{int-sap}{\args{\var{int}}} - - The function \code{sap-int} is used to generate an integer - corresponding to the system area pointer, suitable for passing to - the kernel interfaces (which want all addresses specified as - integers). The function \code{int-sap} is used to do the opposite - conversion. The integer representation of a SAP is the byte offset - of the SAP from the start of the address space. -\end{defun} - -\begin{defun}{system:}{sap+}{\args{\var{sap} \var{offset}}} - - This function adds a byte \var{offset} to \var{sap}, returning a new - SAP. -\end{defun} - -\begin{defun}{system:}{sap-ref-8}{\args{\var{sap} \var{offset}}} - \defunx[system:]{sap-ref-16}{\args{\var{sap} \var{offset}}} - \defunx[system:]{sap-ref-32}{\args{\var{sap} \var{offset}}} - - These functions return the 8, 16 or 32 bit unsigned integer at - \var{offset} from \var{sap}. The \var{offset} is always a byte - offset, regardless of the number of bits accessed. \code{setf} may - be used with these functions to deposit values into virtual - memory. -\end{defun} - -\begin{defun}{system:}{signed-sap-ref-8}{\args{\var{sap} \var{offset}}} - \defunx[system:]{signed-sap-ref-16}{\args{\var{sap} \var{offset}}} - \defunx[system:]{signed-sap-ref-32}{\args{\var{sap} \var{offset}}} - - These functions are the same as the above unsigned operations, - except that they sign-extend, returning a negative number if the - high bit is set. -\end{defun} - -%%\node Unix System Calls, File Descriptor Streams, System Area Pointers, UNIX Interface -\section{Unix System Calls} - -You probably won't have much cause to use them, but all the Unix system -calls are available. The Unix system call functions are in the -\code{Unix} package. The name of the interface for a particular system -call is the name of the system call prepended with \code{unix-}. The -system usually defines the associated constants without any prefix name. 
-To find out how to use a particular system call, try using -\code{describe} on it. If that is unhelpful, look at the source in -\file{syscall.lisp} or consult your system maintainer. - -The Unix system calls indicate an error by returning \false{} as the -first value and the Unix error number as the second value. If the call -succeeds, then the first value will always be non-\nil, often \code{t}. - -\begin{defun}{Unix:}{get-unix-error-msg}{\args{\var{error}}} - - This function returns a string describing the Unix error number - \var{error}. -\end{defun} - -%%\node File Descriptor Streams, Making Sense of Mach Return Codes, Unix System Calls, UNIX Interface -\section{File Descriptor Streams} - -Many of the UNIX system calls return file descriptors. Instead of using other -UNIX system calls to perform I/O on them, you can create a stream around them. -For this purpose, fd-streams exist. See also \funref{read-n-bytes}. - -\begin{defun}{system:}{make-fd-stream}{% - \args{\var{descriptor}} \keys{\kwd{input} \kwd{output} - \kwd{element-type}} \morekeys{\kwd{buffering} \kwd{name} - \kwd{file} \kwd{original}} \yetmorekeys{\kwd{delete-original} - \kwd{auto-close}} \yetmorekeys{\kwd{timeout} \kwd{pathname}}} - - This function creates a file descriptor stream using - \var{descriptor}. If \kwd{input} is non-\nil, input operations are - allowed. If \kwd{output} is non-\nil, output operations are - allowed. The default is input only. These keywords are defined: - \begin{Lentry} - \item[\kwd{element-type}] is the type of the unit of transaction for - the stream, which defaults to \code{string-char}. See the Common - Lisp description of \code{open} for valid values. - - \item[\kwd{buffering}] is the kind of output buffering desired for - the stream. Legal values are \kwd{none} for no buffering, - \kwd{line} for buffering up to each newline, and \kwd{full} for - full buffering. 
- - \item[\kwd{name}] is a simple-string name to use for descriptive - purposes when the system prints an fd-stream. When printing - fd-streams, the system prepends the stream's name with \code{Stream - for }. If \var{name} is unspecified, it defaults to a string - containing \var{file} or \var{descriptor}, in order of preference. - - \item[\kwd{file}, \kwd{original}] \var{file} specifies the defaulted - namestring of the associated file when creating a file stream - (must be a \code{simple-string}). \var{original} is the - \code{simple-string} name of a backup file containing the original - contents of \var{file} while writing \var{file}. - - When you abort the stream by passing \true{} to \code{close} as - the second argument, if you supplied both \var{file} and - \var{original}, \code{close} will rename the \var{original} name - to the \var{file} name. When you \code{close} the stream - normally, if you supplied \var{original}, and - \var{delete-original} is non-\nil, \code{close} deletes - \var{original}. If \var{auto-close} is true (the default), then - \var{descriptor} will be closed when the stream is garbage - collected. - - \item[\kwd{pathname}]: The original pathname passed to open and - returned by \code{pathname}; not defaulted or translated. - - \item[\kwd{timeout}] if non-null, then \var{timeout} is an integer - number of seconds after which an input wait should time out. If a - read does time out, then the \code{system:io-timeout} condition is - signalled. - \end{Lentry} -\end{defun} - -\begin{defun}{system:}{fd-stream-p}{\args{\var{object}}} - - This function returns \true{} if \var{object} is an fd-stream, and - \nil{} if not. Obsolete: use the portable \code{(typep x - 'file-stream)}. -\end{defun} - -\begin{defun}{system:}{fd-stream-fd}{\args{\var{stream}}} - - This returns the file descriptor associated with \var{stream}. 
-\end{defun} - - -%%\node Making Sense of Mach Return Codes, Unix Interrupts, File Descriptor Streams, UNIX Interface -\section{Making Sense of Mach Return Codes} - -Whenever a remote procedure call returns a Unix error code (such as -\code{kern\_return\_t}), it is usually prudent to check that code to -see if the call was successful. To relieve the programmer of the -hassle of testing this value himself, and to centralize the -information about the meaning of non-success return codes, CMU Common -Lisp provides a number of macros and functions. See also -\funref{get-unix-error-msg}. - -\begin{defun}{system:}{gr-error}{% - \args{\var{function} \var{gr} \ampoptional{} \var{context}}} - - Signals a Lisp error, printing a message indicating that the call to - the specified \var{function} failed, with the return code \var{gr}. - If supplied, the \var{context} string is printed after the - \var{function} name and before the string associated with the - \var{gr}. For example: -\begin{example} -* (gr-error 'nukegarbage 3 "lost big") - -Error in function GR-ERROR: -NUKEGARBAGE lost big, no space. -Proceed cases: -0: Return to Top-Level. -Debug (type H for help) -(Signal #) -0] -\end{example} -\end{defun} - -\begin{defmac}{system:}{gr-call}{\args{\var{function} \amprest{} \var{args}}} - \defmacx[system:]{gr-call*}{\args{\var{function} \amprest{} \var{args}}} - - These macros can be used to call a function and automatically check - the GeneralReturn code and signal an appropriate error in case of - non-successful return. \code{gr-call} returns \false{} if no error - occurs, while \code{gr-call*} returns the second value of the - function called. 
-\begin{example} -* (gr-call mach:port_allocate *task-self*) -NIL -* -\end{example} -\end{defmac} - -\begin{defmac}{system:}{gr-bind}{ - \args{\code{(}\mstar{\var{var}}\code{)} - \code{(}\var{function} \mstar{\var{arg}}\code{)} - \mstar{\var{form}}}} - - This macro can be used much like \code{multiple-value-bind} to bind - the \var{var}s to return values resulting from calling the - \var{function} with the given \var{arg}s. The first return value is - not bound to a variable, but is checked as a GeneralReturn code, as - in \code{gr-call}. -\begin{example} -* (gr-bind (port_list port_list_cnt) - (mach:port_select *task-self*) - (format t "The port count is ~S." port_list_cnt) - port_list) -The port count is 0. -# -* -\end{example} -\end{defmac} - -%%\node Unix Interrupts, , Making Sense of Mach Return Codes, UNIX Interface -\section{Unix Interrupts} - -\cindex{unix interrupts} \cindex{interrupts} -CMU Common Lisp allows access to all the Unix signals that can be generated -under Unix. It should be noted that if this capability is abused, it is -possible to completely destroy the running Lisp. The following macros and -functions allow access to the Unix interrupt system. The signal names as -specified in section 2 of the \i{Unix Programmer's Manual} are exported -from the Unix package. - -\begin{comment} -* Changing Interrupt Handlers:: -* Examples of Signal Handlers:: -\end{comment} - -%%\node Changing Interrupt Handlers, Examples of Signal Handlers, Unix Interrupts, Unix Interrupts -\subsection{Changing Interrupt Handlers} -\label{signal-handlers} - -\begin{defmac}{system:}{with-enabled-interrupts}{ - \args{\var{specs} \amprest{} \var{body}}} - - This macro should be called with a list of signal specifications, - \var{specs}. 
Each element of \var{specs} should be a list of - two\hide{ or three} elements: the first should be the Unix signal - for which a handler should be established, the second should be a - function to be called when the signal is received\hide{, and the - third should be an optional character used to generate the signal - from the keyboard. This last item is only useful for the SIGINT, - SIGQUIT, and SIGTSTP signals.} One or more signal handlers can be - established in this way. \code{with-enabled-interrupts} establishes - the correct signal handlers and then executes the forms in - \var{body}. The forms are executed in an unwind-protect so that the - state of the signal handlers will be restored to what it was before - the \code{with-enabled-interrupts} was entered. A signal handler - function specified as NIL will set the Unix signal handler to the - default which is normally either to ignore the signal or to cause a - core dump depending on the particular signal. -\end{defmac} - -\begin{defmac}{system:}{without-interrupts}{\args{\amprest{} \var{body}}} - - It is sometimes necessary to execute a piece of code that cannot be - interrupted. This macro executes the forms in \var{body} with interrupts - disabled. Note that the Unix interrupts are not actually disabled, - rather they are queued until after \var{body} has finished - executing. -\end{defmac} - -\begin{defmac}{system:}{with-interrupts}{\args{\amprest{} \var{body}}} - - When executing an interrupt handler, the system disables interrupts, - as if the handler was wrapped in a \code{without-interrupts}. - The macro \code{with-interrupts} can be used to enable interrupts - while the forms in \var{body} are evaluated. This is useful if - \var{body} is going to enter a break loop or do some long - computation that might need to be interrupted. 
-\end{defmac} - -\begin{defmac}{system:}{without-hemlock}{\args{\amprest{} \var{body}}} - - For some interrupts, such as SIGTSTP (suspend the Lisp process and - return to the Unix shell) it is necessary to leave Hemlock and then - return to it. This macro executes the forms in \var{body} after - exiting Hemlock. When \var{body} has been executed, control is - returned to Hemlock. -\end{defmac} - -\begin{defun}{system:}{enable-interrupt}{% - \args{\var{signal} \var{function}\hide{ \ampoptional{} - \var{character}}}} - - This function establishes \var{function} as the handler for - \var{signal}. - \hide{The optional \var{character} can be specified - for the SIGINT, SIGQUIT, and SIGTSTP signals and causes that - character to generate the appropriate signal from the keyboard.} - Unless you want to establish a global signal handler, you should use - the macro \code{with-enabled-interrupts} to temporarily establish a - signal handler. \hide{Without \var{character},} - \code{enable-interrupt} returns the old function associated with the - signal. \hide{When \var{character} is specified for SIGINT, - SIGQUIT, or SIGTSTP, it returns the old character code.} -\end{defun} - -\begin{defun}{system:}{ignore-interrupt}{\args{\var{signal}}} - - Ignore-interrupt sets the Unix signal mechanism to ignore - \var{signal} which means that the Lisp process will never see the - signal. Ignore-interrupt returns the old function associated with - the signal or \false{} if none is currently defined. -\end{defun} - -\begin{defun}{system:}{default-interrupt}{\args{\var{signal}}} - - Default-interrupt can be used to tell the Unix signal mechanism to - perform the default action for \var{signal}. For details on what - the default action for a signal is, see section 2 of the \i{Unix - Programmer's Manual}. In general, it is likely to ignore the - signal or to cause a core dump. 
-\end{defun} - -%%\node Examples of Signal Handlers, , Changing Interrupt Handlers, Unix Interrupts -\subsection{Examples of Signal Handlers} - -The following code is the signal handler used by the Lisp system for the -SIGINT signal. -\begin{lisp} -(defun ih-sigint (signal code scp) - (declare (ignore signal code scp)) - (without-hemlock - (with-interrupts - (break "Software Interrupt" t)))) -\end{lisp} -The \code{without-hemlock} form is used to make sure that Hemlock is exited before -a break loop is entered. The \code{with-interrupts} form is used to enable -interrupts because the user may want to generate an interrupt while in the -break loop. Finally, break is called to enter a break loop, so the user -can look at the current state of the computation. If the user proceeds -from the break loop, the computation will be restarted from where it was -interrupted. - -The following function is the Lisp signal handler for the SIGTSTP signal -which suspends a process and returns to the Unix shell. -\begin{lisp} -(defun ih-sigtstp (signal code scp) - (declare (ignore signal code scp)) - (without-hemlock - (Unix:unix-kill (Unix:unix-getpid) Unix:sigstop))) -\end{lisp} -Lisp uses this interrupt handler to catch the SIGTSTP signal because it is -necessary to get out of Hemlock in a clean way before returning to the shell. - -To set up these interrupt handlers, the following is recommended: -\begin{lisp} -(with-enabled-interrupts ((Unix:SIGINT #'ih-sigint) - (Unix:SIGTSTP #'ih-sigtstp)) - -) -\end{lisp} - - -\hide{File:/afs/cs.cmu.edu/project/clisp/hackers/ram/docs/cmu-user/server.ms} - -%%\node Event Dispatching with SERVE-EVENT, Alien Objects, UNIX Interface, Top -\chapter{Event Dispatching with SERVE-EVENT} -\begin{center} -\b{By Bill Chiles and Robert MacLachlan} -\end{center} - -It is common to have multiple activities simultaneously operating in the same -Lisp process. Furthermore, Lisp programmers tend to expect a flexible -development environment. 
It must be possible to load and modify application -programs without requiring modifications to other running programs. CMU Common -Lisp achieves this by having a central scheduling mechanism based on an -event-driven, object-oriented paradigm. - -An \var{event} is some interesting happening that should cause the Lisp process -to wake up and do something. These events include X events and activity on -Unix file descriptors. The object-oriented mechanism is only available with -the first two, and it is optional with X events as described later in this -chapter. In an X event, the window ID is the object capability and the X event -type is the operation code. The Unix file descriptor input mechanism simply -consists of an association list of a handler to call when input shows up on a -particular file descriptor. - - -\begin{comment} -* Object Sets:: -* The SERVE-EVENT Function:: -* Using SERVE-EVENT with Unix File Descriptors:: -* Using SERVE-EVENT with the CLX Interface to X:: -* A SERVE-EVENT Example:: -\end{comment} - -%%\node Object Sets, The SERVE-EVENT Function, Event Dispatching with SERVE-EVENT, Event Dispatching with SERVE-EVENT -\section{Object Sets} -\label{object-sets} -\cindex{object sets} -An \i{object set} is a collection of objects that have the same implementation -for each operation. Externally the object is represented by the object -capability and the operation is represented by the operation code. Within -Lisp, the object is represented by an arbitrary Lisp object, and the -implementation for the operation is represented by an arbitrary Lisp function. -The object set mechanism maintains this translation from the external to the -internal representation. - -\begin{defun}{system:}{make-object-set}{% - \args{\var{name} \ampoptional{} \var{default-handler}}} - - This function makes a new object set. \var{Name} is a string used - only for purposes of identifying the object set when it is printed. 
- \var{Default-handler} is the function used as a handler when an - undefined operation occurs on an object in the set. You can define - operations with the \code{serve-}\var{operation} functions exported - from the \code{extensions} package for X events - (\pxlref{x-serve-mumbles}). Objects are added with - \code{system:add-xwindow-object}. Initially the object set has no - objects and no defined operations. -\end{defun} - -\begin{defun}{system:}{object-set-operation}{% - \args{\var{object-set} \var{operation-code}}} - - This function returns the handler function that is the - implementation of the operation corresponding to - \var{operation-code} in \var{object-set}. When set with - \code{setf}, the setter function establishes the new handler. The - \code{serve-}\var{operation} functions exported from the - \code{extensions} package for X events (\pxlref{x-serve-mumbles}) - call this on behalf of the user when announcing a new operation for - an object set. -\end{defun} - -\begin{defun}{system:}{add-xwindow-object}{% - \args{\var{window} \var{object} \var{object-set}}} - - These functions add \var{port} or \var{window} to \var{object-set}. - \var{Object} is an arbitrary Lisp object that is associated with the - \var{port} or \var{window} capability. \var{Window} is a CLX - window. When an event occurs, \code{system:serve-event} passes - \var{object} as an argument to the handler function. -\end{defun} - - -%%\node The SERVE-EVENT Function, Using SERVE-EVENT with Unix File Descriptors, Object Sets, Event Dispatching with SERVE-EVENT -\section{The SERVE-EVENT Function} - -The \code{system:serve-event} function is the standard way for an application -to wait for something to happen. For example, the Lisp system calls -\code{system:serve-event} when it wants input from X or a terminal stream. -The idea behind \code{system:serve-event} is that it knows the appropriate -action to take when any interesting event happens. 
If an application calls -\code{system:serve-event} when it is idle, then any other applications with -pending events can run. This allows several applications to run ``at the -same time'' without interference, even though there is only one thread of -control. Note that if an application is waiting for input of any kind, -then other applications will get events. - -\begin{defun}{system:}{serve-event}{\args{\ampoptional{} \var{timeout}}} - - This function waits for an event to happen and then dispatches to - the correct handler function. If specified, \var{timeout} is the - number of seconds to wait before timing out. A time out of zero - seconds is legal and causes \code{system:serve-event} to poll for - any events immediately available for processing. - \code{system:serve-event} returns \true{} if it serviced at least - one event, and \nil{} otherwise. Depending on the application, when - \code{system:serve-event} returns \true, you might want to call it - repeatedly with a timeout of zero until it returns \nil. - - If input is available on any designated file descriptor, then this - calls the appropriate handler function supplied by - \code{system:add-fd-handler}. - - Since events for many different applications may arrive - simultaneously, an application waiting for a specific event must - loop on \code{system:serve-event} until the desired event happens. - Since programs such as \hemlock{} call \code{system:serve-event} for - input, applications usually do not need to call - \code{system:serve-event} at all; \hemlock{} allows other - application's handlers to run when it goes into an input wait. -\end{defun} - -\begin{defun}{system:}{serve-all-events}{\args{\ampoptional{} \var{timeout}}} - - This function is similar to \code{system:serve-event}, except it - serves all the pending events rather than just one. It returns - \true{} if it serviced at least one event, and \nil{} otherwise. 
-\end{defun} - - -%%\node Using SERVE-EVENT with Unix File Descriptors, Using SERVE-EVENT with the CLX Interface to X, The SERVE-EVENT Function, Event Dispatching with SERVE-EVENT -\section{Using SERVE-EVENT with Unix File Descriptors} -Object sets are not available for use with file descriptors, as there are -only two operations possible on file descriptors: input and output. -Instead, a handler for either input or output can be registered with -\code{system:serve-event} for a specific file descriptor. Whenever any input -shows up, or output is possible on this file descriptor, the function -associated with the handler for that descriptor is funcalled with the -descriptor as its single argument. - -\begin{defun}{system:}{add-fd-handler}{% - \args{\var{fd} \var{direction} \var{function}}} - - This function installs and returns a new handler for the file - descriptor \var{fd}. \var{direction} can be either \kwd{input} if - the system should invoke the handler when input is available or - \kwd{output} if the system should invoke the handler when output is - possible. This returns a unique object representing the handler, - and this is a suitable argument for \code{system:remove-fd-handler}. - \var{function} must take one argument, the file descriptor. -\end{defun} - -\begin{defun}{system:}{remove-fd-handler}{\args{\var{handler}}} - - This function removes \var{handler}, which \code{add-fd-handler} must - have previously returned. -\end{defun} - -\begin{defmac}{system:}{with-fd-handler}{% - \args{(\var{direction} \var{fd} \var{function}) - \mstar{\var{form}}}} - - This macro executes the supplied forms with a handler installed - using \var{fd}, \var{direction}, and \var{function}. See - \code{system:add-fd-handler}. 
-\end{defmac} - -\begin{defun}{system:}{wait-until-fd-usable}{% - \args{\var{direction} \var{fd} \ampoptional{} \var{timeout}}} - - This function waits for up to \var{timeout} seconds for \var{fd} to - become usable for \var{direction} (either \kwd{input} or - \kwd{output}). If \var{timeout} is \nil{} or unspecified, this - waits forever. -\end{defun} - -\begin{defun}{system:}{invalidate-descriptor}{\args{\var{fd}}} - - This function removes all handlers associated with \var{fd}. This - should only be used in drastic cases (such as I/O errors, but not - necessarily EOF). Normally, you should use \code{remove-fd-handler} - to remove the specific handler. -\end{defun} - -\begin{comment} - -section{Using SERVE-EVENT with Matchmaker Interfaces} -\label{ipc-serve-mumbles} -Remember from section \ref{object-sets}, an object set is a collection of -objects, ports in this case, with some set of operations, message ID's, with -corresponding implementations, the same handler functions. - -Matchmaker uses the object set operations to implement servers. For -each server interface \i{XXX}, Matchmaker defines a function, -\code{serve-}\i{XXX}, of two arguments, an object set and a function. -The \code{serve-}\i{XXX} function establishes the function as the -implementation of the \i{XXX} operation in the object set. Recall -from section \ref{object-sets}, \code{system:add-port-object} -associates some Lisp object with a port in an object set. When -\code{system:serve-event} notices activity on a port, it calls the -function given to \code{serve-}\i{XXX} with the object given to -\code{system:add-port-object} and the input parameters specified in -the message definition. The return values from the function are used -as the output parameters for the message, if any. -\code{serve-}\i{XXX} functions are also generated for each \i{server - message} and asynchronous user interface. - -To use a Lisp server: -\begin{itemize} - -\item Create an object set. 
- -\item Define some operations on it using the \code{serve-}\i{XXX} - functions. - -\item Create an object for every port on which you receive requests. - -\item Call \code{system:serve-event} to service an RPC request. -\end{itemize} - - -Object sets allow many servers in the same Lisp to operate without knowing -about each other. There can be multiple implementations of the same interface -with different operation handlers established in distinct object sets. This -property is especially useful when handling emergency messages. - -\end{comment} - -%%\node Using SERVE-EVENT with the CLX Interface to X, A SERVE-EVENT Example, Using SERVE-EVENT with Unix File Descriptors, Event Dispatching with SERVE-EVENT -\section{Using SERVE-EVENT with the CLX Interface to X} -\label{x-serve-mumbles} -Remember from section \ref{object-sets}, an object set is a collection of -objects, CLX windows in this case, with some set of operations, event keywords, -with corresponding implementations, the same handler functions. Since X allows -multiple display connections from a given process, you can avoid using object -sets if every window in an application or display connection behaves the same. -If a particular X application on a single display connection has windows that -want to handle certain events differently, then using object sets is a -convenient way to organize this since you need some way to map the window/event -combination to the appropriate functionality. - -The following is a discussion of functions exported from the \code{extensions} -package that facilitate handling CLX events through \code{system:serve-event}. -The first two routines are useful regardless of whether you use -\code{system:serve-event}: -\begin{defun}{ext:}{open-clx-display}{% - \args{\ampoptional{} \var{string}}} - - This function parses \var{string} for an X display specification - including display and screen numbers. 
\var{String} defaults to the - following: - \begin{example} - (cdr (assoc :display ext:*environment-list* :test #'eq)) - \end{example} - If any field in the display specification is missing, this signals - an error. \code{ext:open-clx-display} returns the CLX display and - screen. -\end{defun} - -\begin{defun}{ext:}{flush-display-events}{\args{\var{display}}} - - This function flushes all the events in \var{display}'s event queue - including the current event, in case the user calls this from within - an event handler. -\end{defun} - - -\begin{comment} -* Without Object Sets:: -* With Object Sets:: -\end{comment} - -%%\node Without Object Sets, With Object Sets, Using SERVE-EVENT with the CLX Interface to X, Using SERVE-EVENT with the CLX Interface to X -\subsection{Without Object Sets} -Since most applications that use CLX can avoid the complexity of object sets, -these routines are described in a separate section. The routines described in -the next section that use the object set mechanism are based on these -interfaces. - -\begin{defun}{ext:}{enable-clx-event-handling}{% - \args{\var{display} \var{handler}}} - - This function causes \code{system:serve-event} to notice when there - is input on \var{display}'s connection to the X11 server. When this - happens, \code{system:serve-event} invokes \var{handler} on - \var{display} in a dynamic context with an error handler bound that - flushes all events from \var{display} and returns. By returning, - the error handler declines to handle the error, but it will have - cleared all events; thus, entering the debugger will not result in - infinite errors due to streams that wait via - \code{system:serve-event} for input. Calling this repeatedly on the - same \var{display} establishes \var{handler} as a new handler, - replacing any previous one for \var{display}.
-\end{defun} - -\begin{defun}{ext:}{disable-clx-event-handling}{\args{\var{display}}} - - This function undoes the effect of - \code{ext:enable-clx-event-handling}. -\end{defun} - -\begin{defmac}{ext:}{with-clx-event-handling}{% - \args{(\var{display} \var{handler}) \mstar{form}}} - - This macro evaluates each \var{form} in a context where - \code{system:serve-event} invokes \var{handler} on \var{display} - whenever there is input on \var{display}'s connection to the X - server. This destroys any previously established handler for - \var{display}. -\end{defmac} - - -%%\node With Object Sets, , Without Object Sets, Using SERVE-EVENT with the CLX Interface to X -\subsection{With Object Sets} -This section discusses the use of object sets and -\code{system:serve-event} to handle CLX events. This is necessary -when a single X application has distinct windows that want to handle -the same events in different ways. Basically, you need some way of -asking for a given window which way you want to handle some event -because this event is handled differently depending on the window. -Object sets provide this feature. - -For each CLX event-key symbol-name \i{XXX} (for example, -\var{key-press}), there is a function \code{serve-}\i{XXX} of two -arguments, an object set and a function. The \code{serve-}\i{XXX} -function establishes the function as the handler for the \kwd{XXX} -event in the object set. Recall from section \ref{object-sets}, -\code{system:add-xwindow-object} associates some Lisp object with a -CLX window in an object set. When \code{system:serve-event} notices -activity on a window, it calls the function given to -\code{ext:enable-clx-event-handling}. If this function is -\code{ext:object-set-event-handler}, it calls the function given to -\code{serve-}\i{XXX}, passing the object given to -\code{system:add-xwindow-object} and the event's slots as well as a -couple other arguments described below. 
- -To use object sets in this way: -\begin{itemize} - -\item Create an object set. - -\item Define some operations on it using the \code{serve-}\i{XXX} - functions. - -\item Add an object for every window on which you receive requests. - This can be the CLX window itself or some structure more meaningful - to your application. - -\item Call \code{system:serve-event} to service an X event. -\end{itemize} - - -\begin{defun}{ext:}{object-set-event-handler}{% - \args{\var{display}}} - - This function is a suitable argument to - \code{ext:enable-clx-event-handling}. The actual event handlers - defined for particular events within a given object set must take an - argument for every slot in the appropriate event. In addition to - the event slots, \code{ext:object-set-event-handler} passes the - following arguments: - \begin{itemize} - \item The object, as established by - \code{system:add-xwindow-object}, on which the event occurred. - \item event-key, see \code{xlib:event-case}. - \item send-event-p, see \code{xlib:event-case}. - \end{itemize} - - Describing any \code{ext:serve-}\var{event-key-name} function, where - \var{event-key-name} is an event-key symbol-name (for example, - \code{ext:serve-key-press}), indicates exactly what all the - arguments are in their correct order. - -%% \begin{comment} -%% \code{ext:object-set-event-handler} ignores \kwd{no-exposure} -%% events on pixmaps, issuing a warning if one occurs. It is only -%% prepared to dispatch events for windows. -%% \end{comment} - - When creating an object set for use with - \code{ext:object-set-event-handler}, specify - \code{ext:default-clx-event-handler} as the default handler for - events in that object set. If no default handler is specified, and - the system invokes the default default handler, it will cause an - error since this function takes arguments suitable for handling port - messages. 
-\end{defun} - - -%%\node A SERVE-EVENT Example, , Using SERVE-EVENT with the CLX Interface to X, Event Dispatching with SERVE-EVENT -\section{A SERVE-EVENT Example} -This section contains two examples using \code{system:serve-event}. The first -one does not use object sets, and the second, slightly more complicated one -does. - - -\begin{comment} -* Without Object Sets Example:: -* With Object Sets Example:: -\end{comment} - -%%\node Without Object Sets Example, With Object Sets Example, A SERVE-EVENT Example, A SERVE-EVENT Example -\subsection{Without Object Sets Example} -This example defines an input handler for a CLX display connection. It only -recognizes \kwd{key-press} events. The body of the example loops over -\code{system:serve-event} to get input. - -\begin{lisp} -(in-package "SERVER-EXAMPLE") - -(defun my-input-handler (display) - (xlib:event-case (display :timeout 0) - (:key-press (event-window code state) - (format t "KEY-PRESSED (Window = ~D) = ~S.~%" - (xlib:window-id event-window) - ;; See Hemlock Command Implementor's Manual for convenient - ;; input mapping function. - (ext:translate-character display code state)) - ;; Make XLIB:EVENT-CASE discard the event. - t))) -\end{lisp} -\begin{lisp} -(defun server-example () - "An example of using the SYSTEM:SERVE-EVENT function and object sets to - handle CLX events." - (let* ((display (ext:open-clx-display)) - (screen (display-default-screen display)) - (black (screen-black-pixel screen)) - (white (screen-white-pixel screen)) - (window (create-window :parent (screen-root screen) - :x 0 :y 0 :width 200 :height 200 - :background white :border black - :border-width 2 - :event-mask - (xlib:make-event-mask :key-press)))) - ;; Wrap code in UNWIND-PROTECT, so we clean up after ourselves. - (unwind-protect - (progn - ;; Enable event handling on the display. - (ext:enable-clx-event-handling display #'my-input-handler) - ;; Map the windows to the screen. 
- (map-window window) - ;; Make sure we send all our requests. - (display-force-output display) - ;; Call serve-event for 100,000 events or immediate timeouts. - (dotimes (i 100000) (system:serve-event))) - ;; Disable event handling on this display. - (ext:disable-clx-event-handling display) - ;; Get rid of the window. - (destroy-window window) - ;; Pick off any events the X server has already queued for our - ;; windows, so we don't choke since SYSTEM:SERVE-EVENT is no longer - ;; prepared to handle events for us. - (loop - (unless (deleting-window-drop-event display window) - (return))) - ;; Close the display. - (xlib:close-display display)))) - -(defun deleting-window-drop-event (display win) - "Check for any events on win. If there is one, remove it from the - event queue and return t; otherwise, return nil." - (xlib:display-finish-output display) - (let ((result nil)) - (xlib:process-event - display :timeout 0 - :handler #'(lambda (&key event-window &allow-other-keys) - (if (eq event-window win) - (setf result t) - nil))) - result)) -\end{lisp} - - -%%\node With Object Sets Example, , Without Object Sets Example, A SERVE-EVENT Example -\subsection{With Object Sets Example} -This example involves more work, but you get a little more for your effort. It -defines two objects, \code{input-box} and \code{slider}, and establishes a -\kwd{key-press} handler for each object, \code{key-pressed} and -\code{slider-pressed}. We have two object sets because we handle events on the -windows manifesting these objects differently, but the events come over the -same display connection. - -\begin{lisp} -(in-package "SERVER-EXAMPLE") - -(defstruct (input-box (:print-function print-input-box) - (:constructor make-input-box (display window))) - "Our program knows about input-boxes, and it doesn't care how they - are implemented." - display ; The CLX display on which my input-box is displayed. - window) ; The CLX window in which the user types.
-;;; -(defun print-input-box (object stream n) - (declare (ignore n)) - (format stream "#<Input-Box ~S>" (input-box-display object))) - -(defvar *input-box-windows* - (system:make-object-set "Input Box Windows" - #'ext:default-clx-event-handler)) - -(defun key-pressed (input-box event-key event-window root child - same-screen-p x y root-x root-y modifiers time - key-code send-event-p) - "This is our :key-press event handler." - (declare (ignore event-key root child same-screen-p x y - root-x root-y time send-event-p)) - (format t "KEY-PRESSED (Window = ~D) = ~S.~%" - (xlib:window-id event-window) - ;; See Hemlock Command Implementor's Manual for convenient - ;; input mapping function. - (ext:translate-character (input-box-display input-box) - key-code modifiers))) -;;; -(ext:serve-key-press *input-box-windows* #'key-pressed) -\end{lisp} -\begin{lisp} -(defstruct (slider (:print-function print-slider) - (:include input-box) - (:constructor %make-slider - (display window bits-per-value max))) - "Our program knows about sliders too, and these provide input values - zero to max." - bits-per-value ; bits per discrete value up to max. - max) ; End value for slider. -;;; -(defun print-slider (object stream n) - (declare (ignore n)) - (format stream "#<Slider ~S 0..~D>" - (input-box-display object) - (1- (slider-max object)))) -;;; -(defun make-slider (display window max) - (%make-slider display window - (truncate (xlib:drawable-width window) max) - max)) - -(defvar *slider-windows* - (system:make-object-set "Slider Windows" - #'ext:default-clx-event-handler)) - -(defun slider-pressed (slider event-key event-window root child - same-screen-p x y root-x root-y modifiers time - key-code send-event-p) - "This is our :key-press event handler for sliders. Probably this is - a mouse thing, but for simplicity here we take a character typed."
- (declare (ignore event-key root child same-screen-p x y - root-x root-y time send-event-p)) - (format t "KEY-PRESSED (Window = ~D) = ~S --> ~D.~%" - (xlib:window-id event-window) - ;; See Hemlock Command Implementor's Manual for convenient - ;; input mapping function. - (ext:translate-character (input-box-display slider) - key-code modifiers) - (truncate x (slider-bits-per-value slider)))) -;;; -(ext:serve-key-press *slider-windows* #'slider-pressed) -\end{lisp} -\begin{lisp} -(defun server-example () - "An example of using the SYSTEM:SERVE-EVENT function and object sets to - handle CLX events." - (let* ((display (ext:open-clx-display)) - (screen (display-default-screen display)) - (black (screen-black-pixel screen)) - (white (screen-white-pixel screen)) - (iwindow (create-window :parent (screen-root screen) - :x 0 :y 0 :width 200 :height 200 - :background white :border black - :border-width 2 - :event-mask - (xlib:make-event-mask :key-press))) - (swindow (create-window :parent (screen-root screen) - :x 0 :y 300 :width 200 :height 50 - :background white :border black - :border-width 2 - :event-mask - (xlib:make-event-mask :key-press))) - (input-box (make-input-box display iwindow)) - (slider (make-slider display swindow 15))) - ;; Wrap code in UNWIND-PROTECT, so we clean up after ourselves. - (unwind-protect - (progn - ;; Enable event handling on the display. - (ext:enable-clx-event-handling display - #'ext:object-set-event-handler) - ;; Add the windows to the appropriate object sets. - (system:add-xwindow-object iwindow input-box - *input-box-windows*) - (system:add-xwindow-object swindow slider - *slider-windows*) - ;; Map the windows to the screen. - (map-window iwindow) - (map-window swindow) - ;; Make sure we send all our requests. - (display-force-output display) - ;; Call server for 100,000 events or immediate timeouts. - (dotimes (i 100000) (system:serve-event))) - ;; Disable event handling on this display. 
- (ext:disable-clx-event-handling display) - (delete-window iwindow display) - (delete-window swindow display) - ;; Close the display. - (xlib:close-display display)))) -\end{lisp} -\begin{lisp} -(defun delete-window (window display) - ;; Remove the windows from the object sets before destroying them. - (system:remove-xwindow-object window) - ;; Destroy the window. - (destroy-window window) - ;; Pick off any events the X server has already queued for our - ;; windows, so we don't choke since SYSTEM:SERVE-EVENT is no longer - ;; prepared to handle events for us. - (loop - (unless (deleting-window-drop-event display window) - (return)))) - -(defun deleting-window-drop-event (display win) - "Check for any events on win. If there is one, remove it from the - event queue and return t; otherwise, return nil." - (xlib:display-finish-output display) - (let ((result nil)) - (xlib:process-event - display :timeout 0 - :handler #'(lambda (&key event-window &allow-other-keys) - (if (eq event-window win) - (setf result t) - nil))) - result)) -\end{lisp} - -\hide{File:/afs/cs.cmu.edu/project/clisp/hackers/ram/docs/cmu-user/alien.ms} - -%%\node Alien Objects, Interprocess Communication under LISP, Event Dispatching with SERVE-EVENT, Top -\chapter{Alien Objects} -\label{aliens} -\begin{center} -\b{By Robert MacLachlan and William Lott} -\end{center} -\vspace{1 cm} - -\begin{comment} -* Introduction to Aliens:: -* Alien Types:: -* Alien Operations:: -* Alien Variables:: -* Alien Data Structure Example:: -* Loading Unix Object Files:: -* Alien Function Calls:: -* Step-by-Step Alien Example:: -\end{comment} - -%%\node Introduction to Aliens, Alien Types, Alien Objects, Alien Objects -\section{Introduction to Aliens} - -Because of Lisp's emphasis on dynamic memory allocation and garbage -collection, Lisp implementations use unconventional memory representations -for objects. 
This representation mismatch creates problems when a Lisp -program must share objects with programs written in another language. There -are three different approaches to establishing communication: -\begin{itemize} -\item The burden can be placed on the foreign program (and programmer) by -requiring the use of Lisp object representations. The main difficulty with -this approach is that either the foreign program must be written with Lisp -interaction in mind, or a substantial amount of foreign ``glue'' code must be -written to perform the translation. - -\item The Lisp system can automatically convert objects back and forth -between the Lisp and foreign representations. This is convenient, but -translation becomes prohibitively slow when large or complex data structures -must be shared. - -\item The Lisp program can directly manipulate foreign objects through the -use of extensions to the Lisp language. Most Lisp systems make use of -this approach, but the language for describing types and expressing -accesses is often not powerful enough for complex objects to be easily -manipulated. -\end{itemize} -\cmucl{} relies primarily on the automatic conversion and direct manipulation -approaches: Aliens of simple scalar types are automatically converted, -while complex types are directly manipulated in their foreign -representation. Any foreign objects that can't automatically be -converted into Lisp values are represented by objects of type -\code{alien-value}. Since Lisp is a dynamically typed language, even -foreign objects must have a run-time type; this type information is -provided by encapsulating the raw pointer to the foreign data within an -\code{alien-value} object. - -The Alien type language and operations are most similar to those of the -C language, but Aliens can also be used when communicating with most -other languages that can be linked with C. 
- -%% -%%\node Alien Types, Alien Operations, Introduction to Aliens, Alien Objects -\section{Alien Types} - -Alien types have a description language based on nested list structure. For -example: -\begin{example} -struct foo \{ - int a; - struct foo *b[100]; -\}; -\end{example} -has the corresponding Alien type: -\begin{lisp} -(struct foo - (a int) - (b (array (* (struct foo)) 100))) -\end{lisp} - - -\begin{comment} -* Defining Alien Types:: -* Alien Types and Lisp Types:: -* Alien Type Specifiers:: -* The C-Call Package:: -\end{comment} - -%%\node Defining Alien Types, Alien Types and Lisp Types, Alien Types, Alien Types -\subsection{Defining Alien Types} - -Types may be either named or anonymous. With structure and union -types, the name is part of the type specifier, allowing recursively -defined types such as: -\begin{lisp} -(struct foo (a (* (struct foo)))) -\end{lisp} -An anonymous structure or union type is specified by using the name -\nil. The \funref{with-alien} macro defines a local scope which -``captures'' any named type definitions. Other types are not -inherently named, but can be given named abbreviations using -\code{def-alien-type}. - -\begin{defmac}{alien:}{def-alien-type}{name type} - - This macro globally defines \var{name} as a shorthand for the Alien - type \var{type}. When introducing global structure and union type - definitions, \var{name} may be \nil, in which case the name to - define is taken from the type's name. -\end{defmac} - - -%%\node Alien Types and Lisp Types, Alien Type Specifiers, Defining Alien Types, Alien Types -\subsection{Alien Types and Lisp Types} - -The Alien types form a subsystem of the \cmucl{} type system. An -\code{alien} type specifier provides a way to use any Alien type as a -Lisp type specifier. For example -\begin{lisp} -(typep foo '(alien (* int))) -\end{lisp} -can be used to determine whether \code{foo} is a pointer to an -\code{int}. 
\code{alien} type specifiers can be used in the same ways -as ordinary type specifiers (like \code{string}.) Alien type -declarations are subject to the same precise type checking as any -other declaration (section \xlref{precise-type-checks}.) - -Note that the Alien type system overlaps with normal Lisp type -specifiers in some cases. For example, the type specifier -\code{(alien single-float)} is identical to \code{single-float}, since -Alien floats are automatically converted to Lisp floats. When -\code{type-of} is called on an Alien value that is not automatically -converted to a Lisp value, then it will return an \code{alien} type -specifier. - -%%\node Alien Type Specifiers, The C-Call Package, Alien Types and Lisp Types, Alien Types -\subsection{Alien Type Specifiers} - -Some Alien type names are \clisp symbols, but the names are -still exported from the \code{alien} package, so it is legal to say -\code{alien:single-float}. These are the basic Alien type specifiers: - -\begin{deftp}{Alien type}{*}{% - \args{\var{type}}} - - A pointer to an object of the specified \var{type}. If \var{type} - is \true, then it means a pointer to anything, similar to - ``\code{void *}'' in ANSI C. Currently, the only way to detect a - null pointer is: -\begin{lisp} - (zerop (sap-int (alien-sap \var{ptr}))) -\end{lisp} -\xlref{system-area-pointers} -\end{deftp} - -\begin{deftp}{Alien type}{array}{\var{type} \mstar{\var{dimension}}} - - An array of the specified \var{dimensions}, holding elements of type - \var{type}. Note that \code{(* int)} and \code{(array int)} are - considered to be different types when type checking is done; pointer - and array types must be explicitly coerced using \code{cast}. - - Arrays are accessed using \code{deref}, passing the indices as - additional arguments. Elements are stored in row-major order (as - in C), so the first dimension determines only the size of the memory - block, and not the layout of the higher dimensions.
An array whose - first dimension is variable may be specified by using \nil{} as the - first dimension. Fixed-size arrays can be allocated as array - elements, structure slots or \code{with-alien} variables. Dynamic - arrays can only be allocated using \funref{make-alien}. -\end{deftp} - -\begin{deftp}{Alien type}{struct}{\var{name} - \mstar{(\var{field} \var{type} \mopt{\var{bits}})}} - - A structure type with the specified \var{name} and \var{fields}. - Fields are allocated at the same positions used by the - implementation's C compiler. \var{bits} is intended for C-like bit - field support, but is currently unused. If \var{name} is \false, - then the type is anonymous. - - If a named Alien \code{struct} specifier is passed to - \funref{def-alien-type} or \funref{with-alien}, then this defines, - respectively, a new global or local Alien structure type. If no - \var{fields} are specified, then the fields are taken from the - current (local or global) Alien structure type definition of - \var{name}. -\end{deftp} - -\begin{deftp}{Alien type}{union}{\var{name} - \mstar{(\var{field} \var{type} \mopt{\var{bits}})}} - - Similar to \code{struct}, but defines a union type. All fields are - allocated at the same offset, and the size of the union is the size - of the largest field. The programmer must determine which field is - active from context. -\end{deftp} - -\begin{deftp}{Alien type}{enum}{\var{name} \mstar{\var{spec}}} - - An enumeration type that maps between integer values and keywords. - If \var{name} is \false, then the type is anonymous. Each - \var{spec} is either a keyword, or a list \code{(\var{keyword} - \var{value})}. If \var{value} is not supplied, then it defaults - to one greater than the value for the preceding spec (or to zero if - it is the first spec.) -\end{deftp} - -\begin{deftp}{Alien type}{signed}{\mopt{\var{bits}}} - A signed integer with the specified number of bits of precision.
The - upper limit on integer precision is determined by the machine's word - size. If no size is specified, the maximum size will be used. -\end{deftp} - -\begin{deftp}{Alien type}{integer}{\mopt{\var{bits}}} - Identical to \code{signed}---the distinction between \code{signed} - and \code{integer} is purely stylistic. -\end{deftp} - -\begin{deftp}{Alien type}{unsigned}{\mopt{\var{bits}}} - Like \code{signed}, but specifies an unsigned integer. -\end{deftp} - -\begin{deftp}{Alien type}{boolean}{\mopt{\var{bits}}} - Similar to an enumeration type that maps \code{0} to \false{} and - all other values to \true. \var{bits} determines the amount of - storage allocated to hold the truth value. -\end{deftp} - -\begin{deftp}{Alien type}{single-float}{} - A floating-point number in IEEE single format. -\end{deftp} - -\begin{deftp}{Alien type}{double-float}{} - A floating-point number in IEEE double format. -\end{deftp} - -\begin{deftp}{Alien type}{function}{\var{result-type} \mstar{\var{arg-type}}} - \label{alien-function-types} - An Alien function that takes arguments of the specified - \var{arg-types} and returns a result of type \var{result-type}. - Note that the only context where a \code{function} type is directly - specified is in the argument to \code{alien-funcall} (see section - \funref{alien-funcall}.) In all other contexts, functions are - represented by function pointer types: \code{(* (function ...))}. -\end{deftp} - -\begin{deftp}{Alien type}{system-area-pointer}{} - A pointer which is represented in Lisp as a - \code{system-area-pointer} object (\pxlref{system-area-pointers}.) -\end{deftp} - -%%\node The C-Call Package, , Alien Type Specifiers, Alien Types -\subsection{The C-Call Package} - -The \code{c-call} package exports these type-equivalents to the C type -of the same name: \code{char}, \code{short}, \code{int}, \code{long}, -\code{unsigned-char}, \code{unsigned-short}, \code{unsigned-int}, -\code{unsigned-long}, \code{float}, \code{double}.
\code{c-call} also -exports these types: - -\begin{deftp}{Alien type}{void}{} - This type is used in function types to declare that no useful value - is returned. Evaluation of an \code{alien-funcall} form will return - zero values. -\end{deftp} - -\begin{deftp}{Alien type}{c-string}{} - This type is similar to \code{(* char)}, but is interpreted as a - null-terminated string, and is automatically converted into a Lisp - string when accessed. If the pointer is C \code{NULL} (or 0), then - accessing gives Lisp \false. - - Assigning a Lisp string to a \code{c-string} structure field or - variable stores the contents of the string to the memory already - pointed to by that variable. When an Alien of type \code{(* char)} - is assigned to a \code{c-string}, then the \code{c-string} pointer - is assigned to. This allows \code{c-string} pointers to be - initialized. For example: -\begin{lisp} - (def-alien-type nil (struct foo (str c-string))) - - (defun make-foo (str) (let ((my-foo (make-alien (struct foo)))) - (setf (slot my-foo 'str) (make-alien char (length str))) (setf (slot - my-foo 'str) str) my-foo)) -\end{lisp} -Storing Lisp \false{} writes C \code{NULL} to the \code{c-string} -pointer. -\end{deftp} - -%% -%%\node Alien Operations, Alien Variables, Alien Types, Alien Objects -\section{Alien Operations} - -This section describes the basic operations on Alien values. - -\begin{comment} -* Alien Access Operations:: -* Alien Coercion Operations:: -* Alien Dynamic Allocation:: -\end{comment} - -%%\node Alien Access Operations, Alien Coercion Operations, Alien Operations, Alien Operations -\subsection{Alien Access Operations} - -\begin{defun}{alien:}{deref}{\args{\var{pointer-or-array} \amprest \var{indices}}} - - This function returns the value pointed to by an Alien pointer or - the value of an Alien array element. 
If a pointer, an optional - single index can be specified to give the equivalent of C pointer - arithmetic; this index is scaled by the size of the type pointed to. - If an array, the number of indices must be the same as the number of - dimensions in the array type. \code{deref} can be set with - \code{setf} to assign a new value. -\end{defun} - -\begin{defun}{alien:}{slot}{\args{\var{struct-or-union} \var{slot-name}}} - - This function extracts the value of slot \var{slot-name} from an - Alien \code{struct} or \code{union}. If \var{struct-or-union} is a - pointer to a structure or union, then it is automatically - dereferenced. This can be set with \code{setf} to assign a new - value. Note that \var{slot-name} is evaluated, and need not be a - compile-time constant (but only constant slot accesses are - efficiently compiled.) -\end{defun} - -%%\node Alien Coercion Operations, Alien Dynamic Allocation, Alien Access Operations, Alien Operations -\subsection{Alien Coercion Operations} - -\begin{defmac}{alien:}{addr}{\var{alien-expr}} - - This macro returns a pointer to the location specified by - \var{alien-expr}, which must be either an Alien variable, a use of - \code{deref}, a use of \code{slot}, or a use of - \funref{extern-alien}. -\end{defmac} - -\begin{defmac}{alien:}{cast}{\var{alien} \var{new-type}} - - This macro converts \var{alien} to a new Alien with the specified - \var{new-type}. Both types must be an Alien pointer, array or - function type. Note that the result is not \code{eq} to the - argument, but does refer to the same data bits. -\end{defmac} - -\begin{defmac}{alien:}{sap-alien}{\var{sap} \var{type}} - \defunx[alien:]{alien-sap}{\var{alien-value}} - - \code{sap-alien} converts \var{sap} (a system area pointer - \pxlref{system-area-pointers}) to an Alien value with the specified - \var{type}. \var{type} is not evaluated. - -\code{alien-sap} returns the SAP which points to \var{alien-value}'s -data.
- -The \var{type} to \code{sap-alien} and the type of the \var{alien-value} to -\code{alien-sap} must be some Alien pointer, array or record type. -\end{defmac} - -%%\node Alien Dynamic Allocation, , Alien Coercion Operations, Alien Operations -\subsection{Alien Dynamic Allocation} - -Dynamic Aliens are allocated using the \code{malloc} library, so foreign code -can call \code{free} on the result of \code{make-alien}, and Lisp code can -call \code{free-alien} on objects allocated by foreign code. - -\begin{defmac}{alien:}{make-alien}{\var{type} \mopt{\var{size}}} - - This macro returns a dynamically allocated Alien of the specified - \var{type} (which is not evaluated.) The allocated memory is not - initialized, and may contain arbitrary junk. If supplied, - \var{size} is an expression to evaluate to compute the size of the - allocated object. There are two major cases: - \begin{itemize} - \item When \var{type} is an array type, an array of that type is - allocated and a \var{pointer} to it is returned. Note that you - must use \code{deref} to change the result to an array before you - can use \code{deref} to read or write elements: - \begin{lisp} - (defvar *foo* (make-alien (array char 10))) - - (type-of *foo*) \result{} (alien (* (array (signed 8) 10))) - - (setf (deref (deref *foo*) 0) 10) \result{} 10 - \end{lisp} - If supplied, \var{size} is used as the first dimension for the - array. - - \item When \var{type} is any other type, then an object for - that type is allocated, and a \var{pointer} to it is returned. So - \code{(make-alien int)} returns a \code{(* int)}. If \var{size} - is specified, then a block of that many objects is allocated, with - the result pointing to the first one. - \end{itemize} -\end{defmac} - -\begin{defun}{alien:}{free-alien}{\var{alien}} - - This function frees the storage for \var{alien} (which must have - been allocated with \code{make-alien} or \code{malloc}.)
-\end{defun} - -See also \funref{with-alien}, which stack-allocates Aliens. - -%% -%%\node Alien Variables, Alien Data Structure Example, Alien Operations, Alien Objects -\section{Alien Variables} - -Both local (stack allocated) and external (C global) Alien variables are -supported. - -\begin{comment} -* Local Alien Variables:: -* External Alien Variables:: -\end{comment} - -%%\node Local Alien Variables, External Alien Variables, Alien Variables, Alien Variables -\subsection{Local Alien Variables} - -\begin{defmac}{alien:}{with-alien}{\mstar{(\var{name} \var{type} - \mopt{\var{initial-value}})} \mstar{form}} - - This macro establishes local alien variables with the specified - Alien types and names for the dynamic extent of the body. The variable - \var{names} are established as symbol-macros; the bindings have - lexical scope, and may be assigned with \code{setq} or \code{setf}. - This form is analogous to defining a local variable in C: additional - storage is allocated, and the initial value is copied. - - \code{with-alien} also establishes a new scope for named structures - and unions. Any \var{type} specified for a variable may contain - named structure or union types with the slots specified. Within the - lexical scope of the binding specifiers and body, a locally defined - structure type \var{foo} can be referenced by its name using: -\begin{lisp} - (struct foo) -\end{lisp} -\end{defmac} - -%%\node External Alien Variables, , Local Alien Variables, Alien Variables -\subsection{External Alien Variables} -\label{external-aliens} - -External Alien names are strings, and Lisp names are symbols. When an -external Alien is represented using a Lisp variable, there must be a -way to convert from one name syntax into the other. The macros -\code{extern-alien}, \code{def-alien-variable} and -\funref{def-alien-routine} use this conversion heuristic: -\begin{itemize} -\item Alien names are converted to Lisp names by uppercasing and - replacing underscores with hyphens.
- -\item Conversely, Lisp names are converted to Alien names by - lowercasing and replacing hyphens with underscores. - -\item Both the Lisp symbol and Alien string names may be separately - specified by using a list of the form: -\begin{lisp} - (\var{alien-string} \var{lisp-symbol}) -\end{lisp} -\end{itemize} - -\begin{defmac}{alien:}{def-alien-variable}{\var{name} \var{type}} - - This macro defines \var{name} as an external Alien variable of the - specified Alien \var{type}. \var{name} and \var{type} are not - evaluated. The Lisp name of the variable (see above) becomes a - global Alien variable in the Lisp namespace. Global Alien variables - are effectively ``global symbol macros''; a reference to the - variable fetches the contents of the external variable. Similarly, - setting the variable stores new contents---the new contents must be - of the declared \var{type}. - - For example, it is often necessary to read the global C variable - \code{errno} to determine why a particular function call failed. It - is possible to define errno and make it accessible from Lisp by the - following: -\begin{lisp} -(def-alien-variable "errno" int) - -;; Now it is possible to get the value of the C variable errno simply by -;; referencing that Lisp variable: -;; -(print errno) -\end{lisp} -\end{defmac} - -\begin{defmac}{alien:}{extern-alien}{\var{name} \var{type}} - - This macro returns an Alien with the specified \var{type} which - points to an externally defined value. \var{name} is not evaluated, - and may be specified either as a string or a symbol. \var{type} is - an unevaluated Alien type specifier. -\end{defmac} - -%% -%%\node Alien Data Structure Example, Loading Unix Object Files, Alien Variables, Alien Objects -\section{Alien Data Structure Example} - -Now that we have Alien types, operations and variables, we can manipulate -foreign data structures. 
This C declaration can be translated into the -following Alien type: -\begin{lisp} -struct foo \{ - int a; - struct foo *b[100]; -\}; - - \myequiv - -(def-alien-type nil - (struct foo - (a int) - (b (array (* (struct foo)) 100)))) -\end{lisp} - -With this definition, the following C expression can be translated in this way: -\begin{example} -struct foo f; -f.b[7]->a - - \myequiv - -(with-alien ((f (struct foo))) - (slot (deref (slot f 'b) 7) 'a) - ;; - ;; Do something with f... - ) -\end{example} - - -Or consider this example of an external C variable and some accesses: -\begin{example} -struct c_struct \{ - short x, y; - char a, b; - int z; - struct c_struct *n; -\}; - -extern struct c_struct *my_struct; - -my_struct->x++; -my_struct->a = 5; -my_struct = my_struct->n; -\end{example} -which can be manipulated in Lisp like this: -\begin{lisp} -(def-alien-type nil - (struct c-struct - (x short) - (y short) - (a char) - (b char) - (z int) - (n (* c-struct)))) - -(def-alien-variable "my_struct" (* c-struct)) - -(incf (slot my-struct 'x)) -(setf (slot my-struct 'a) 5) -(setq my-struct (slot my-struct 'n)) -\end{lisp} - - -%% -%%\node Loading Unix Object Files, Alien Function Calls, Alien Data Structure Example, Alien Objects -\section{Loading Unix Object Files} - -Foreign object files are loaded into the running Lisp process by -\code{load-foreign}. First, it runs the linker on the files and -libraries, creating an absolute Unix object file. This object file is -then loaded into the currently running Lisp. The external -symbols defining routines and variables are made available for future -external references (e.g. by \code{extern-alien}.) -\code{load-foreign} must be run before any of the defined symbols are -referenced. - -Note that if a Lisp core image is saved (using \funref{save-lisp}), all -loaded foreign code is lost when the image is restarted. 
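-
-The ordering requirement above can be sketched as follows. This is only an
-illustration, not part of the original manual: the object file
-\file{counter.o} and the C variable \code{calls} are hypothetical names, and
-the Alien type \code{int} is assumed to come from the \code{C-CALL} package.
-\begin{lisp}
-(use-package "ALIEN")
-(use-package "C-CALL")
-
-;; Link and load the (hypothetical) object file first.
-(load-foreign "counter.o")
-
-;; Only now is it safe to refer to a symbol that the file defines.
-(def-alien-variable "calls" int)
-(print calls)
-\end{lisp}
-Since loaded foreign code does not survive \code{save-lisp}, a restarted image
-would presumably need to run the \code{load-foreign} form again before
-touching \code{calls}.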
- -\begin{defun}{alien:}{load-foreign}{% - \args{\var{files} \keys{\kwd{libraries} \kwd{base-file} \kwd{env}}}} - - \var{files} is a \code{simple-string} or list of - \code{simple-string}s specifying the names of the object files. - \var{libraries} is a list of \code{simple-string}s specifying - libraries in a format that \code{ld}, the Unix linker, expects. The - default value for \var{libraries} is \code{("-lc")} (i.e., the - standard C library). \var{base-file} is the file to use for the - initial symbol table information. The default is the Lisp start up - code: \file{path:lisp}. \var{env} should be a list of simple - strings in the format of Unix environment variables (i.e., - \code{\var{A}=\var{B}}, where \var{A} is an environment variable and - \var{B} is its value). The default value for \var{env} is the - environment information available at the time Lisp was invoked. - Unless you are certain that you want to change this, you should just - use the default. -\end{defun} - -%% -%%\node Alien Function Calls, Step-by-Step Alien Example, Loading Unix Object Files, Alien Objects -\section{Alien Function Calls} - -The foreign function call interface allows a Lisp program to call functions -written in other languages. The current implementation of the foreign -function call interface assumes a C calling convention and thus routines -written in any language that adheres to this convention may be called from -Lisp. - -Lisp sets up various interrupt handling routines and other environment -information when it first starts up, and expects these to be in place at all -times. The C functions called by Lisp should either not change the -environment, especially the interrupt entry points, or should make sure -that these entry points are restored when the C function returns to Lisp. -If a C function makes changes without restoring things to the way they were -when the C function was entered, there is no telling what will happen. 
- -\begin{comment} -* alien-funcall:: The alien-funcall Primitive -* def-alien-routine:: The def-alien-routine Macro -* def-alien-routine Example:: -* Calling Lisp from C:: -\end{comment} - -%%\node alien-funcall, def-alien-routine, Alien Function Calls, Alien Function Calls -\subsection{The alien-funcall Primitive} - -\begin{defun}{alien:}{alien-funcall}{% - \args{\var{alien-function} \amprest{} \var{arguments}}} - - This function is the foreign function call primitive: - \var{alien-function} is called with the supplied \var{arguments} and - its value is returned. The \var{alien-function} is an arbitrary - run-time expression; to call a constant function, use - \funref{extern-alien} or \code{def-alien-routine}. - - The type of \var{alien-function} must be \code{(alien (function - ...))} or \code{(alien (* (function ...)))}, - \xlref{alien-function-types}. The function type is used to - determine how to call the function (as though it was declared with - a prototype.) The type need not be known at compile time, but only - known-type calls are efficiently compiled. Limitations: - \begin{itemize} - \item Structure type return values are not implemented. - \item Passing of structures by value is not implemented. - \end{itemize} -\end{defun} - -Here is an example which allocates a \code{(struct foo)}, calls a foreign -function to initialize it, then returns a Lisp vector of all the -\code{(* (struct foo))} objects filled in by the foreign call: -\begin{lisp} -;; -;; Allocate a foo on the stack. -(with-alien ((f (struct foo))) - ;; - ;; Call some C function to fill in foo fields. - (alien-funcall (extern-alien "mangle_foo" (function void (* foo))) - (addr f)) - ;; - ;; Find how many foos to use by getting the A field. 
- (let* ((num (slot f 'a)) - (result (make-array num))) - ;; - ;; Get a pointer to the array so that we don't have to keep extracting it: - (with-alien ((a (* (array (* (struct foo)) 100)) (addr (slot f 'b)))) - ;; - ;; Loop over the first N elements and stash them in the result vector. - (dotimes (i num) - (setf (svref result i) (deref (deref a) i))) - result))) -\end{lisp} - -%%\node def-alien-routine, def-alien-routine Example, alien-funcall, Alien Function Calls -\subsection{The def-alien-routine Macro} - - -\begin{defmac}{alien:}{def-alien-routine}{\var{name} \var{result-type} - \mstar{(\var{aname} \var{atype} \mopt{style})}} - - This macro is a convenience for automatically generating Lisp - interfaces to simple foreign functions. The primary feature is the - parameter style specification, which translates the C - pass-by-reference idiom into additional return values. - - \var{name} is usually a string external symbol, but may also be a - symbol Lisp name or a list of the foreign name and the Lisp name. - If only one name is specified, the other is automatically derived, - (\pxlref{external-aliens}.) - - \var{result-type} is the Alien type of the return value. Each - remaining subform specifies an argument to the foreign function. - \var{aname} is the symbol name of the argument to the constructed - function (for documentation) and \var{atype} is the Alien type of - corresponding foreign argument. The semantics of the actual call - are the same as for \funref{alien-funcall}. \var{style} should be - one of the following: - \begin{Lentry} - \item[\kwd{in}] specifies that the argument is passed by value. - This is the default. \kwd{in} arguments have no corresponding - return value from the Lisp function. - - \item[\kwd{out}] specifies a pass-by-reference output value. The - type of the argument must be a pointer to a fixed sized object - (such as an integer or pointer). \kwd{out} and \kwd{in-out} - cannot be used with pointers to arrays, records or functions. 
An - object of the correct size is allocated, and its address is passed - to the foreign function. When the function returns, the contents - of this location are returned as one of the values of the Lisp - function. - - \item[\kwd{copy}] is similar to \kwd{in}, but the argument is copied - to a pre-allocated object and a pointer to this object is passed - to the foreign routine. - - \item[\kwd{in-out}] is a combination of \kwd{copy} and \kwd{out}. - The argument is copied to a pre-allocated object and a pointer to - this object is passed to the foreign routine. On return, the - contents of this location is returned as an additional value. - \end{Lentry} - Any efficiency-critical foreign interface function should be inline - expanded by preceding \code{def-alien-routine} with: - \begin{lisp} - (declaim (inline \var{lisp-name})) - \end{lisp} - In addition to avoiding the Lisp call overhead, this allows - pointers, word-integers and floats to be passed using non-descriptor - representations, avoiding consing (\pxlref{non-descriptor}.) -\end{defmac} - -%%\node def-alien-routine Example, Calling Lisp from C, def-alien-routine, Alien Function Calls -\subsection{def-alien-routine Example} - -Consider the C function \code{cfoo} with the following calling convention: -\begin{example} -cfoo (str, a, i) - char *str; - char *a; /* update */ - int *i; /* out */ -\{ -/* Body of cfoo. */ -\} -\end{example} -which can be described by the following call to \code{def-alien-routine}: -\begin{lisp} -(def-alien-routine "cfoo" void - (str c-string) - (a char :in-out) - (i int :out)) -\end{lisp} -The Lisp function \code{cfoo} will have two arguments (\var{str} and \var{a}) -and two return values (\var{a} and \var{i}). - -%%\node Calling Lisp from C, , def-alien-routine Example, Alien Function Calls -\subsection{Calling Lisp from C} - -Calling Lisp functions from C is sometimes possible, but is rather hackish. -See \code{funcall0} ... \code{funcall3} in the \file{lisp/arch.h}. 
The -arguments must be valid CMU CL object descriptors (e.g. fixnums must be -left-shifted by 2.) See \file{compiler/generic/objdef.lisp} or the derived -file \file{lisp/internals.h} for details of the object representation. -\file{lisp/internals.h} is mechanically generated, and is not part of the -source distribution. It is distributed in the \file{docs/} directory of the -binary distribution. - -Note that the garbage collector moves objects, and won't be able to fix up any -references in C variables, so either turn GC off or don't keep Lisp pointers -in C data unless they are to statically allocated objects. You can use -\funref{purify} to place live data structures in static space so that they -won't move during GC. - -\begin{changebar} -\subsection{Accessing Lisp Arrays} - -Due to the way \cmucl{} manages memory, the amount of memory that can -be dynamically allocated by \code{malloc} or \funref{make-alien} is -limited\footnote{\cmucl{} mmaps a large piece of memory for its own - use and this memory is typically about 8 MB above the start of the C - heap. Thus, only about 8 MB of memory can be dynamically - allocated.}. - -To overcome this limitation, it is possible to access the content of -Lisp arrays which are limited only by the amount of physical memory -and swap space available. However, this technique is only useful if -the foreign function takes pointers to memory instead of allocating -memory for itself. In the latter case, you will have to modify the -foreign functions. - -This technique takes advantage of the fact that \cmucl{} has -specialized array types (\pxlref{specialized-array-types}) that match -a typical C array. For example, a \code{(simple-array double-float - (100))} is stored in memory in essentially the same way as the C -array \code{double x[100]} would be. 
The following function allows us -to get the physical address of such a Lisp array: -\begin{example} -(defun array-data-address (array) - "Return the physical address of where the actual data of an array is -stored. - -ARRAY must be a specialized array type in CMU Lisp. This means ARRAY -must be an array of one of the following types: - - double-float - single-float - (unsigned-byte 32) - (unsigned-byte 16) - (unsigned-byte 8) - (signed-byte 32) - (signed-byte 16) - (signed-byte 8) -" - (declare (type (or #+signed-array (array (signed-byte 8)) - #+signed-array (array (signed-byte 16)) - #+signed-array (array (signed-byte 32)) - (array (unsigned-byte 8)) - (array (unsigned-byte 16)) - (array (unsigned-byte 32)) - (array single-float) - (array double-float)) - array) - (optimize (speed 3) (safety 0)) - (ext:optimize-interface (safety 3))) - ;; WITH-ARRAY-DATA will get us to the actual data. However, because - ;; the array could have been displaced, we need to know where the - ;; data starts. - (lisp::with-array-data ((data array) - (start) - (end)) - (declare (ignore end)) - ;; DATA is a specialized simple-array. Memory is laid out like this: - ;; - ;; byte offset Value - ;; 0 type code (should be 70 for double-float vector) - ;; 4 4 * number of elements in vector - ;; 8 1st element of vector - ;; ... ... 
- ;; - (let ((addr (+ 8 (logandc1 7 (kernel:get-lisp-obj-address data)))) - (type-size (let ((type (array-element-type data))) - (cond ((or (equal type '(signed-byte 8)) - (equal type '(unsigned-byte 8))) - 1) - ((or (equal type '(signed-byte 16)) - (equal type '(unsigned-byte 16))) - 2) - ((or (equal type '(signed-byte 32)) - (equal type '(unsigned-byte 32))) - 4) - ((equal type 'single-float) - 4) - ((equal type 'double-float) - 8) - (t - (error "Unknown specialized array element type")))))) - (declare (type (unsigned-byte 32) addr) - (optimize (speed 3) (safety 0) (ext:inhibit-warnings 3))) - (system:int-sap (the (unsigned-byte 32) - (+ addr (* type-size start))))))) -\end{example} - -Assume we have the C function below that we wish to use: -\begin{example} - double dotprod(double* x, double* y, int n) - \{ - int k; - double sum = 0; - - for (k = 0; k < n; ++k) \{ - sum += x[k] * y[k]; - \} - - return sum; - \} -\end{example} -The following example generates two large arrays in Lisp, and calls the C -function to do the desired computation. This would not have been -possible using \code{malloc} or \code{make-alien} since we need about -16 MB of memory to hold the two arrays. -\begin{example} - (def-alien-routine "dotprod" double - (x (* double-float) :in) - (y (* double-float) :in) - (n int :in)) - - (let ((x (make-array 1000000 :element-type 'double-float)) - (y (make-array 1000000 :element-type 'double-float))) - ;; Initialize X and Y somehow - (let ((x-addr (system:int-sap (array-data-address x))) - (y-addr (system:int-sap (array-data-address y)))) - (dotprod x-addr y-addr 1000000))) -\end{example} -In this example, it may be useful to wrap the inner \code{let} -expression in an \code{unwind-protect} that first turns off garbage -collection and then turns garbage collection on afterwards. This will -prevent garbage collection from moving \code{x} and \code{y} after we -have obtained the (now erroneous) addresses but before the call to -\code{dotprod} is made. 
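-
-One way to package that \code{unwind-protect} is sketched below. The macro
-name \code{with-gc-disabled} is hypothetical, and the sketch assumes that
-\code{ext:gc-off} and \code{ext:gc-on} are the functions used to switch
-garbage collection off and on:
-\begin{lisp}
-(defmacro with-gc-disabled (&body body)
-  "Run BODY with garbage collection off, then turn it back on."
-  `(unwind-protect
-       (progn
-         (ext:gc-off)
-         ,@body)
-     ;; The cleanup form runs on normal return and on non-local exit.
-     (ext:gc-on)))
-
-;; Usage: the inner LET of the example, unchanged, inside the macro,
-;; so the addresses are taken and used while GC is off.
-(with-gc-disabled
-  (let ((x-addr (system:int-sap (array-data-address x)))
-        (y-addr (system:int-sap (array-data-address y))))
-    (dotprod x-addr y-addr 1000000)))
-\end{lisp}
-Note that this sketch always turns garbage collection back on, even if it
-was already off when the form was entered.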
- -\end{changebar} -%% -%%\node Step-by-Step Alien Example, , Alien Function Calls, Alien Objects -\section{Step-by-Step Alien Example} - -This section presents a complete example of an interface to a somewhat -complicated C function. This example should give a fairly good idea -of how to get the effect you want for almost any kind of C function. -Suppose you have the following C function which you want to be able to -call from Lisp in the file \file{test.c}: -\begin{verbatim} -struct c_struct -{ - int x; - char *s; -}; - -struct c_struct *c_function (i, s, r, a) - int i; - char *s; - struct c_struct *r; - int a[10]; -{ - int j; - struct c_struct *r2; - - printf("i = %d\n", i); - printf("s = %s\n", s); - printf("r->x = %d\n", r->x); - printf("r->s = %s\n", r->s); - for (j = 0; j < 10; j++) printf("a[%d] = %d.\n", j, a[j]); - r2 = (struct c_struct *) malloc (sizeof(struct c_struct)); - r2->x = i + 5; - r2->s = "A C string"; - return(r2); -}; -\end{verbatim} -It is possible to call this function from Lisp using the file \file{test.lisp} -whose contents is: -\begin{lisp} -;;; -*- Package: test-c-call -*- -(in-package "TEST-C-CALL") -(use-package "ALIEN") -(use-package "C-CALL") - -;;; Define the record c-struct in Lisp. -(def-alien-type nil - (struct c-struct - (x int) - (s c-string))) - -;;; Define the Lisp function interface to the C routine. It returns a -;;; pointer to a record of type c-struct. It accepts four parameters: -;;; i, an int; s, a pointer to a string; r, a pointer to a c-struct -;;; record; and a, a pointer to the array of 10 ints. -;;; -;;; The INLINE declaration eliminates some efficiency notes about heap -;;; allocation of Alien values. -(declaim (inline c-function)) -(def-alien-routine c-function - (* (struct c-struct)) - (i int) - (s c-string) - (r (* (struct c-struct))) - (a (array int 10))) - -;;; A function which sets up the parameters to the C function and -;;; actually calls it. 
-(defun call-cfun () - (with-alien ((ar (array int 10)) - (c-struct (struct c-struct))) - (dotimes (i 10) ; Fill array. - (setf (deref ar i) i)) - (setf (slot c-struct 'x) 20) - (setf (slot c-struct 's) "A Lisp String") - - (with-alien ((res (* (struct c-struct)) - (c-function 5 "Another Lisp String" (addr c-struct) ar))) - (format t "Returned from C function.~%") - (multiple-value-prog1 - (values (slot res 'x) - (slot res 's)) - ;; - ;; Deallocate result \i{after} we are done using it. - (free-alien res))))) -\end{lisp} -To execute the above example, it is necessary to compile the C routine as -follows: -\begin{example} -cc -c test.c -\end{example} -In order to enable incremental loading with some linkers, you may need to say: -\begin{example} -cc -G 0 -c test.c -\end{example} -Once the C code has been compiled, you can start up Lisp and load it in: -\begin{example} -%lisp -;;; Lisp should start up with its normal prompt. - -;;; Compile the Lisp file. This step can be done separately. You don't have -;;; to recompile every time. -* (compile-file "test.lisp") - -;;; Load the foreign object file to define the necessary symbols. This must -;;; be done before loading any code that refers to these symbols. The next block -;;; of comments is actually the output of LOAD-FOREIGN. Different linkers -;;; will give different warnings, but some warning about redefining the code -;;; size is typical. -* (load-foreign "test.o") - -;;; Running library:load-foreign.csh... -;;; Loading object file... -;;; Parsing symbol table... -Warning: "_gp" moved from #x00C082C0 to #x00C08460. - -Warning: "end" moved from #x00C00340 to #x00C004E0. - -;;; o.k. now load the compiled Lisp object file. -* (load "test") - -;;; Now we can call the routine that sets up the parameters and calls the C -;;; function. -* (test-c-call::call-cfun) - -;;; The C routine prints the following information to standard output. -i = 5 -s = Another Lisp String -r->x = 20 -r->s = A Lisp String -a[0] = 0. -a[1] = 1. 
-a[2] = 2. -a[3] = 3. -a[4] = 4. -a[5] = 5. -a[6] = 6. -a[7] = 7. -a[8] = 8. -a[9] = 9. -;;; Lisp prints out the following information. -Returned from C function. -;;; Return values from the call to test-c-call::call-cfun. -10 -"A C string" -* -\end{example} - -If any of the foreign functions produce output, they should not be called from -within Hemlock. Depending on the situation, various strange behaviors occur. -Under X, the output goes to the window in which Lisp was started; on a -terminal, the output will overwrite the Hemlock screen image; in a Hemlock -slave, standard output is \file{/dev/null} by default, so any output is -discarded. - -\hide{File:/afs/cs.cmu.edu/project/clisp/hackers/ram/docs/cmu-user/ipc.ms} - -%%\node Interprocess Communication under LISP, Debugger Programmer's Interface, Alien Objects, Top -\chapter{Interprocess Communication under LISP} -\begin{center} -\b{Written by William Lott and Bill Chiles} -\end{center} -\label{remote} - -CMU Common Lisp offers a facility for interprocess communication (IPC) -layered on top of Unix system calls, hiding the complications of that level -of IPC. There is a simple remote-procedure-call (RPC) package built -on top of TCP/IP sockets. - - -\begin{comment} -* The REMOTE Package:: -* The WIRE Package:: -* Out-Of-Band Data:: -\end{comment} - -%%\node The REMOTE Package, The WIRE Package, Interprocess Communication under LISP, Interprocess Communication under LISP -\section{The REMOTE Package} -The \code{remote} package provides a simple RPC facility including -interfaces for creating servers, connecting to already existing -servers, and calling functions in other Lisp processes. The routines -for establishing a connection between two processes, -\code{create-request-server} and \code{connect-to-remote-server}, -return \var{wire} structures. A wire maintains the current state of -a connection, and all the RPC forms require a wire to indicate where -to send requests. 
- - -\begin{comment} -* Connecting Servers and Clients:: -* Remote Evaluations:: -* Remote Objects:: -* Host Addresses:: -\end{comment} - -%%\node Connecting Servers and Clients, Remote Evaluations, The REMOTE Package, The REMOTE Package -\subsection{Connecting Servers and Clients} - -Before a client can connect to a server, it must know the network address on -which the server accepts connections. Network addresses consist of a host -address or name, and a port number. Host addresses are either a string of the -form \code{VANCOUVER.SLISP.CS.CMU.EDU} or a 32 bit unsigned integer. Port -numbers are 16 bit unsigned integers. Note: \var{port} in this context has -nothing to do with Mach ports and message passing. - -When a process wants to receive connection requests (that is, become a -server), it first picks an integer to use as the port. Only one server -(Lisp or otherwise) can use a given port number on a given machine at -any particular time. This can be an iterative process to find a free -port: picking an integer and calling \code{create-request-server}. This -function signals an error if the chosen port is unusable. You will -probably want to write a loop using \code{handler-case}, catching -conditions of type error, since this function does not signal more -specific conditions. - -\begin{defun}{wire:}{create-request-server}{% - \args{\var{port} \ampoptional{} \var{on-connect}}} - - \code{create-request-server} sets up the current Lisp to accept - connections on the given port. If port is unavailable for any - reason, this signals an error. When a client connects to this port, - the acceptance mechanism makes a wire structure and invokes the - \var{on-connect} function. Invoking this function has a couple - purposes, and \var{on-connect} may be \nil{} in which case the - system foregoes invoking any function at connect time. 
- - The \var{on-connect} function is both a hook that allows you access - to the wire created by the acceptance mechanism, and it confirms the - connection. This function takes two arguments, the wire and the - host address of the connecting process. See the section on host - addresses below. When \var{on-connect} is \nil, the request server - allows all connections. When it is non-\nil, the function returns - two values, whether to accept the connection and a function the - system should call when the connection terminates. Either value may - be \nil, but when the first value is \nil, the acceptance mechanism - destroys the wire. - - \code{create-request-server} returns an object that - \code{destroy-request-server} uses to terminate a connection. -\end{defun} - -\begin{defun}{wire:}{destroy-request-server}{\args{\var{server}}} - - \code{destroy-request-server} takes the result of - \code{create-request-server} and terminates that server. Any - existing connections remain intact, but all additional connection - attempts will fail. -\end{defun} - -\begin{defun}{wire:}{connect-to-remote-server}{% - \args{\var{host} \var{port} \ampoptional{} \var{on-death}}} - - \code{connect-to-remote-server} attempts to connect to a remote - server at the given \var{port} on \var{host} and returns a wire - structure if it is successful. If \var{on-death} is non-\nil, it is - a function the system invokes when this connection terminates. -\end{defun} - - -%%\node Remote Evaluations, Remote Objects, Connecting Servers and Clients, The REMOTE Package -\subsection{Remote Evaluations} -After the server and client have connected, they each have a wire -allowing function evaluation in the other process. This RPC mechanism -has three flavors: for side-effect only, for a single value, and for -multiple values. 
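-
-The connection functions above and the three flavors of remote evaluation
-can be sketched together as follows. This is an illustrative sketch rather
-than code from the original manual: the function names \code{start-server}
-and \code{demo-client} and the port range are hypothetical, and the forms
-sent to the other process are arbitrary examples.
-\begin{lisp}
-;; Server side: probe a range of ports until one is usable.
-;; CREATE-REQUEST-SERVER signals a plain ERROR on an unusable port.
-(defun start-server (first-port last-port)
-  (loop for port from first-port to last-port
-        do (handler-case
-               (return (values (wire:create-request-server port) port))
-             (error () nil))
-        finally (error "No usable port between ~D and ~D."
-                       first-port last-port)))
-
-;; Client side: one call of each flavor over the same wire.
-(defun demo-client (host port)
-  (let ((wire (wire:connect-to-remote-server host port)))
-    ;; Side effect only; the request is sent once output is forced.
-    (wire:remote wire
-      (write-string "hello from the client"))
-    (wire:wire-force-output wire)
-    ;; Single value; the second value is true if the server unwound.
-    (multiple-value-bind (sum unwound)
-        (wire:remote-value wire (+ 1 2))
-      (unless unwound
-        (format t "sum = ~D~%" sum)))
-    ;; Multiple values, bound as with MULTIPLE-VALUE-BIND.
-    (wire:remote-value-bind wire (quotient remainder)
-        (floor 7 2)
-      (format t "~D remainder ~D~%" quotient remainder))))
-\end{lisp}
-Passing no \var{on-connect} function, as here, means the server accepts
-every connection; a real server would probably want to supply one.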
- -Only a limited number of data types can be sent across wires as -arguments for remote function calls and as return values: integers -inclusively less than 32 bits in length, symbols, lists, and -\var{remote-objects} (\pxlref{remote-objs}). The system sends symbols -as two strings, the package name and the symbol name, and if the -package doesn't exist remotely, the remote process signals an error. -The system ignores other slots of symbols. Lists may be any tree of -the above valid data types. To send other data types you must -represent them in terms of these supported types. For example, you -could use \code{prin1-to-string} locally, send the string, and use -\code{read-from-string} remotely. - -\begin{defmac}{wire:}{remote}{% - \args{\var{wire} \mstar{call-specs}}} - - The \code{remote} macro arranges for the process at the other end of - \var{wire} to invoke each of the functions in the \var{call-specs}. - To make sure the system sends the remote evaluation requests over - the wire, you must call \code{wire-force-output}. - - Each of \var{call-specs} looks like a function call textually, but - it has some odd constraints and semantics. The function position of - the form must be the symbolic name of a function. \code{remote} - evaluates each of the argument subforms for each of the - \var{call-specs} locally in the current context, sending these - values as the arguments for the functions. - - Consider the following example: -\begin{verbatim} -(defun write-remote-string (str) - (declare (simple-string str)) - (wire:remote wire - (write-string str))) -\end{verbatim} - The value of \code{str} in the local process is passed over the wire - with a request to invoke \code{write-string} on the value. The - system does not expect to remotely evaluate \code{str} for a value - in the remote process. 
-\end{defmac} - -\begin{defun}{wire:}{wire-force-output}{\args{\var{wire}}} - - \code{wire-force-output} flushes all internal buffers associated - with \var{wire}, sending the remote requests. This is necessary - after a call to \code{remote}. -\end{defun} - -\begin{defmac}{wire:}{remote-value}{\args{\var{wire} \var{call-spec}}} - - The \code{remote-value} macro is similar to the \code{remote} macro. - \code{remote-value} only takes one \var{call-spec}, and it returns - the value returned by the function call in the remote process. The - value must be a valid type the system can send over a wire, and - there is no need to call \code{wire-force-output} in conjunction - with this interface. - - If the client unwinds past the call to \code{remote-value}, the server - continues running, but the system ignores the value the server sends - back. - - If the server unwinds past the remotely requested call, instead of - returning normally, \code{remote-value} returns two values, \nil{} - and \true. Otherwise this returns the result of the remote - evaluation and \nil. -\end{defmac} - -\begin{defmac}{wire:}{remote-value-bind}{% - \args{\var{wire} (\mstar{variable}) remote-form - \mstar{local-forms}}} - - \code{remote-value-bind} is similar to \code{multiple-value-bind} - except the values bound come from \var{remote-form}'s evaluation in - the remote process. The \var{local-forms} execute in an implicit - \code{progn}. - - If the client unwinds past the call to \code{remote-value-bind}, the - server continues running, but the system ignores the values the - server sends back. - - If the server unwinds past the remotely requested call, instead of - returning normally, the \var{local-forms} never execute, and - \code{remote-value-bind} returns \nil. 
-\end{defmac} - - -%%\node Remote Objects, Host Addresses, Remote Evaluations, The REMOTE Package -\subsection{Remote Objects} -\label{remote-objs} - -The wire mechanism only directly supports a limited number of data -types for transmission as arguments for remote function calls and as -return values: integers inclusively less than 32 bits in length, -symbols, lists. Sometimes it is useful to allow remote processes to -refer to local data structures without allowing the remote process -to operate on the data. We have \var{remote-objects} to support -this without the need to represent the data structure in terms of -the above data types, to send the representation to the remote -process, to decode the representation, to later encode it again, and -to send it back along the wire. - -You can convert any Lisp object into a remote-object. When you send -a remote-object along a wire, the system simply sends a unique token -for it. In the remote process, the system looks up the token and -returns a remote-object for the token. When the remote process -needs to refer to the original Lisp object as an argument to a -remote call back or as a return value, it uses the remote-object it -has which the system converts to the unique token, sending that -along the wire to the originating process. Upon receipt in the -first process, the system converts the token back to the same -(\code{eq}) remote-object. - -\begin{defun}{wire:}{make-remote-object}{\args{\var{object}}} - - \code{make-remote-object} returns a remote-object that has - \var{object} as its value. The remote-object can be passed across - wires just like the directly supported wire data types. -\end{defun} - -\begin{defun}{wire:}{remote-object-p}{\args{\var{object}}} - - The function \code{remote-object-p} returns \true{} if \var{object} - is a remote object and \nil{} otherwise. 
-\end{defun} - -\begin{defun}{wire:}{remote-object-local-p}{\args{\var{remote}}} - - The function \code{remote-object-local-p} returns \true{} if - \var{remote} refers to an object in the local process. This can - only occur if the local process created \var{remote} with - \code{make-remote-object}. -\end{defun} - -\begin{defun}{wire:}{remote-object-eq}{\args{\var{obj1} \var{obj2}}} - - The function \code{remote-object-eq} returns \true{} if \var{obj1} and - \var{obj2} refer to the same (\code{eq}) Lisp object, regardless of - which process created the remote-objects. -\end{defun} - -\begin{defun}{wire:}{remote-object-value}{\args{\var{remote}}} - - This function returns the original object used to create the given - remote object. It is an error if some other process originally - created the remote-object. -\end{defun} - -\begin{defun}{wire:}{forget-remote-translation}{\args{\var{object}}} - - This function removes the information and storage necessary to - translate remote-objects back into \var{object}, so the next - \code{gc} can reclaim the memory. You should use this when you no - longer expect to receive references to \var{object}. If some remote - process does send a reference to \var{object}, - \code{remote-object-value} signals an error. -\end{defun} - - -%%\node Host Addresses, , Remote Objects, The REMOTE Package -\subsection{Host Addresses} -The operating system maintains a database of all the valid host -addresses. You can use this database to convert between host names -and addresses. - -\begin{defun}{ext:}{lookup-host-entry}{\args{\var{host}}} - - \code{lookup-host-entry} searches the database for the given - \var{host} and returns a host-entry structure for it. If it fails - to find \var{host} in the database, it returns \nil. \var{Host} is - either the address (as an integer) or the name (as a string) of the - desired host. 
-\end{defun} - -\begin{defun}{ext:}{host-entry-name}{\args{\var{host-entry}}} - \defunx[ext:]{host-entry-aliases}{\args{\var{host-entry}}} - \defunx[ext:]{host-entry-addr-list}{\args{\var{host-entry}}} - \defunx[ext:]{host-entry-addr}{\args{\var{host-entry}}} - - \code{host-entry-name}, \code{host-entry-aliases}, and - \code{host-entry-addr-list} each return the indicated slot from the - host-entry structure. \code{host-entry-addr} returns the primary - (first) address from the list returned by - \code{host-entry-addr-list}. -\end{defun} - - -%%\node The WIRE Package, Out-Of-Band Data, The REMOTE Package, Interprocess Communication under LISP -\section{The WIRE Package} - -The \code{wire} package provides for sending data along wires. The -\code{remote} package sits on top of this package. All data sent -with a given output routine must be read in the remote process with -the complementary fetching routine. For example, if you send a -string with \code{wire-output-string}, the remote process must know -to use \code{wire-get-string}. To avoid rigid data transfers and -complicated code, the interface supports sending -\var{tagged} data. With tagged data, the system sends a tag -announcing the type of the next data, and the remote system takes -care of fetching the appropriate type. - -When using interfaces at the wire level instead of the RPC level, -the remote process must read everything sent by these routines. If -the remote process leaves any input on the wire, it will later -mistake the data for an RPC request causing unknown lossage. - -\begin{comment} -* Untagged Data:: -* Tagged Data:: -* Making Your Own Wires:: -\end{comment} - -%%\node Untagged Data, Tagged Data, The WIRE Package, The WIRE Package -\subsection{Untagged Data} -When using these routines both ends of the wire know exactly what types are -coming and going and in what order. This data is restricted to the following -types: -\begin{itemize} - -\item -8 bit unsigned bytes.
- -\item -32 bit unsigned bytes. - -\item -32 bit integers. - -\item -simple-strings less than 65535 in length. -\end{itemize} - - -\begin{defun}{wire:}{wire-output-byte}{\args{\var{wire} \var{byte}}} - \defunx[wire:]{wire-get-byte}{\args{\var{wire}}} - \defunx[wire:]{wire-output-number}{\args{\var{wire} \var{number}}} - \defunx[wire:]{wire-get-number}{\args{\var{wire} \ampoptional{} - \var{signed}}} - \defunx[wire:]{wire-output-string}{\args{\var{wire} \var{string}}} - \defunx[wire:]{wire-get-string}{\args{\var{wire}}} - - These functions either output or input an object of the specified - data type. When you use any of these output routines to send data - across the wire, you must use the corresponding input routine to - interpret the data. -\end{defun} - - -%%\node Tagged Data, Making Your Own Wires, Untagged Data, The WIRE Package -\subsection{Tagged Data} -When using these routines, the system automatically transmits and interprets -the tags for you, so both ends can figure out what kind of data transfers -occur. Sending tagged data allows a greater variety of data types: integers -inclusively less than 32 bits in length, symbols, lists, and \var{remote-objects} -(\pxlref{remote-objs}). The system sends symbols as two strings, the -package name and the symbol name, and if the package doesn't exist remotely, -the remote process signals an error. The system ignores other slots of -symbols. Lists may be any tree of the above valid data types. To send other -data types you must represent them in terms of these supported types. For -example, you could use \code{prin1-to-string} locally, send the string, and use -\code{read-from-string} remotely. - -\begin{defun}{wire:}{wire-output-object}{% - \args{\var{wire} \var{object} \ampoptional{} \var{cache-it}}} - \defunx[wire:]{wire-get-object}{\args{\var{wire}}} - - The function \code{wire-output-object} sends \var{object} over - \var{wire} preceded by a tag indicating its type.
- - If \var{cache-it} is non-\nil, this function only sends \var{object} - the first time it gets \var{object}. Each end of the wire - associates a token with \var{object}, similar to remote-objects, - allowing you to send the object more efficiently on successive - transmissions. \var{cache-it} defaults to \true{} for symbols and - \nil{} for other types. Since the RPC level requires function - names, a high-level protocol based on a set of function calls saves - time in sending the functions' names repeatedly. - - The function \code{wire-get-object} reads the results of - \code{wire-output-object} and returns that object. -\end{defun} - - -%%\node Making Your Own Wires, , Tagged Data, The WIRE Package -\subsection{Making Your Own Wires} -You can create wires manually in addition to the \code{remote} package's -interface creating them for you. To create a wire, you need a Unix \i{file -descriptor}. If you are unfamiliar with Unix file descriptors, see section 2 of -the Unix manual pages. - -\begin{defun}{wire:}{make-wire}{\args{\var{descriptor}}} - - The function \code{make-wire} creates a new wire when supplied with - the file descriptor to use for the underlying I/O operations. -\end{defun} - -\begin{defun}{wire:}{wire-p}{\args{\var{object}}} - - This function returns \true{} if \var{object} is indeed a wire, - \nil{} otherwise. -\end{defun} - -\begin{defun}{wire:}{wire-fd}{\args{\var{wire}}} - - This function returns the file descriptor used by the \var{wire}. -\end{defun} - - -%%\node Out-Of-Band Data, , The WIRE Package, Interprocess Communication under LISP -\section{Out-Of-Band Data} - -The TCP/IP protocol allows users to send data asynchronously, otherwise -known as \var{out-of-band} data. When using this feature, the operating -system interrupts the receiving process if this process has chosen to be -notified about out-of-band data. The receiver can grab this input -without affecting any information currently queued on the socket. 
-Therefore, you can use this without interfering with any current -activity due to other wire and remote interfaces. - -Unfortunately, most implementations of TCP/IP are broken, so use of -out-of-band data is limited for safety reasons. You can only reliably -send one character at a time. - -The routines in this section provide a mechanism for establishing -handlers for out-of-band characters and for sending them out-of-band. -These all take a Unix file descriptor instead of a wire, but you can -fetch a wire's file descriptor with \code{wire-fd}. - -\begin{defun}{wire:}{add-oob-handler}{\args{\var{fd} \var{char} \var{handler}}} - - The function \code{add-oob-handler} arranges for \var{handler} to be - called whenever \var{char} shows up as out-of-band data on the file - descriptor \var{fd}. -\end{defun} - -\begin{defun}{wire:}{remove-oob-handler}{\args{\var{fd} \var{char}}} - - This function removes the handler for the character \var{char} on - the file descriptor \var{fd}. -\end{defun} - -\begin{defun}{wire:}{remove-all-oob-handlers}{\args{\var{fd}}} - - This function removes all handlers for the file descriptor \var{fd}. -\end{defun} - -\begin{defun}{wire:}{send-character-out-of-band}{\args{\var{fd} \var{char}}} - - This function sends the character \var{char} down the file - descriptor \var{fd} out-of-band. -\end{defun} - -%% -\hide{File:debug-int.tex} -%%\node Debugger Programmer's Interface, Function Index, Interprocess Communication under LISP, Top -\chapter{Debugger Programmer's Interface} -\label{debug-internals} - -The debugger programmer's interface is exported from the -\code{"DEBUG-INTERNALS"} or \code{"DI"} package. This is a CMU -extension that allows debugging tools to be written without detailed -knowledge of the compiler or run-time system. - -Some of the interface routines take a code-location as an argument. As -described in the section on code-locations, some code-locations are -unknown.
When a function calls for a \var{basic-code-location}, it -takes either type, but when it specifically names the argument -\var{code-location}, the routine will signal an error if you give it an -unknown code-location. - -\begin{comment} -* DI Exceptional Conditions:: -* Debug-variables:: -* Frames:: -* Debug-functions:: -* Debug-blocks:: -* Breakpoints:: -* Code-locations:: -* Debug-sources:: -* Source Translation Utilities:: -\end{comment} - -%% -%%\node DI Exceptional Conditions, Debug-variables, Debugger Programmer's Interface, Debugger Programmer's Interface -\section{DI Exceptional Conditions} - -Some of these operations fail depending on the availability of debugging -information. In the most severe case, when someone saved a Lisp image -stripping all debugging data structures, no operations are valid. In -this case, even backtracing and finding frames is impossible. Some -interfaces can simply return values indicating the lack of information, -or their return values are naturally meaningful in light of missing data. -Other routines, as documented below, will signal -\code{serious-condition}s when they discover awkward situations. This -interface does not provide for programs to detect these situations other -than by calling a routine that detects them and signals a condition. -These are serious-conditions because the program using the interface -must handle them before it can correctly continue execution. These -debugging conditions are not errors since it is no fault of the -programmers that the conditions occur. - -\begin{comment} -* Debug-conditions:: -* Debug-errors:: -\end{comment} - -%%\node Debug-conditions, Debug-errors, DI Exceptional Conditions, DI Exceptional Conditions -\subsection{Debug-conditions} - -The debug internals interface signals conditions when it can't adhere -to its contract. These are serious-conditions because the program -using the interface must handle them before it can correctly continue -execution.
These debugging conditions are not errors since it is no -fault of the programmers that the conditions occur. The interface -does not provide for programs to detect these situations other than -calling a routine that detects them and signals a condition. - - -\begin{deftp}{Condition}{debug-condition}{} - -This condition inherits from serious-condition, and all debug-conditions -inherit from this. These must be handled, but they are not programmer errors. -\end{deftp} - - -\begin{deftp}{Condition}{no-debug-info}{} - -This condition indicates there is absolutely no debugging information -available. -\end{deftp} - - -\begin{deftp}{Condition}{no-debug-function-returns}{} - -This condition indicates the system cannot return values from a frame since -its debug-function lacks debug information details about returning values. -\end{deftp} - - -\begin{deftp}{Condition}{no-debug-blocks}{} -This condition indicates that a function was not compiled with debug-block -information, but this information is necessary for some requested -operation. -\end{deftp} - -\begin{deftp}{Condition}{no-debug-variables}{} -Similar to \code{no-debug-blocks}, except that variable information was -requested. -\end{deftp} - -\begin{deftp}{Condition}{lambda-list-unavailable}{} -Similar to \code{no-debug-blocks}, except that lambda list information was -requested. -\end{deftp} - -\begin{deftp}{Condition}{invalid-value}{} - -This condition indicates a debug-variable has \kwd{invalid} or \kwd{unknown} -value in a particular frame. -\end{deftp} - - -\begin{deftp}{Condition}{ambiguous-variable-name}{} - -This condition indicates a user supplied debug-variable name identifies more -than one valid variable in a particular frame. -\end{deftp} - - -%%\node Debug-errors, , Debug-conditions, DI Exceptional Conditions -\subsection{Debug-errors} - -These are programmer errors resulting from misuse of the debugging tools' -programmers' interface.
You could have avoided an occurrence of one of these -by using some routine to check the use of the routine generating the error. - - -\begin{deftp}{Condition}{debug-error}{} -This condition inherits from error, and all user programming errors inherit -from this condition. -\end{deftp} - - -\begin{deftp}{Condition}{unhandled-debug-condition}{} -This error results from a signalled \code{debug-condition} occurring -without anyone handling it. -\end{deftp} - - -\begin{deftp}{Condition}{unknown-code-location}{} -This error indicates the invalid use of an unknown-code-location. -\end{deftp} - - -\begin{deftp}{Condition}{unknown-debug-variable}{} - -This error indicates an attempt to use a debug-variable in conjunction with an -inappropriate debug-function; for example, checking the variable's validity -using a code-location in the wrong debug-function will signal this error. -\end{deftp} - - -\begin{deftp}{Condition}{frame-function-mismatch}{} - -This error indicates you called a function returned by -\code{preprocess-for-eval} -on a frame other than the one for which the function had been prepared. -\end{deftp} - - -%% -%%\node Debug-variables, Frames, DI Exceptional Conditions, Debugger Programmer's Interface -\section{Debug-variables} - -Debug-variables represent the constant information about where the system -stores argument and local variable values. The system uniquely identifies with -an integer every instance of a variable with a particular name and package. To -access a value, you must supply the frame along with the debug-variable since -these are particular to a function, not every instance of a variable on the -stack. - -\begin{defun}{}{debug-variable-name}{\args{\var{debug-variable}}} - - This function returns the name of the \var{debug-variable}. The - name is the name of the symbol used as an identifier when writing - the code. 
-\end{defun} - - -\begin{defun}{}{debug-variable-package}{\args{\var{debug-variable}}} - - This function returns the package name of the \var{debug-variable}. - This is the package name of the symbol used as an identifier when - writing the code. -\end{defun} - - -\begin{defun}{}{debug-variable-symbol}{\args{\var{debug-variable}}} - - This function returns the symbol from interning - \code{debug-variable-name} in the package named by - \code{debug-variable-package}. -\end{defun} - - -\begin{defun}{}{debug-variable-id}{\args{\var{debug-variable}}} - - This function returns the integer that makes \var{debug-variable}'s - name and package name unique with respect to other - \var{debug-variable}'s in the same function. -\end{defun} - - -\begin{defun}{}{debug-variable-validity}{% - \args{\var{debug-variable} \var{basic-code-location}}} - - This function returns three values reflecting the validity of - \var{debug-variable}'s value at \var{basic-code-location}: - \begin{Lentry} - \item[\kwd{valid}] The value is known to be available. - \item[\kwd{invalid}] The value is known to be unavailable. - \item[\kwd{unknown}] The value's availability is unknown. - \end{Lentry} -\end{defun} - - -\begin{defun}{}{debug-variable-value}{\args{\var{debug-variable} - \var{frame}}} - - This function returns the value stored for \var{debug-variable} in - \var{frame}. The value may be invalid. This is \code{SETF}'able. -\end{defun} - - -\begin{defun}{}{debug-variable-valid-value}{% - \args{\var{debug-variable} \var{frame}}} - - This function returns the value stored for \var{debug-variable} in - \var{frame}. If the value is not \kwd{valid}, then this signals an - \code{invalid-value} error. -\end{defun} - - -%% -%%\node Frames, Debug-functions, Debug-variables, Debugger Programmer's Interface -\section{Frames} - -Frames describe a particular call on the stack for a particular thread. This -is the environment for name resolution, getting arguments and locals, and -returning values. 
The stack conceptually grows up, so the top of the stack is -the most recently called function. - -\code{top-frame}, \code{frame-down}, \code{frame-up}, and -\code{frame-debug-function} can only fail when there is absolutely no -debug information available. This can only happen when someone saved a -Lisp image specifying that the system dump all debugging data. - - -\begin{defun}{}{top-frame}{} - - This function never returns the frame for itself, always the frame - before calling \code{top-frame}. -\end{defun} - - -\begin{defun}{}{frame-down}{\args{\var{frame}}} - - This returns the frame immediately below \var{frame} on the stack. - When \var{frame} is the bottom of the stack, this returns \nil. -\end{defun} - - -\begin{defun}{}{frame-up}{\args{\var{frame}}} - - This returns the frame immediately above \var{frame} on the stack. - When \var{frame} is the top of the stack, this returns \nil. -\end{defun} - - -\begin{defun}{}{frame-debug-function}{\args{\var{frame}}} - - This function returns the debug-function for the function whose call - \var{frame} represents. -\end{defun} - - -\begin{defun}{}{frame-code-location}{\args{\var{frame}}} - - This function returns the code-location where \var{frame}'s - debug-function will continue running when program execution returns - to \var{frame}. If someone interrupted this frame, the result could - be an unknown code-location. -\end{defun} - - -\begin{defun}{}{frame-catches}{\args{\var{frame}}} - - This function returns an a-list for all active catches in - \var{frame} mapping catch tags to the code-locations at which the - catch re-enters. -\end{defun} - - -\begin{defun}{}{eval-in-frame}{\args{\var{frame} \var{form}}} - - This evaluates \var{form} in \var{frame}'s environment. This can - signal several different debug-conditions since its success relies - on a variety of inexact debug information: \code{invalid-value}, - \code{ambiguous-variable-name}, \code{frame-function-mismatch}. See - also \funref{preprocess-for-eval}. 
-\end{defun} - -\begin{comment} - \begin{defun}{}{return-from-frame}{\args{\var{frame} \var{values}}} - - This returns the elements in the list \var{values} as multiple - values from \var{frame} as if the function \var{frame} represents - returned these values. This signals a - \code{no-debug-function-returns} condition when \var{frame}'s - debug-function lacks information on returning values. - - \i{Not Yet Implemented} - \end{defun} -\end{comment} - -%% -%%\node Debug-functions, Debug-blocks, Frames, Debugger Programmer's Interface -\section {Debug-functions} - -Debug-functions represent the static information about a function determined at -compile time---argument and variable storage, their lifetime information, -etc. The debug-function also contains all the debug-blocks representing -basic-blocks of code, and these contain information about specific -code-locations in a debug-function. - -\begin{defmac}{}{do-debug-function-blocks}{% - \args{(\var{block-var} \var{debug-function} \mopt{result-form}) - \mstar{form}}} - - This executes the forms in a context with \var{block-var} bound to - each debug-block in \var{debug-function} successively. - \var{Result-form} is an optional form to execute for a return value, - and \code{do-debug-function-blocks} returns \nil if there is no - \var{result-form}. This signals a \code{no-debug-blocks} condition - when the \var{debug-function} lacks debug-block information. -\end{defmac} - - -\begin{defun}{}{debug-function-lambda-list}{\args{\var{debug-function}}} - - This function returns a list representing the lambda-list for - \var{debug-function}. The list has the following structure: - \begin{example} - (required-var1 required-var2 - ... - (:optional var3 suppliedp-var4) - (:optional var5) - ... - (:rest var6) (:rest var7) - ... - (:keyword keyword-symbol var8 suppliedp-var9) - (:keyword keyword-symbol var10) - ...
- ) - \end{example} - Each \code{var}\var{n} is a debug-variable; however, the symbol - \kwd{deleted} appears instead whenever the argument remains - unreferenced throughout \var{debug-function}. - - If there is no lambda-list information, this signals a - \code{lambda-list-unavailable} condition. -\end{defun} - - -\begin{defmac}{}{do-debug-function-variables}{% - \args{(\var{var} \var{debug-function} \mopt{result}) - \mstar{form}}} - - This macro executes each \var{form} in a context with \var{var} - bound to each debug-variable in \var{debug-function}. This returns - the value of executing \var{result} (defaults to \nil). This may - iterate over only some of \var{debug-function}'s variables or none - depending on debug policy; for example, possibly the compilation - only preserved argument information. -\end{defmac} - - -\begin{defun}{}{debug-variable-info-available}{\args{\var{debug-function}}} - - This function returns whether there is any variable information for - \var{debug-function}. This is useful for distinguishing whether - there were no locals in a function or whether there was no variable - information. For example, if \code{do-debug-function-variables} - executes its forms zero times, then you can use this function to - determine the reason. -\end{defun} - - -\begin{defun}{}{debug-function-symbol-variables}{% - \args{\var{debug-function} \var{symbol}}} - - This function returns a list of debug-variables in - \var{debug-function} having the same name and package as - \var{symbol}. If \var{symbol} is uninterned, then this returns a - list of debug-variables without package names and with the same name - as \var{symbol}. The result of this function is limited to the - availability of variable information in \var{debug-function}; for - example, possibly \var{debug-function} only knows about its - arguments. 
-\end{defun} - - -\begin{defun}{}{ambiguous-debug-variables}{% - \args{\var{debug-function} \var{name-prefix-string}}} - - This function returns a list of debug-variables in - \var{debug-function} whose names contain \var{name-prefix-string} as - an initial substring. The result of this function is limited to the - availability of variable information in \var{debug-function}; for - example, possibly \var{debug-function} only knows about its - arguments. -\end{defun} - - -\begin{defun}{}{preprocess-for-eval}{% - \args{\var{form} \var{basic-code-location}}} - - This function returns a function of one argument that evaluates - \var{form} in the lexical context of \var{basic-code-location}. - This allows efficient repeated evaluation of \var{form} at a certain - place in a function which could be useful for conditional breaking. - This signals a \code{no-debug-variables} condition when the - code-location's debug-function has no debug-variable information - available. The returned function takes a frame as an argument. See - also \funref{eval-in-frame}. -\end{defun} - - -\begin{defun}{}{function-debug-function}{\args{\var{function}}} - - This function returns a debug-function that represents debug - information for \var{function}. -\end{defun} - - -\begin{defun}{}{debug-function-kind}{\args{\var{debug-function}}} - - This function returns the kind of function \var{debug-function} - represents. The value is one of the following: - \begin{Lentry} - \item[\kwd{optional}] This kind of function is an entry point to an - ordinary function. It handles optional defaulting, parsing - keywords, etc. - \item[\kwd{external}] This kind of function is an entry point to an - ordinary function. It checks argument values and count and calls - the defined function. - \item[\kwd{top-level}] This kind of function executes one or more - random top-level forms from a file. - \item[\kwd{cleanup}] This kind of function represents the cleanup - forms in an \code{unwind-protect}. 
- \item[\nil] This kind of function is not one of the above; that is, - it is not specially marked in any way. - \end{Lentry} -\end{defun} - - -\begin{defun}{}{debug-function-function}{\args{\var{debug-function}}} - - This function returns the Common Lisp function associated with the - \var{debug-function}. This returns \nil{} if the function is - unavailable or is non-existent as a user callable function object. -\end{defun} - - -\begin{defun}{}{debug-function-name}{\args{\var{debug-function}}} - - This function returns the name of the function represented by - \var{debug-function}. This may be a string or a cons; do not assume - it is a symbol. -\end{defun} - - -%% -%%\node Debug-blocks, Breakpoints, Debug-functions, Debugger Programmer's Interface -\section{Debug-blocks} - -Debug-blocks contain information pertinent to a specific range of code in a -debug-function. - -\begin{defmac}{}{do-debug-block-locations}{% - \args{(\var{code-var} \var{debug-block} \mopt{result}) - \mstar{form}}} - - This macro executes each \var{form} in a context with \var{code-var} - bound to each code-location in \var{debug-block}. This returns the - value of executing \var{result} (defaults to \nil). -\end{defmac} - - -\begin{defun}{}{debug-block-successors}{\args{\var{debug-block}}} - - This function returns the list of possible code-locations where - execution may continue when the basic-block represented by - \var{debug-block} completes its execution. -\end{defun} - - -\begin{defun}{}{debug-block-elsewhere-p}{\args{\var{debug-block}}} - - This function returns whether \var{debug-block} represents elsewhere - code. This is code the compiler has moved out of a function's code - sequence for optimization reasons. Code-locations in these blocks - are unsuitable for stepping tools, and the first code-location has - nothing to do with a normal starting location for the block. 
-\end{defun} - - -%% -%%\node Breakpoints, Code-locations, Debug-blocks, Debugger Programmer's Interface -\section{Breakpoints} - -A breakpoint represents a function the system calls with the current frame when -execution passes a certain code-location. A breakpoint is active or inactive -independent of its existence. Breakpoints also have an extra slot for users to tag -the breakpoint with information. - -\begin{defun}{}{make-breakpoint}{% - \args{\var{hook-function} \var{what} \keys{\kwd{kind} \kwd{info} - \kwd{function-end-cookie}}}} - - This function creates and returns a breakpoint. When program - execution encounters the breakpoint, the system calls - \var{hook-function}. \var{hook-function} takes the current frame - for the function in which the program is running and the breakpoint - object. - - \var{what} and \var{kind} determine where in a function the system - invokes \var{hook-function}. \var{what} is either a code-location - or a debug-function. \var{kind} is one of \kwd{code-location}, - \kwd{function-start}, or \kwd{function-end}. Since the starts and - ends of functions may not have code-locations representing them, - designate these places by supplying \var{what} as a debug-function - and \var{kind} indicating the \kwd{function-start} or - \kwd{function-end}. When \var{what} is a debug-function and - \var{kind} is \kwd{function-end}, then hook-function must take two - additional arguments, a list of values returned by the function and - a function-end-cookie. - - \var{info} is information supplied by and used by the user. - - \var{function-end-cookie} is a function. To implement function-end - breakpoints, the system uses starter breakpoints to establish the - function-end breakpoint for each invocation of the function. Upon - each entry, the system creates a unique cookie to identify the - invocation, and when the user supplies a function for this argument, - the system invokes it on the cookie.
The system later invokes the - function-end breakpoint hook on the same cookie. The user may save - the cookie when passed to the function-end-cookie function for later - comparison in the hook function. - - This signals an error if \var{what} is an unknown code-location. - - \i{Note: Breakpoints in interpreted code or byte-compiled code are - not implemented. Function-end breakpoints are not implemented for - compiled functions that use the known local return convention - (e.g. for block-compiled or self-recursive functions.)} - -\end{defun} - - -\begin{defun}{}{activate-breakpoint}{\args{\var{breakpoint}}} - - This function causes the system to invoke the \var{breakpoint}'s - hook-function until the next call to \code{deactivate-breakpoint} or - \code{delete-breakpoint}. The system invokes breakpoint hook - functions in the opposite order that you activate them. -\end{defun} - - -\begin{defun}{}{deactivate-breakpoint}{\args{\var{breakpoint}}} - - This function stops the system from invoking the \var{breakpoint}'s - hook-function. -\end{defun} - - -\begin{defun}{}{breakpoint-active-p}{\args{\var{breakpoint}}} - - This returns whether \var{breakpoint} is currently active. -\end{defun} - - -\begin{defun}{}{breakpoint-hook-function}{\args{\var{breakpoint}}} - - This function returns the \var{breakpoint}'s function the system - calls when execution encounters \var{breakpoint}, and it is active. - This is \code{SETF}'able. -\end{defun} - - -\begin{defun}{}{breakpoint-info}{\args{\var{breakpoint}}} - - This function returns \var{breakpoint}'s information supplied by the - user. This is \code{SETF}'able. -\end{defun} - - -\begin{defun}{}{breakpoint-kind}{\args{\var{breakpoint}}} - - This function returns the \var{breakpoint}'s kind specification. -\end{defun} - - -\begin{defun}{}{breakpoint-what}{\args{\var{breakpoint}}} - - This function returns the \var{breakpoint}'s what specification. 
-\end{defun} - - -\begin{defun}{}{delete-breakpoint}{\args{\var{breakpoint}}} - - This function frees system storage and removes computational - overhead associated with \var{breakpoint}. After calling this, - \var{breakpoint} is useless and can never become active again. -\end{defun} - - -%% -%%\node Code-locations, Debug-sources, Breakpoints, Debugger Programmer's Interface -\section{Code-locations} - -Code-locations represent places in functions where the system has correct -information about the function's environment and where interesting operations -can occur---asking for a local variable's value, setting breakpoints, -evaluating forms within the function's environment, etc. - -Sometimes the interface returns unknown code-locations. These -represent places in functions, but there is no debug information -associated with them. Some operations accept these since they may -succeed even with missing debug data. These operations' argument is -named \var{basic-code-location} indicating they take known and unknown -code-locations. If an operation names its argument -\var{code-location}, and you supply an unknown one, it will signal an -error. For example, \code{frame-code-location} may return an unknown -code-location if someone interrupted Lisp in the given frame. The -system knows where execution will continue, but this place in the code -may not be a place for which the compiler dumped debug information. - -\begin{defun}{}{code-location-debug-function}{\args{\var{basic-code-location}}} - - This function returns the debug-function representing information - about the function corresponding to the code-location. -\end{defun} - - -\begin{defun}{}{code-location-debug-block}{\args{\var{basic-code-location}}} - - This function returns the debug-block containing code-location if it - is available. Some debug policies inhibit debug-block information, - and if none is available, then this signals a \code{no-debug-blocks} - condition. 
-\end{defun} - - -\begin{defun}{}{code-location-top-level-form-offset}{% - \args{\var{code-location}}} - - This function returns the number of top-level forms before the one - containing \var{code-location} as seen by the compiler in some - compilation unit. A compilation unit is not necessarily a single - file, see the section on debug-sources. -\end{defun} - - -\begin{defun}{}{code-location-form-number}{\args{\var{code-location}}} - - This function returns the number of the form corresponding to - \var{code-location}. The form number is derived by walking the - subforms of a top-level form in depth-first order. While walking - the top-level form, count one in depth-first order for each subform - that is a cons. See \funref{form-number-translations}. -\end{defun} - - -\begin{defun}{}{code-location-debug-source}{\args{\var{code-location}}} - - This function returns \var{code-location}'s debug-source. -\end{defun} - - -\begin{defun}{}{code-location-unknown-p}{\args{\var{basic-code-location}}} - - This function returns whether \var{basic-code-location} is unknown. - It returns \nil when the code-location is known. -\end{defun} - - -\begin{defun}{}{code-location=}{\args{\var{code-location1} - \var{code-location2}}} - - This function returns whether the two code-locations are the same. -\end{defun} - - -%% -%%\node Debug-sources, Source Translation Utilities, Code-locations, Debugger Programmer's Interface -\section{Debug-sources} - -Debug-sources represent how to get back the source for some code. The -source is either a file (\code{compile-file} or \code{load}), a -lambda-expression (\code{compile}, \code{defun}, \code{defmacro}), or -a stream (something particular to CMU Common Lisp, -\code{compile-from-stream}). - -When compiling a source, the compiler counts each top-level form it -processes, but when the compiler handles multiple files as one block -compilation, the top-level form count continues past file boundaries. 
-Therefore \code{code-location-top-level-form-offset} returns an offset -that does not always start at zero for the code-location's -debug-source. The offset into a particular source is -\code{code-location-top-level-form-offset} minus -\code{debug-source-root-number}. - -Inside a top-level form, a code-location's form number indicates the -subform corresponding to the code-location. - -\begin{defun}{}{debug-source-from}{\args{\var{debug-source}}} - - This function returns an indication of the type of source. The - following are the possible values: - \begin{Lentry} - \item[\kwd{file}] from a file (obtained by \code{compile-file} if - compiled). - \item[\kwd{lisp}] from Lisp (obtained by \code{compile} if - compiled). - \item[\kwd{stream}] from a non-file stream (CMU Common Lisp supports - \code{compile-from-stream}). - \end{Lentry} -\end{defun} - - -\begin{defun}{}{debug-source-name}{\args{\var{debug-source}}} - - This function returns the actual source in some sense represented by - debug-source, which is related to \code{debug-source-from}: - \begin{Lentry} - \item[\kwd{file}] the pathname of the file. - \item[\kwd{lisp}] a lambda-expression. - \item[\kwd{stream}] some descriptive string that's otherwise - useless. -\end{Lentry} -\end{defun} - - -\begin{defun}{}{debug-source-created}{\args{\var{debug-source}}} - - This function returns the universal time someone created the source. - This may be \nil{} if it is unavailable. -\end{defun} - - -\begin{defun}{}{debug-source-compiled}{\args{\var{debug-source}}} - - This function returns the time someone compiled the source. This is - \nil if the source is uncompiled. -\end{defun} - - -\begin{defun}{}{debug-source-root-number}{\args{\var{debug-source}}} - - This returns the number of top-level forms processed by the compiler - before compiling this source. If this source is uncompiled, this is - zero. 
This may be zero even if the source is compiled since the - first form in the first file compiled in one compilation, for - example, must have a root number of zero---the compiler saw no other - top-level forms before it. -\end{defun} - - -%%\node Source Translation Utilities, , Debug-sources, Debugger Programmer's Interface -\section{Source Translation Utilities} - -These functions provide a mechanism for converting the rather -obscure (but highly compact) representation of source locations into an -actual source form: - -\begin{defun}{}{debug-source-start-positions}{\args{\var{debug-source}}} - - This function returns the file position of each top-level form as a - vector if \var{debug-source} is from a \kwd{file}. If - \code{debug-source-from} is \kwd{lisp} or \kwd{stream}, or the file - is byte-compiled, then the result is \false. -\end{defun} - - -\begin{defun}{}{form-number-translations}{\args{\var{form} - \var{tlf-number}}} - - This function returns a table mapping form numbers (see - \code{code-location-form-number}) to source-paths. A source-path - indicates a descent into the top-level-form \var{form}, going - directly to the subform corresponding to a form number. - \var{tlf-number} is the top-level-form number of \var{form}. -\end{defun} - - -\begin{defun}{}{source-path-context}{% - \args{\var{form} \var{path} \var{context}}} - - This function returns the subform of \var{form} indicated by the - source-path. \var{Form} is a top-level form, and \var{path} is a - source-path into it. \var{Context} is the number of enclosing forms - to return instead of directly returning the source-path form. When - \var{context} is non-zero, the form returned contains a marker, - \code{\#:****HERE****}, immediately before the form indicated by - \var{path}.
-\end{defun} - - -%% -\twocolumn -%%\node Function Index, Variable Index, Debugger Programmer's Interface, Top -%%\unnumbered{Function Index} -\cindex{Function Index} - -%%\printindex{fn} -\printindex[funs] - -\twocolumn -%%\node Variable Index, Type Index, Function Index, Top -%%\unnumbered{Variable Index} -\cindex{Variable Index} - -%%\printindex{vr} -\printindex[vars] - -\twocolumn -%%\node Type Index, Concept Index, Variable Index, Top -%%\unnumbered{Type Index} -\cindex{Type Index} - -%%\printindex{tp} -\printindex[types] - -%%\node Concept Index, , Type Index, Top -%%\unnumbered{Concept Index} -\cindex{Concept Index} - -%%\printindex{cp} -\onecolumn -\printindex[concept] -\end{document} diff --git a/doc/cmucl/internals/SBCL-README b/doc/cmucl/internals/SBCL-README deleted file mode 100644 index e541e51..0000000 --- a/doc/cmucl/internals/SBCL-README +++ /dev/null @@ -1,2 +0,0 @@ -things from here which are invaluable for understanding current SBCL: - object.tex diff --git a/doc/cmucl/internals/addenda b/doc/cmucl/internals/addenda deleted file mode 100644 index 0facfc4..0000000 --- a/doc/cmucl/internals/addenda +++ /dev/null @@ -1,16 +0,0 @@ -the function calling convention - -%ECX is used for a count of function argument words, represented as a -fixnum, so it can also be thought of as a count of function argument -bytes. - -The first three arguments are stored in registers. The remaining -arguments are stored on the stack. - -The comments at the head of DEFINE-VOP (MORE-ARG) explain that -;;; More args are stored contiguously on the stack, starting immediately at the -;;; context pointer. The context pointer is not typed, so the lowtag is 0. - -?? 
Once we switch into more-arg arrangement, %ecx no longer seems to be - used for argument count (judging from my walkthrough of kw arg parsing - code while troubleshooting cold boot problems) \ No newline at end of file diff --git a/doc/cmucl/internals/architecture.tex b/doc/cmucl/internals/architecture.tex deleted file mode 100644 index 8eb24e5..0000000 --- a/doc/cmucl/internals/architecture.tex +++ /dev/null @@ -1,308 +0,0 @@ -\part{System Architecture}% -*- Dictionary: int:design -*- - -\chapter{Package and File Structure} - -\section{RCS and build areas} - -The CMU CL sources are maintained using RCS in a hierarchical directory -structure which supports: -\begin{itemize} -\item shared RCS config file across a build area, - -\item frozen sources for multiple releases, and - -\item separate system build areas for different architectures. -\end{itemize} - -Since this organization maintains multiple copies of the source, it is somewhat -space intensive. But it is easy to delete and later restore a copy of the -source using RCS snapshots. - -There are three major subtrees of the root \verb|/afs/cs/project/clisp|: -\begin{description} -\item[rcs] holds the RCS source (suffix \verb|,v|) files. - -\item[src] holds ``checked out'' (but not locked) versions of the source files, -and is subdivided by release. Each release directory in the source tree has a -symbolic link named ``{\tt RCS}'' which points to the RCS subdirectory of the -corresponding directory in the ``{\tt rcs}'' tree. At top-level in a source tree -is the ``{\tt RCSconfig}'' file for that area. All subdirectories also have a -symbolic link to this RCSconfig file, allowing the configuration for an area to -be easily changed. - -\item[build] compiled object files are placed in this tree, which is subdivided -by machine type and version. The CMU CL search-list mechanism is used to allow -the source files to be located in a different tree than the object files.
C -programs are compiled by using the \verb|tools/dupsrcs| command to make -symbolic links to the corresponding source tree. -\end{description} - -In order to modify a file in RCS, it must be checked out with a lock to -produce a writable working file. Each programmer checks out files into a -personal ``play area'' subtree of \verb|clisp/hackers|. These trees duplicate -the structure of source trees, but are normally empty except for files actively -being worked on. - -See \verb|/afs/cs/project/clisp/pmax_mach/alpha/tools/| for -various tools we use for RCS hacking: -\begin{description} -\item[rcs.lisp] Hemlock (editor) commands for RCS file manipulation - -\item[rcsupdate.c] Program to check out all files in a tree that have been -modified since last checkout. - -\item[updates] Shell script to produce a single listing of all RCS log - entries in a tree since a date. - -\item[snapshot-update.lisp] Lisp program to generate a shell script which -generates a listing of updates since a particular RCS snapshot ({\tt RCSSNAP}) -file was created. -\end{description} - -You can easily operate on all RCS files in a subtree using: -\begin{verbatim} -find . -follow -name '*,v' -exec {} \; -\end{verbatim} - -\subsection{Configuration Management} - -Config files are useful, especially in combination with ``{\tt snapshot}''. You -can snapshot any particular version, giving an RCSconfig that designates that -configuration. You can also use config files to specify the system as of a -particular date. For example: -\begin{verbatim} -<3-jan-91 -\end{verbatim} -in the config file will cause the version as of 3-jan-91 to be checked -out, instead of the latest version. - -\subsection{RCS Branches} - -Branches and named revisions are used together to allow multiple paths of -development to be supported. Each separate development has a branch, and each -branch has a name.
This project uses branches in two somewhat different cases -of divergent development: -\begin{itemize} -\item For systems that we have imported from the outside, we generally assign a -``{\tt cmu}'' branch for our local modifications. When a new release comes -along, we check it in on the trunk, and then merge our branch back in. - -\item For the early development and debugging of major system changes, where -the development and debugging is expected to take long enough that we wouldn't -want the trunk to be in an inconsistent state for that long. -\end{itemize} - -\section{Releases} - -We name releases according to the normal alpha, beta, default convention. -Alpha releases are frequent, intended primarily for internal use, and are thus -not subject to as high documentation and configuration management -standards. Alpha releases are designated by the date on which the system was -built; the alpha releases for different systems may not be in exact -correspondence, since they are built at different times. - -Beta and default releases are always based on a snapshot, ensuring that all -systems are based on the same sources. A release name is an integer and a -letter, like ``15d''. The integer is the name of the source tree which the -system was built from, and the letter represents the release from that tree: -``a'' is the first release, etc. Generally the numeric part increases when -there are major system changes, whereas changes in the letter represent -bug-fixes and minor enhancements. - -\section{Source Tree Structure} - -A source tree (and the master ``{\tt rcs}'' tree) has subdirectories for each -major subsystem: -\begin{description} -\item[{\tt assembly/}] Holds the CMU CL source-file assembler, and has machine -specific subdirectories holding assembly code for that architecture. - -\item[{\tt clx/}] The CLX interface to the X11 window system. - -\item[{\tt code/}] The Lisp code for the runtime system and standard CL -utilities.
- -\item[{\tt compiler/}] The Python compiler. Has architecture-specific -subdirectories which hold backends for different machines. The {\tt generic} -subdirectory holds code that is shared across most backends. - -\item[{\tt hemlock/}] The Hemlock editor. - -\item[{\tt lisp/}] The C runtime system code and low-level Lisp debugger. - -\item[{\tt pcl/}] CMU version of the PCL implementation of CLOS. - -\item[{\tt tools/}] System building command files and source management tools. -\end{description} - - -\section{Package structure} - -Goals: with the single exception of LISP, we want to be able to export from the -package that the code lives in. - -\begin{description} -\item[Mach, CLX...] --- These implementation-dependent system-interface -packages provide direct access to specific features available in the operating -system environment, but hide details of how OS communication is done. - -\item[system] contains code that must know about the operating system -environment: I/O, etc. Hides the operating system environment. Provides OS -interface extensions such as {\tt print-directory}, etc. - -\item[kernel] hides state and types used for system integration: package -system, error system, streams (?), reader, printer. Also, hides the VM, in -that we don't export anything that reveals the VM interface. Contains code -that needs to use the VM and SYSTEM interface, but is independent of OS and VM -details. This code shouldn't need to be changed in any port of CMU CL, but -won't work when plopped into an arbitrary CL. Uses SYSTEM, VM, EXTENSIONS. We -export "hidden" symbols related to implementation of CL: setf-inverses, -possibly some global variables. - -The boundary between KERNEL and VM is fuzzy, but this fuzziness reflects the -fuzziness in the definition of the VM. We can make the VM large, and bring -everything inside, or we can make it small. Obviously, we want the VM to be -as small as possible, subject to efficiency constraints.
Pretty much all of -the code in KERNEL could be put in VM. The issue is more what VM hides from -KERNEL: VM knows about everything. - -\item[lisp] Originally, this package had all the system code in it. The -current ideal is that this package should have {\it no} code in it, and only -exist to export the standard interface. Note that the name has been changed by -x3j13 to common-lisp. - -\item[extensions] contains code that any random user could have written: list -operations, syntactic sugar macros. Uses only LISP, so code in EXTENSIONS is -pure CL. Exports everything defined within that is useful elsewhere. This -package doesn't hide much, so it is relatively safe for users to use -EXTENSIONS, since they aren't getting anything they couldn't have written -themselves. Contrast this to KERNEL, which exports additional operations on -CL's primitive data structures: PACKAGE-INTERNAL-SYMBOL-COUNT, etc. Although -some of the functionality exported from KERNEL could have been defined in CL, -the kernel implementation is much more efficient because it knows about -implementation internals. Currently this package contains only extensions to -CL, but in the ideal scheme of things, it should contain the implementations of -all CL functions that are in KERNEL (the library.) - -\item[VM] hides information about the hardware and data structure -representations. Contains all code that knows about this sort of thing: parts -of the compiler, GC, etc. The bulk of the code is the compiler back-end. -Exports useful things that are meaningful across all implementations, such as -operations for examining compiled functions, system constants. Uses COMPILER -and whatever else it wants. Actually, there are different {\it machine}{\tt --VM} packages for each target implementation. VM is a nickname for whatever -implementation we are currently targeting for. - - -\item[compiler] hides the algorithms used to map Lisp semantics onto the -operations supplied by the VM. 
Exports the mechanisms used for defining the -VM. All the VM-independent code in the compiler, partially hiding the compiler -intermediate representations. Uses KERNEL. - -\item[eval] holds code that does direct execution of the compiler's ICR. Uses -KERNEL, COMPILER. Exports debugger interface to interpreted code. - -\item[debug-internals] presents a reasonable, unified interface to -manipulation of the state of both compiled and interpreted code. (could be in -KERNEL) Uses VM, INTERPRETER, EVAL, KERNEL. - -\item[debug] holds the standard debugger, and exports the debugger -\end{description} - -\chapter{System Building} - -It's actually rather easy to build a CMU CL core with exactly what you want in -it. But to do this you need two things: the source and a working CMU CL. - -Basically, you use the working copy of CMU CL to compile the sources, -then run a process called ``genesis'' which builds a ``kernel'' core. -You then load whatever you want into this kernel core, and save it. - -In the \verb|tools/| directory in the sources there are several files that -compile everything, and build cores, etc. The first step is to compile the C -startup code. - -{\bf Note:} {\it the various scripts mentioned below have hard-wired paths in -them set up for our directory layout here at CMU. Anyone anywhere else will -have to edit them before they will work.} - -\section{Compiling the C Startup Code} - -There is a circular dependency between lisp/internals.h and lisp/lisp.map that -causes bootstrapping problems.
The easiest way to get around this problem -is to make a fake lisp.nm file that has nothing in it but a version number: - -\begin{verbatim} - % echo "Map file for lisp version 0" > lisp.nm -\end{verbatim} -and then run genesis with NIL for the list of files: -\begin{verbatim} - * (load ".../compiler/generic/new-genesis") ; compile before loading - * (lisp::genesis nil ".../lisp/lisp.nm" "/dev/null" - ".../lisp/lisp.map" ".../lisp/lisp.h") -\end{verbatim} -It will generate -a whole bunch of warnings about things being undefined, but ignore -that, because it will also generate a correct lisp.h. You can then -compile lisp producing a correct lisp.map: -\begin{verbatim} - % make -\end{verbatim} -and then use \verb|tools/do-worldbuild| and \verb|tools/mk-lisp| to build -\verb|kernel.core| and \verb|lisp.core| (see section \ref{building-cores}.) - -\section{Compiling the Lisp Code} - -The \verb|tools| directory contains various lisp and C-shell utilities for -building CMU CL: -\begin{description} -\item[compile-all*] Will compile lisp files and build a kernel core. It has -numerous command-line options to control what to compile and how. Try -help to -see a description. It runs a separate Lisp process to compile each -subsystem. Error output is generated in files with ``{\tt .log}'' extension in -the root of the build area. - -\item[setup.lisp] Some lisp utilities used for compiling changed files in batch -mode and collecting the error output. Sort of a crude defsystem. Loads into the -``user'' package. See {\tt with-compiler-log-file} and {\tt comf}. - -\item[{\it foo}com.lisp] Each system has a ``\verb|.lisp|'' file in -\verb|tools/| which compiles that system. -\end{description} - -\section{Building Core Images} -\label{building-cores} -Both the kernel and final core build are normally done using shell script -drivers: -\begin{description} -\item[do-worldbuild*] Builds a kernel core for the current machine.
The -version to build is indicated by an optional argument, which defaults to -``alpha''. The \verb|kernel.core| file is written either in the \verb|lisp/| -directory in the build area, or in \verb|/usr/tmp/|. The directory which -already contains \verb|kernel.core| is chosen. You can create a dummy version -with e.g. ``touch'' to select the initial build location. - -\item[mk-lisp*] Builds a full core, with conditional loading of subsystems. -The version is the first argument, which defaults to ``alpha''. Any additional -arguments are added to the \verb|*features*| list, which controls system -loading (among other things.) The \verb|lisp.core| file is written in the -current working directory. -\end{description} - -These scripts load Lisp command files. When \verb|tools/worldbuild.lisp| is -loaded, it calls genesis with the correct arguments to build a kernel core. -Similarly, \verb|worldload.lisp| -builds a full core. Adding certain symbols to \verb|*features*| before -loading worldload.lisp suppresses loading of different parts of the -system. These symbols are: -\begin{description} -\item[:no-compiler] don't load the compiler. -\item[:no-clx] don't load CLX. -\item[:no-hemlock] don't load hemlock. -\item[:no-pcl] don't load PCL. -\item[:runtime] build a runtime code, implies all of the above, and then some. -\end{description} - -Note: if you don't load the compiler, you can't (successfully) load the -pretty-printer or pcl. And if you compiled hemlock with CLX loaded, you can't -load it without CLX also being loaded. diff --git a/doc/cmucl/internals/back.tex b/doc/cmucl/internals/back.tex deleted file mode 100644 index edeff46..0000000 --- a/doc/cmucl/internals/back.tex +++ /dev/null @@ -1,725 +0,0 @@ -% -*- Dictionary: design -*- - -\chapter{Copy propagation} - -File: {\tt copyprop} - -This phase is optional, but should be done whenever speed or space is more -important than compile speed. 
We use global flow analysis to find the reaching -definitions for each TN. This information is used here to eliminate -unnecessary TNs, and is also used later on by loop invariant optimization. - -In some cases, VMR conversion will unnecessarily copy the value of a TN into -another TN, since it may not be able to tell that the initial TN has the same -value at the time the second TN is referenced. This can happen when ICR -optimize is unable to eliminate a trivial variable binding, or when the user -does a setq, or may also result from creation of expression evaluation -temporaries during VMR conversion. Whatever the cause, we would like to avoid -the unnecessary creation and assignment of these TNs. - -What we do is replace TN references whose only reaching definition is a Move -VOP with a reference to the TN moved from, and then delete the Move VOP if the -copy TN has no remaining references. There are several restrictions on copy -propagation: -\begin{itemize} -\item The TNs must be ``ordinary'' TNs, not restricted or otherwise -unusual. Extending the life of restricted (or wired) TNs can make register -allocation impossible. Some other TN kinds have hidden references. - -\item We don't want to defeat source-level debugging by replacing named -variables with anonymous temporaries. - -\item We can't delete moves that representation selection might want to change -into a representation conversion, since we need the primitive types of both TNs -to select a conversion. -\end{itemize} - -Some cleverness reduces the cost of flow analysis. As for lifetime analysis, -we only need to do flow analysis on global packed TNs. We can't do the real -local TN assignment pass before this, since we allocate TNs afterward, so we do -a pre-pass that marks the TNs that are local for our purposes. We don't care -if block splitting eventually causes some of them to be considered global.
- -Note also that we are really only interested in knowing if there is a -unique reaching definition, which we can mash into our flow analysis rules by -doing an intersection. Then a definition only appears in the set when it is -unique. We then propagate only definitions of TNs with only one write, which -allows the TN to stand for the definition. - - -\chapter{Representation selection} - -File: {\tt represent} - -Some types of object (such as {\tt single-float}) have multiple possible -representations. Multiple representations are useful mainly when there is a -particularly efficient non-descriptor representation. In this case, there is -the normal descriptor representation, and an alternate non-descriptor -representation. - -This possibility brings up two major issues: -\begin{itemize} -\item The compiler must decide which representation will be most efficient for -any given value, and - -\item Representation conversion code must be inserted where the representation -of a value is changed. -\end{itemize} -First, the representations for TNs are selected by examining all the TN -references and attempting to minimize reference costs. Then representation -conversion code is introduced. - -This phase is in effect a pre-pass to register allocation. The main reason for -its existence is that representation conversions may be fairly complex (e.g. -involving memory allocation), and thus must be discovered before register -allocation. - - -VMR conversion leaves stubs for representation specific move operations. -Representation selection recognizes {\tt move} by name. Argument and return -value passing for call VOPs is controlled by the {\tt :move-arguments} option -to {\tt define-vop}. - -Representation selection is also responsible for determining what functions use -the number stack. If any representation is chosen which could involve packing -into the {\tt non-descriptor-stack} SB, then we allocate the NFP register -throughout the component.
As an optimization, permit the decision of whether a -number stack frame needs to be allocated to be made on a per-function basis. -If a function doesn't use the number stack, and isn't in the same tail-set as -any function that uses the number stack, then it doesn't need a number stack -frame, even if other functions in the component do. - - -\chapter{Lifetime analysis} - -File: {\tt life} - -This phase is a preliminary to Pack. It involves three passes: - -- A pre-pass that computes the DEF and USE sets for live TN analysis, while - also assigning local TN numbers, splitting blocks if necessary. \#\#\# But -not really... - -- A flow analysis pass that does backward flow analysis on the - component to find the live TNs at each block boundary. - -- A post-pass that finds the conflict set for each TN. - -\#| -Exploit the fact that a single VOP can only exhaust LTN numbers when there are -large more operands. Since more operand reference cannot be interleaved with -temporary reference, the references all effectively occur at the same time. -This means that we can assign all the more args and all the more results the -same LTN number and the same lifetime info. -|\# - - -\section{Flow analysis} - -It seems we could use the global-conflicts structures to compute the -inter-block lifetime information. The pre-pass creates all the -global-conflicts for blocks that global TNs are referenced in. The flow -analysis pass just adds always-live global-conflicts for the other blocks the -TNs are live in. In addition to possibly being more efficient than SSets, this -would directly result in the desired global-conflicts information, rather than -having to create it from another representation. - -The DFO sorted per-TN global-conflicts thread suggests some kind of algorithm -based on the manipulation of the sets of blocks each TN is live in (which is -what we really want), rather than the set of TNs live in each block.
- -If we sorted the per-TN global-conflicts in reverse DFO (which is just as good -for determining conflicts between TNs), then it seems we could scan through the -conflicts simultaneously with our flow-analysis scan through the blocks. - -The flow analysis step is the following: - If a TN is always-live or read-before-written in a successor block, then we - make it always-live in the current block unless there are already - global-conflicts recorded for that TN in this block. - -The iteration terminates when we don't add any new global-conflicts during a -pass. - -We may also want to promote TNs only read within a block to always-live when -the TN is live in a successor. This should be easy enough as long as the -global-conflicts structure contains this kind of info. - -The critical operation here is determining whether a given global TN has global -conflicts in a given block. Note that since we scan the blocks in DFO, and the -global-conflicts are sorted in DFO, if we give each global TN a pointer to the -global-conflicts for the last block we checked the TN was in, then we can -guarantee that the global-conflicts we are looking for are always at or after -that pointer. If we need to insert a new structure, then the pointer will help -us rapidly find the place to do the insertion.] - - -\section{Conflict detection} - -[\#\#\# Environment, :more TNs.] - -This phase makes use of the results of lifetime analysis to find the set of TNs -that have lifetimes overlapping with those of each TN. We also annotate call -VOPs with information about the live TNs so that code generation knows which -registers need to be saved. - -The basic action is a backward scan of each block, looking at each TN-Ref and -maintaining a set of the currently live TNs. When we see a read, we check if -the TN is in the live set.
If not, we: - -- Add the TN to the conflict set for every currently live TN, - -- Union the set of currently live TNs with the conflict set for the TN, and - -- Add the TN to the set of live TNs. - -When we see a write for a live TN, we just remove it from the live set. If we -see a write to a dead TN, then we update the conflicts sets as for a read, but -don't add the TN to the live set. We have to do this so that the bogus write -doesn't clobber anything. - -[We don't consider always-live TNs at all in this process, since the conflict -of always-live TNs with other TNs in the block is implicit in the -global-conflicts structures. - -Before we do the scan on a block, we go through the global-conflicts structures -of TNs that change liveness in the block, assigning the recorded LTN number to -the TN's LTN number for the duration of processing of that block.] - - -Efficiently computing and representing this information calls for some -cleverness. It would be prohibitively expensive to represent the full conflict -set for every TN with sparse sets, as is done at the block-level. Although it -wouldn't cause non-linear behavior, it would require a complex linked structure -containing tens of elements to be created for every TN. Fortunately we can -improve on this if we take into account the fact that most TNs are "local" TNs: -TNs which have all their uses in one block. - -First, many global TNs will be either live or dead for the entire duration of a -given block. We can represent the conflict between global TNs live throughout -the block and TNs local to the block by storing the set of always-live global -TNs in the block. This reduces the number of global TNs that must be -represented in the conflicts for local TNs. - -Second, we can represent conflicts within a block using bit-vectors. Each TN -that changes liveness within a block is assigned a local TN number. 
Local -conflicts are represented using a fixed-size bit-vector of 64 elements or so -which has a 1 for the local TN number of every TN live at that time. The block -has a simple-vector which maps from local TN numbers to TNs. Fixed-size -vectors reduce the hassle of doing allocations and allow operations to be -open-coded in a maximally tense fashion. - -We can represent the conflicts for a local TN by a single bit-vector indexed by -the local TN numbers for that block, but in the global TN case, we need to be -able to represent conflicts with arbitrary TNs. We could use a list-like -sparse set representation, but then we would have to either special-case global -TNs by using the sparse representation within the block, or convert the local -conflicts bit-vector to the sparse representation at the block end. Instead, -we give each global TN a list of the local conflicts bit-vectors for each block -that the TN is live in. If the TN is always-live in a block, then we record -that fact instead. This gives us a major reduction in the amount of work we -have to do in lifetime analysis at the cost of some increase in the time to -iterate over the set during Pack. - -Since we build the lists of local conflict vectors a block at a time, the -blocks in the lists for each TN will be sorted by the block number. The -structure also contains the local TN number for the TN in that block. These -features allow pack to efficiently determine whether two arbitrary TNs -conflict. You just scan the lists in order, skipping blocks that are in only -one list by using the block numbers. When we find a block that both TNs are -live in, we just check the local TN number of one TN in the local conflicts -vector of the other. - -In order to do these optimizations, we must do a pre-pass that finds the -always-live TNs and breaks blocks up into small enough pieces so that we don't -run out of local TN numbers. 
If we can make a block arbitrarily small, then we -can guarantee that an arbitrarily small number of TNs change liveness within -the block. We must be prepared to make the arguments to unbounded arg count -VOPs (such as function call) always-live even when they really aren't. This is -enabled by a panic mode in the block splitter: if we discover that the block -only contains one VOP and there are still too many TNs that aren't always-live, -then we promote the arguments (which we'd better be able to do...). - -This is done during the pre-scan in lifetime analysis. We can do this because -all TNs that change liveness within a block can be found by examining that -block: the flow analysis only adds always-live TNs. - - -When we are doing the conflict detection pass, we set the LTN number of global -TNs. We can easily detect global TNs that have not been locally mapped because -this slot is initially null for global TNs and we null it out after processing -each block. We assign all Always-Live TNs to the same local number so that we -don't need to treat references to them specially when making the scan. - -We also annotate call VOPs that do register saving with the TNs that are live -during the call, and thus would need to be saved if they are packed in -registers. - -We adjust the costs for TNs that need to be saved so that TNs costing more to -save and restore than to reference get packed on the stack. We would also like -more often saved TNs to get higher costs so that they are packed in more -savable locations. - - -\chapter{Packing} - -File: {\tt pack} - -\#| - -Add lifetime/pack support for pre-packed save TNs. - -Fix GTN/VMR conversion to use pre-packed save TNs for old-cont and return-PC. -(Will prevent preference from passing location to save location from ever being -honored?) 
- -We will need to make packing of passing locations smarter before we will be -able to target the passing location on the stack in a tail call (when that is -where the callee wants it.) Currently, we will almost always pack the passing -location in a register without considering whether that is really a good idea. -Maybe we should consider schemes that explicitly understand the parallel -assignment semantics, and try to do the assignment with a minimum number of -temporaries. We only need assignment temps for TNs that appear both as an -actual argument value and as a formal parameter of the called function. This -only happens in self-recursive functions. - -Could be a problem with lifetime analysis, though. The write by a move-arg VOP -would look like a write in the current env, when it really isn't. If this is a -problem, then we might want to make the result TN be an info arg rather than a -real operand. But this would only be a problem in recursive calls, anyway. -[This would prevent targeting, but targeting across passing locations rarely -seems to work anyway.] [\#\#\# But the :ENVIRONMENT TN mechanism would get -confused. Maybe put env explicitly in TN, and have it only always-live in that -env, and normal in other envs (or blocks it is written in.) This would allow -targeting into environment TNs. - -I guess we would also want the env/PC save TNs normal in the return block so -that we can target them. We could do this by considering env TNs normal in -read blocks with no successors. - -ENV TNs would be treated totally normally in non-env blocks, so we don't have -to worry about lifetime analysis getting confused by variable initializations. -Do some kind of TN costing to determine when it is more trouble than it is -worth to allocate TNs in registers. - -Change pack ordering to be less pessimal. 
Pack TNs as they are seen in the LTN -map in DFO, which at least in non-block compilations has an effect something -like packing main trace TNs first, since control analysis tries to put the good -code first. This could also reduce spilling, since it makes it less likely we -will clog all registers with global TNs. - -If we pack a TN with a specified save location on the stack, pack in the -specified location. - -Allow old-cont and return-pc to be kept in registers by adding a new "keep -around" kind of TN. These are kind of like environment live, but are only -always-live in blocks that they weren't referenced in. Lifetime analysis does -a post-pass adding always-live conflicts for each "keep around" TN to those -blocks with no conflict for that TN. The distinction between always-live and -keep-around allows us to successfully target old-cont and return-pc to passing -locations. MAKE-KEEP-AROUND-TN (ptype), PRE-PACK-SAVE-TN (tn scn offset). -Environment needs a KEEP-AROUND-TNS slot so that conflict analysis can find -them (no special casing is needed after that; they can be made with :NORMAL -kind). VMR-component needs PRE-PACKED-SAVE-TNS so that conflict analysis or -somebody can copy conflict info from the saved TN. - - - -Note that having block granularity in the conflict information doesn't mean -that a localized packing scheme would have to do all moves at block boundaries -(which would clash with the desire to have saving done as part of this -mechanism.) All that it means is that if we want to do a move within the -block, we would need to allocate both locations throughout that block (or -something). - - - - - -Load TN pack: - -A location is out for load TN packing if: - -The location has a TN live in it after the VOP for a result, or before the VOP -for an argument, or - -The location is used earlier in the TN-ref list (after) the saved results ref -or later in the TN-Ref list (before) the loaded argument's ref. 
- -To pack load TNs, we advance the live-tns to the interesting VOP, then -repeatedly scan the vop-refs to find vop-local conflicts for each needed load -TN. We insert move VOPs and change over the TN-Ref-TNs as we go so the TN-Refs -will reflect conflicts with already packed load-TNs. - -If we fail to pack a load-TN in the desired SC, then we scan the Live-TNs for -the SB, looking for a TN that can be packed in an unbounded SB. This TN must -then be repacked in the unbounded SB. It is important that load-TNs are never -packed in unbounded SBs, since that would invalidate the conflicts info, -preventing us from repacking TNs in unbounded SBs. We can't repack in a finite -SB, since there might have been load TNs packed in that SB which aren't -represented in the original conflict structures. - -Is it permissible to "restrict" an operand to an unbounded SC? Not impossible -to satisfy as long as a finite SC is also allowed. But in practice, no -restriction would probably be as good. - -We assume all locations can be used when an SC is based on an unbounded SB. - -] - - -TN-Refs are convenient structures to build the target graph out of. If we -allocated space in every TN-Ref, then there would certainly be enough to -represent arbitrary target graphs. Would it be enough to allocate a single -Target slot? If there is a target path through a given VOP, then the Target of -the write ref would be the read, and vice-versa. To find all the TNs that -target us, we look at the TN for the target of all our write refs. - -We separately chain together the read refs and the write refs for a TN, -allowing easy determination of things such as whether a TN has only a single -definition or has no reads. It would also allow easier traversal of the target -graph. - -Represent per-location conflicts as vectors indexed by block number of -per-block conflict info. 
To test whether a TN conflicts on a location, we -would then have to iterate over the TN's global-conflicts, using the block -number and LTN number to check for a conflict in that block. But since most -TNs are local, this test actually isn't much more expensive than indexing into -a bit-vector by GTN numbers. - -The big win of this scheme is that it is much cheaper to add conflicts into the -conflict set for a location, since we never need to actually compute the -conflict set in a list-like representation (which requires iterating over the -LTN conflicts vectors and unioning in the always-live TNs). Instead, we just -iterate over the global-conflicts for the TN, using BIT-IOR to combine the -conflict set with the bit-vector for that block in that location, or marking -that block/location combination as being always-live if the conflict is -always-live. - -Generating the conflict set is inherently more costly, since although we -believe the conflict set size to be roughly constant, it can easily contain -tens of elements. We would have to generate these moderately large lists for -all TNs, including local TNs. In contrast, the proposed scheme does work -proportional to the number of blocks the TN is live in, which is small on -average (1 for local TNs). This win exists independently from the win of not -having to iterate over LTN conflict vectors. - - -[\#\#\# Note that since we never do bitwise iteration over the LTN conflict -vectors, part of the motivation for keeping these a small fixed size has been -removed. But it would still be useful to keep the size fixed so that we can -easily recycle the bit-vectors, and so that we could potentially have maximally -tense special primitives for doing clear and bit-ior on these vectors.] - -This scheme is somewhat more space-intensive than having a per-location -bit-vector. 
Each vector entry would be something like 150 bits rather than one -bit, but this is mitigated by the number of blocks being 5-10x smaller than the -number of TNs. This seems like an acceptable overhead, a small fraction of the -total VMR representation. - -The space overhead could also be reduced by using something equivalent to a -two-dimensional bit array, indexed first by LTN numbers, and then block numbers -(instead of using a simple-vector of separate bit-vectors.) This would -eliminate space wastage due to bit-vector overheads, which might be 50% or -more, and would also make efficient zeroing of the vectors more -straightforward. We would then want efficient operations for OR'ing LTN -conflict vectors with rows in the array. - -This representation also opens a whole new range of allocation algorithms: ones -that allocate TNs in different locations within different portions of the -program. This is because we can now represent a location being used to hold a -certain TN within an arbitrary subset of the blocks the TN is referenced in. - - - - - - - - - -Pack goals: - -Pack should: - -Subject to resource constraints: - -- Minimize use costs - -- "Register allocation" - Allocate as many values as possible in scarce "good" locations, - attempting to minimize the aggregate use cost for the entire program. - -- "Save optimization" - Don't allocate values in registers when the save/restore costs exceed - the expected gain for keeping the value in a register. (Similar to - "opening costs" in RAOC.) [Really just a case of representation - selection.] - - -- Minimize preference costs - Eliminate as many moves as possible. - - -"Register allocation" is basically an attempt to eliminate moves between -registers and memory. "Save optimization" counterbalances "register -allocation" to prevent it from becoming a pessimization, since saves can -introduce register/memory moves. - -Preference optimization reduces the number of moves within an SC. 
Doing a good -job of honoring preferences is important to the success of the compiler, since -we have assumed in many places that moves will usually be optimized away. - -The scarcity-oriented aspect of "register allocation" is handled by a greedy -algorithm in pack. We try to pack the "most important" TNs first, under the -theory that earlier packing is more likely to succeed due to fewer constraints. - -The drawback of greedy algorithms is their inability to look ahead. Packing a -TN may mess up later "register allocation" by precluding packing of TNs that -are individually "less important", but more important in aggregate. Packing a -TN may also prevent preferences from being honored. - - - -Initial packing: - - -Pack all TNs restricted to a finite SC first, before packing any other TNs. - -One might suppose that Pack would have to treat TNs in different environments -differently, but this is not the case. Pack simply assigns TNs to locations so -that no two conflicting TNs are in the same location. In the process of -implementing call semantics in conflict analysis, we cause TNs in different -environments not to conflict. In the case of passing TNs, cross-environment -conflicts do exist, but this reflects reality, since the passing TNs are -live in both the caller and the callee. Environment semantics has already been -implemented at this point. - -This means that Pack can pack all TNs simultaneously, using one data structure -to represent the conflicts for each location. So we have only one conflict set -per SB location, rather than separating this information by -environment. - - -Load TN packing: - -We create load TNs as needed in a post-pass to the initial packing. After TNs -are packed, it may be that some references to a TN will require it to be in an -SC other than the one it was packed in. We create load-TNs and pack them on -the fly during this post-pass. - -What we do is have an optional SC restriction associated with TN-refs. 
If we -pack the TN in an SC which is different from the required SC for the reference, -then we create a load TN for each such reference, and pack it into the required SC. - -In many cases we will be able to pack the load TN with no hassle, but in -general we may need to spill a TN that has already been packed. We choose a -TN that isn't in use by the offending VOP, and then spill that TN onto the -stack for the duration of that VOP. If the VOP is a conditional, then we must -insert a new block interposed before the branch target so that the TN -value is restored regardless of which branch is taken. - -Instead of remembering lifetime information from conflict analysis, we rederive -it. We scan each block backward while keeping track of which locations have -live TNs in them. When we find a reference that needs a load TN packed, we try -to pack it in an unused location. If we can't, we unpack the currently live TN -with the lowest cost and force it into an unbounded SC. - -The per-location and per-TN conflict information used by pack doesn't -need to be updated when we pack a load TN, since we are done using those data -structures. - -We also don't need to create any TN-Refs for load TNs. [??? How do we keep -track of load-tn lifetimes? It isn't really that hard, I guess. We just -remember which load TNs we created at each VOP, killing them when we pass the -loading (or saving) step. This suggests we could flush the Refs thread if we -were willing to sacrifice some flexibility in explicit temporary lifetimes. -Flushing the Refs would make creating the VMR representation easier.] - -The lifetime analysis done during load-TN packing doubles as a consistency -check. If we see a read of a TN packed in a location which has a different TN -currently live, then there is a packing bug. If any of the TNs recorded as -being live at the block beginning are packed in a scarce SB, but aren't current -in that location, then we also have a problem. 
- -The conflict structure for load TNs is fairly simple: the load TNs for -arguments and results all conflict with each other, and don't conflict with -much else. We just try packing in targeted locations before trying at random. - - - -\chapter{Code generation} - -This is fairly straightforward. We translate VOPs into instruction sequences -on a per-block basis. - -After code generation, the VMR representation is gone. Everything is -represented by the assembler data structures. - - -\chapter{Assembly} - -In effect, we do much of the work of assembly when the compiler is compiled. - -The assembler makes one pass fixing up branch offsets, then squeezes out the -space left by branch shortening and dumps out the code along with the load-time -fixup information. The assembler also deals with dumping unboxed non-immediate -constants and symbols. Boxed constants are created by explicit constructor -code in the top-level form, while immediate constants are generated using -inline code. - -[\#\#\# The basic output of the assembler is: - A code vector - A representation of the fixups along with indices into the code vector for - the fixup locations - A PC map translating PCs into source paths - -This information can then be used to build an output file or an in-core -function object. -] - -The assembler is table-driven and supports arbitrary instruction formats. As -far as the assembler is concerned, an instruction is a bit sequence that is -broken down into subsequences. Some of the subsequences are constant in value, -while others can be determined at assemble or load time. - -Assemble Node Form* - Allow instructions to be emitted during the evaluation of the Forms by - defining Inst as a local macro. This macro caches various global - information in local variables. Node tells the assembler what node - ultimately caused this code to be generated. This is used to create the - pc=>source map for the debugger. 
- -Assemble-Elsewhere Node Form* - Similar to Assemble, but the current assembler location is changed to - somewhere else. This is useful for generating error code and similar - things. Assemble-Elsewhere may not be nested. - -Inst Name Arg* - Emit the instruction Name with the specified arguments. - -Gen-Label -Emit-Label (Label) - Gen-Label returns a Label object, which describes a place in the code. - Emit-Label marks the current position as being the location of Label. - - - -\chapter{Dumping} - -So far as input to the dumper/loader, how about having a list of Entry-Info -structures in the VMR-Component? These structures contain all information -needed to dump the associated function objects, and are only implicitly -associated with the functional/XEP data structures. Load-time constants that -reference these function objects should specify the Entry-Info, rather than the -functional (or something). We would then need to maintain some sort of -association so VMR conversion can find the appropriate Entry-Info. -Alternatively, we could initially reference the functional, and then later -clobber the reference to the Entry-Info. - -We have some kind of post-pass that runs after assembly, going through the -functions and constants, annotating the VMR-Component for the benefit of the -dumper: - Resolve :Label load-time constants. - Make the debug info. - Make the entry-info structures. - -Fasl dumper and in-core loader are implementation (but not instruction set) -dependent, so we want to give them a clear interface. - -open-fasl-file name => fasl-file - Returns a "fasl-file" object representing all state needed by the dumper. - We objectify the state, since the fasdumper should be reentrant. (but - could fail to be at first.) - -close-fasl-file fasl-file abort-p - Close the specified fasl-file. - -fasl-dump-component component code-vector length fixups fasl-file - Dump the code, constants, etc. for component. Code-Vector is a vector - holding the assembled code. 
Length is the number of elements of Code-Vector - that are actually in use. Fixups is a list of conses (offset . fixup) - describing the locations and things that need to be fixed up at load time. - If the component is a top-level component, then the top-level lambda will - be called after the component is loaded. - -load-component component code-vector length fixups - Like Fasl-Dump-Component, but directly installs the code in core, running - any top-level code immediately. (???) but we need some way to glue - together the components, since we don't have a fasl table. - - - -Dumping: - -Dump code for each component after compiling that component, but defer dumping -of other stuff. We do the fixups on the code vectors, and accumulate them in -the table. - -We have to grovel the constants for each component after compiling that -component so that we can fix up load-time constants. Load-time constants are -values needed by the code that are computed after code generation/assembly -time. Since the code is fixed at this point, load-time constants are always -represented as non-immediate constants in the constant pool. A load-time -constant is distinguished by being a cons (Kind . What), instead of a Constant -leaf. Kind is a keyword indicating how the constant is computed, and What is -some context. - -Some interesting load-time constants: - - (:label .