doc/cmucl/internals/architecture.tex

   1 \part{System Architecture}% -*- Dictionary: int:design -*-
   2
   3 \chapter{Package and File Structure}
   4
   5 \section{RCS and build areas}
   6
   7 The CMU CL sources are maintained using RCS in a hierarchical directory
   8 structure which supports:
   9 \begin{itemize}
  10 \item    shared RCS config file across a build area,
  11
  12 \item    frozen sources for multiple releases, and
  13
  14 \item    separate system build areas for different architectures.
  15 \end{itemize}
  16
  17 Since this organization maintains multiple copies of the source, it is somewhat
  18 space intensive.  But it is easy to delete and later restore a copy of the
  19 source using RCS snapshots.
  20
  21 There are three major subtrees of the root \verb|/afs/cs/project/clisp|:
  22 \begin{description}
  23 \item[rcs] holds the RCS source (suffix \verb|,v|) files.
  24
  25 \item[src] holds ``checked out'' (but not locked) versions of the source files,
  26 and is subdivided by release.  Each release directory in the source tree has a
  27 symbolic link named ``{\tt RCS}'' which points to the RCS subdirectory of the
  28 corresponding directory in the ``{\tt rcs}'' tree.  At top-level in a source tree
  29 is the ``{\tt RCSconfig}'' file for that area.  All subdirectories also have a
  30 symbolic link to this RCSconfig file, allowing the configuration for an area to
  31 be easily changed.
  32
  33 \item[build] holds the compiled object files, and is subdivided
  34 by machine type and version.  The CMU CL search-list mechanism is used to allow
  35 the source files to be located in a different tree than the object files.  C
  36 programs are compiled by using the \verb|tools/dupsrcs| command to make
  37 symbolic links to the corresponding source tree.
  38 \end{description}
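
The link structure described above can be sketched in a scratch directory.
This is only an illustration: the release name {\tt 15}, the subsystem
{\tt code} and the file {\tt example.lisp,v} are assumptions chosen for the
example, not names taken from the real tree.

```shell
#!/bin/sh
# Sketch of the rcs/src/build layout, built under a scratch root.
# "15", "code" and "example.lisp,v" are hypothetical names for illustration.
make_layout() {
    root=$1
    mkdir -p "$root/rcs/code/RCS" "$root/src/15/code" "$root/build"
    # An (empty) stand-in for an RCS source file in the rcs tree.
    : > "$root/rcs/code/RCS/example.lisp,v"
    # The RCSconfig file lives at the top level of the release's source tree.
    : > "$root/src/15/RCSconfig"
    # Each source directory links to the matching RCS directory ...
    ln -s "$root/rcs/code/RCS" "$root/src/15/code/RCS"
    # ... and to the single RCSconfig file for the area.
    ln -s ../RCSconfig "$root/src/15/code/RCSconfig"
}

layout_root=$(mktemp -d)
make_layout "$layout_root"
ls -l "$layout_root/src/15/code"
```

Because every subdirectory reaches the same {\tt RCSconfig} through a link,
editing that one file changes the configuration for the whole area.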
  39
  40 In order to modify a file in RCS, it must be checked out with a lock to
  41 produce a writable working file.  Each programmer checks out files into a
  42 personal ``play area'' subtree of \verb|clisp/hackers|.  These trees duplicate
  43 the structure of source trees, but are normally empty except for files actively
  44 being worked on.
  45
  46 See \verb|/afs/cs/project/clisp/pmax_mach/alpha/tools/| for
  47 various tools we use for RCS hacking:
  48 \begin{description}
  49 \item[rcs.lisp] Hemlock (editor) commands for RCS file manipulation
  50
  51 \item[rcsupdate.c] Program to check out all files in a tree that have been
  52 modified since last checkout.
  53
  54 \item[updates] Shell script to produce a single listing of all RCS log
  55  entries in a tree since a date.
  56
  57 \item[snapshot-update.lisp] Lisp program to generate a shell script which
  58 generates a listing of updates since a particular RCS snapshot ({\tt RCSSNAP})
  59 file was created.
  60 \end{description}
  61
  62 You can easily operate on all RCS files in a subtree using:
  63 \begin{verbatim}
  64 find . -follow -name '*,v' -exec <some command> {} \;
  65 \end{verbatim}
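
As one concrete instance of the pattern above, the following sketch uses
{\tt basename} as the command, printing the working-file name that corresponds
to each RCS file.  The function name and the scratch files are assumptions
made for the example.

```shell
#!/bin/sh
# Print the working-file name for every RCS file below a directory,
# by stripping the ",v" suffix.  The function name is hypothetical.
list_rcs_working_files() {
    find "$1" -follow -name '*,v' -exec basename {} ,v \;
}

# Demo on a scratch tree with two RCS files and one unrelated file.
rcs_demo=$(mktemp -d)
mkdir -p "$rcs_demo/code/RCS"
: > "$rcs_demo/code/RCS/list.lisp,v"
: > "$rcs_demo/code/RCS/print.lisp,v"
: > "$rcs_demo/code/notes.txt"
list_rcs_working_files "$rcs_demo" | sort
```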
  66
  67 \subsection{Configuration Management}
  68
  69 Config files are useful, especially in combination with ``{\tt snapshot}''.  You
  70 can snapshot any particular version, giving an RCSconfig that designates that
  71 configuration.  You can also use config files to specify the system as of a
  72 particular date.  For example:
  73 \begin{verbatim}
  74 <3-jan-91
  75 \end{verbatim}
  76 in the config file will cause the version as of 3-jan-91 to be checked
  77 out, instead of the latest version.
  78
  79 \subsection{RCS Branches}
  80
  81 Branches and named revisions are used together to allow multiple paths of
  82 development to be supported.  Each separate development has a branch, and each
  83 branch has a name.  This project uses branches in two somewhat different cases
  84 of divergent development:
  85 \begin{itemize}
  86 \item For systems that we have imported from the outside, we generally assign a
  87 ``{\tt cmu}'' branch for our local modifications.  When a new release comes
  88 along, we check it in on the trunk, and then merge our branch back in.
  89
  90 \item For the early development and debugging of major system changes, where
  91 the development and debugging is expected to take long enough that we wouldn't
  92 want the trunk to be in an inconsistent state for that long.
  93 \end{itemize}
  94
  95 \section{Releases}
  96
  97 We name releases according to the normal alpha, beta, default convention.
  98 Alpha releases are frequent, intended primarily for internal use, and are thus
  99 not subject to as high documentation and configuration management
 100 standards.  Alpha releases are designated by the date on which the system was
 101 built; the alpha releases for different systems may not be in exact
 102 correspondence, since they are built at different times.
 103
 104 Beta and default releases are always based on a snapshot, ensuring that all
 105 systems are based on the same sources.  A release name is an integer and a
 106 letter, like ``15d''.  The integer is the name of the source tree which the
 107 system was built from, and the letter represents the release from that tree:
 108 ``a'' is the first release, etc.  Generally the numeric part increases when
 109 there are major system changes, whereas changes in the letter represent
 110 bug-fixes and minor enhancements.
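
The naming convention can be made concrete with a small sketch that splits a
release name into its two parts.  It assumes a name is one or more digits
followed by exactly one lowercase letter, which is what the ``15d'' example
suggests; the function name is hypothetical.

```shell
#!/bin/sh
# Split a release name such as "15d" into the source-tree number and the
# release letter.  Prints "<tree> <letter>"; returns non-zero for any name
# that is not digits followed by a single lowercase letter (an assumption
# based on the "15d" example).
split_release_name() {
    tree=$(printf '%s\n' "$1" | sed -n 's/^\([0-9][0-9]*\)[a-z]$/\1/p')
    letter=$(printf '%s\n' "$1" | sed -n 's/^[0-9][0-9]*\([a-z]\)$/\1/p')
    [ -n "$tree" ] && [ -n "$letter" ] || return 1
    printf '%s %s\n' "$tree" "$letter"
}

split_release_name 15d
```

Alpha releases are named by build date rather than by this scheme, so the
function deliberately rejects them.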
 111
 112 \section{Source Tree Structure}
 113
 114 A source tree (and the master ``{\tt rcs}'' tree) has subdirectories for each
 115 major subsystem:
 116 \begin{description}
 117 \item[{\tt assembly/}] Holds the CMU CL source-file assembler, and has machine
 118 specific subdirectories holding assembly code for that architecture.
 119
 120 \item[{\tt clx/}] The CLX interface to the X11 window system.
 121
 122 \item[{\tt code/}] The Lisp code for the runtime system and standard CL
 123 utilities.
 124
 125 \item[{\tt compiler/}] The Python compiler.  Has architecture-specific
 126 subdirectories which hold backends for different machines.  The {\tt generic}
 127 subdirectory holds code that is shared across most backends.
 128
 129 \item[{\tt hemlock/}] The Hemlock editor.
 130
 131 \item[{\tt lisp/}] The C runtime system code and low-level Lisp debugger.
 132
 133 \item[{\tt pcl/}] CMU version of the PCL implementation of CLOS.
 134
 135 \item[{\tt tools/}] System building command files and source management tools.
 136 \end{description}
 137
 138
 139 \section{Package structure}
 140
 141 Goals: with the single exception of LISP, we want to be able to export from the
 142 package that the code lives in.
 143
 144 \begin{description}
 145 \item[Mach, CLX...] --- These implementation-dependent system-interface
 146 packages provide direct access to specific features available in the operating
 147 system environment, but hide details of how OS communication is done.
 148
 149 \item[system] contains code that must know about the operating system
 150 environment: I/O, etc.  Hides the operating system environment.  Provides OS
 151 interface extensions such as {\tt print-directory}, etc.
 152
 153 \item[kernel] hides state and types used for system integration: package
 154 system, error system, streams (?), reader, printer.  Also, hides the VM, in
 155 that we don't export anything that reveals the VM interface.  Contains code
 156 that needs to use the VM and SYSTEM interface, but is independent of OS and VM
 157 details.  This code shouldn't need to be changed in any port of CMU CL, but
 158 won't work when plopped into an arbitrary CL.  Uses SYSTEM, VM, EXTENSIONS.  We
 159 export "hidden" symbols related to implementation of CL: setf-inverses,
 160 possibly some global variables.
 161
 162 The boundary between KERNEL and VM is fuzzy, but this fuzziness reflects the
 163 fuzziness in the definition of the VM.  We can make the VM large, and bring
 164 everything inside, or we can make it small.  Obviously, we want the VM to be
 165 as small as possible, subject to efficiency constraints.  Pretty much all of
 166 the code in KERNEL could be put in VM.  The issue is more what VM hides from
 167 KERNEL: VM knows about everything.
 168
 169 \item[lisp]  Originally, this package had all the system code in it.  The
 170 current ideal is that this package should have {\it no} code in it, and only
 171 exist to export the standard interface.  Note that the name has been changed by
 172 x3j13 to common-lisp.
 173
 174 \item[extensions] contains code that any random user could have written: list
 175 operations, syntactic sugar macros.  Uses only LISP, so code in EXTENSIONS is
 176 pure CL.  Exports everything defined within that is useful elsewhere.  This
 177 package doesn't hide much, so it is relatively safe for users to use
 178 EXTENSIONS, since they aren't getting anything they couldn't have written
 179 themselves.  Contrast this to KERNEL, which exports additional operations on
 180 CL's primitive data structures: PACKAGE-INTERNAL-SYMBOL-COUNT, etc.  Although
 181 some of the functionality exported from KERNEL could have been defined in CL,
 182 the kernel implementation is much more efficient because it knows about
 183 implementation internals.  Currently this package contains only extensions to
 184 CL, but in the ideal scheme of things, it should contain the implementations of
 185 all CL functions that are in KERNEL (the library.)
 186
 187 \item[VM] hides information about the hardware and data structure
 188 representations.  Contains all code that knows about this sort of thing: parts
 189 of the compiler, GC, etc.  The bulk of the code is the compiler back-end.
 190 Exports useful things that are meaningful across all implementations, such as
 191 operations for examining compiled functions, system constants.  Uses COMPILER
 192 and whatever else it wants.  Actually, there are different {\it machine}{\tt
 193 -VM} packages for each target implementation.  VM is a nickname for whatever
 194 implementation we are currently targeting.
 195
 196
 197 \item[compiler] hides the algorithms used to map Lisp semantics onto the
 198 operations supplied by the VM.  Exports the mechanisms used for defining the
 199 VM.  All the VM-independent code in the compiler, partially hiding the compiler
 200 intermediate representations.  Uses KERNEL.
 201
 202 \item[eval] holds code that does direct execution of the compiler's ICR.  Uses
 203 KERNEL, COMPILER.  Exports debugger interface to interpreted code.
 204
 205 \item[debug-internals] presents a reasonable, unified interface to
 206 manipulation of the state of both compiled and interpreted code.  (could be in
 207 KERNEL) Uses VM, INTERPRETER, EVAL, KERNEL.
 208
 209 \item[debug] holds the standard debugger, and exports the debugger
 210 \end{description}
 211
 212 \chapter{System Building}
 213
 214 It's actually rather easy to build a CMU CL core with exactly what you want in
 215 it.  But to do this you need two things: the source and a working CMU CL.
 216
 217 Basically, you use the working copy of CMU CL to compile the sources,
 218 then run a process called ``genesis'' which builds a ``kernel'' core.
 219 You then load whatever you want into this kernel core, and save it.
 220
 221 In the \verb|tools/| directory in the sources there are several files that
 222 compile everything, and build cores, etc.  The first step is to compile the C
 223 startup code.
 224
 225 {\bf Note:} {\it the various scripts mentioned below have hard-wired paths in
 226 them set up for our directory layout here at CMU.  Anyone anywhere else will
 227 have to edit them before they will work.}
 228
 229 \section{Compiling the C Startup Code}
 230
 231 There is a circular dependency between lisp/internals.h and lisp/lisp.map that
 232 causes bootstrapping problems.  The easiest way to get around this problem
 233 is to make a fake lisp.nm file that has nothing in it but a version number:
 234
 235 \begin{verbatim}
 236         % echo "Map file for lisp version 0" > lisp.nm
 237 \end{verbatim}
 238 and then run genesis with NIL for the list of files:
 239 \begin{verbatim}
 240         * (load ".../compiler/generic/new-genesis") ; compile before loading
 241         * (lisp::genesis nil ".../lisp/lisp.nm" "/dev/null"
 242                 ".../lisp/lisp.map" ".../lisp/lisp.h")
 243 \end{verbatim}
 244 It will generate
 245 a whole bunch of warnings about things being undefined, but ignore
 246 that, because it will also generate a correct lisp.h.  You can then
 247 compile lisp producing a correct lisp.map:
 248 \begin{verbatim}
 249         % make
 250 \end{verbatim}
 251 and then use \verb|tools/do-worldbuild| and \verb|tools/mk-lisp| to build
 252 \verb|kernel.core| and \verb|lisp.core| (see section \ref{building-cores}.)
 253
 254 \section{Compiling the Lisp Code}
 255
 256 The \verb|tools| directory contains various lisp and C-shell utilities for
 257 building CMU CL:
 258 \begin{description}
 259 \item[compile-all*] Will compile lisp files and build a kernel core.  It has
 260 numerous command-line options to control what to compile and how.  Try -help to
 261 see a description.  It runs a separate Lisp process to compile each
 262 subsystem.  Error output is generated in files with ``{\tt .log}'' extension in
 263 the root of the build area.
 264
 265 \item[setup.lisp] Some lisp utilities used for compiling changed files in batch
 266 mode and collecting the error output.  Sort of a crude defsystem.  Loads into the
 267 ``user'' package.  See {\tt with-compiler-log-file} and {\tt comf}.
 268
 269 \item[{\it foo}com.lisp] Each system has a ``\verb|.lisp|'' file in
 270 \verb|tools/| which compiles that system.
 271 \end{description}
 272
 273 \section{Building Core Images}
 274 \label{building-cores}
 275 Both the kernel and final core build are normally done using shell script
 276 drivers:
 277 \begin{description}
 278 \item[do-worldbuild*] Builds a kernel core for the current machine.  The
 279 version to build is indicated by an optional argument, which defaults to
 280 ``alpha''.  The \verb|kernel.core| file is written either in the \verb|lisp/|
 281 directory in the build area, or in \verb|/usr/tmp/|.  The directory which
 282 already contains \verb|kernel.core| is chosen.  You can create a dummy version
 283 with e.g. ``touch'' to select the initial build location.
 284
 285 \item[mk-lisp*] Builds a full core, with conditional loading of subsystems.
 286 The version is the first argument, which defaults to ``alpha''.  Any additional
 287 arguments are added to the \verb|*features*| list, which controls system
 288 loading (among other things.)  The \verb|lisp.core| file is written in the
 289 current working directory.
 290 \end{description}
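
The rule for where \verb|kernel.core| is written can be sketched as below.
This is not the code of \verb|do-worldbuild|, only an illustration of the
selection rule.  The text does not say what happens when both directories
hold a \verb|kernel.core| or when neither does, so the preference for
\verb|lisp/| and the failure in the empty case are assumptions.

```shell
#!/bin/sh
# Pick the directory that already holds kernel.core.
# $1 is the build area's lisp/ directory, $2 the temporary directory.
# Assumptions (not stated in the text): lisp/ wins when both directories
# contain kernel.core, and the function fails when neither does.
choose_kernel_core_dir() {
    build_lisp_dir=$1
    tmp_dir=$2
    if [ -e "$build_lisp_dir/kernel.core" ]; then
        printf '%s\n' "$build_lisp_dir"
    elif [ -e "$tmp_dir/kernel.core" ]; then
        printf '%s\n' "$tmp_dir"
    else
        return 1
    fi
}

# Demo in a scratch area: a dummy file made with touch selects the location.
core_area=$(mktemp -d)
mkdir -p "$core_area/lisp" "$core_area/tmp"
touch "$core_area/tmp/kernel.core"
choose_kernel_core_dir "$core_area/lisp" "$core_area/tmp"
```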
 291
 292 These scripts load Lisp command files.  When \verb|tools/worldbuild.lisp| is
 293 loaded, it calls genesis with the correct arguments to build a kernel core.
 294 Similarly, \verb|worldload.lisp|
 295 builds a full core.  Adding certain symbols to \verb|*features*| before
 296 loading worldload.lisp suppresses loading of different parts of the
 297 system.  These symbols are:
 298 \begin{description}
 299 \item[:no-compiler] don't load the compiler.
 300 \item[:no-clx] don't load CLX.
 301 \item[:no-hemlock] don't load hemlock.
 302 \item[:no-pcl] don't load PCL.
 303 \item[:runtime] build a runtime code, implies all of the above, and then some.
 304 \end{description}
 305
 306 Note: if you don't load the compiler, you can't (successfully) load the
 307 pretty-printer or pcl.  And if you compiled hemlock with CLX loaded, you can't
 308 load it without CLX also being loaded.