doc/cmucl/internals/debugger.tex

   1 %                                       -*- Dictionary: design; Package: C -*-
   2
   3 \#|
   4 \chapter{Debugger Information}
   5 \index{debugger information}
   6 \label{debug-info}
   7
   8 Although the compiler's great freedom in choice of function call conventions
   9 and variable representations has major efficiency advantages, it also has
  10 unfortunate consequences for the debugger.  The debug information that we need
  11 is even more elaborate than for conventional "compiled" languages, since we
  12 cannot even do a simple backtrace without some debug information.  However,
  13 once having gone this far, it is not that difficult to go the extra distance,
  14 and provide full source level debugging of compiled code.
  15
  16 Full debug information has a substantial space penalty, so we allow different
  17 levels of debug information to be specified.  In the extreme case, we can
  18 totally omit debug information.
  19
  20 \f
  21 \section{The Debug-Info Structure}
  22 \index{debug-info structure}
  23
  24 The Debug-Info structure directly represents information about the
  25 source code, and points to other structures that describe the layout of
  26 run-time data structures.
  27
  28
  29 Make some sort of minimal debug-info format that would support at least the
  30 common cases of level 1 (since that is what we would release), and perhaps
  31 level 0.  Actually, it seems it wouldn't be hard to crunch nearly all of the
  32 debug-function structure and debug-info function map into a single byte-vector.
  33 We could have an uncrunch function that restored the current format.  This
  34 would be used by the debugger, and also could be used by purify to delete parts
  35 of the debug-info even when the compiler dumps it in crunched form.
  36 [Note that this isn't terribly important if purify is smart about
  37 debug-info...]
  38 |\#
  39
  40 \f
  41 Compiled source map representation:
  42
  43 [\#\#\# store in debug-function PC at which env is properly initialized, i.e.
  44 args (and return-pc, etc.) in internal locations.  This is where a
  45 :function-start breakpoint would break.]
  46
  47 [\#\#\# Note that that we can easily cache the form-number => source-path or
  48 form-number => form translation using a vector indexed by form numbers that we
  49 build during a walk.]
  50
  51
  52
  53
  54 Instead of using source paths in the debug-info, use "form numbers".  The form
  55 number of a form is the number of forms that we walk to reach that form when
  56 doing a pre-order walk of the source form.  [Might want to use a post-order
  57 walk, as that would more closely approximate evaluation order.]
  58
  59
  60 We probably want to continue using source-paths in the compiler, since they are
  61 quick to compute and to get you to a particular form.  [\#\#\# But actually, I
  62 guess we don't have to precompute the source paths and annotate nodes with
  63 them: instead we could annotate the nodes with the actual original source form.
  64 Then if we wanted to find the location of that form, we could walk the root
  65 source form, looking that original form.  But we might still need to enter all
  66 the forms in a hashtable so that we can tell during IR1 conversion that a given
  67 form appeared in the original source.]
  68
  69
  70 Note that form numbers have an interesting property: it is quite efficient to
  71 determine whether an arbitrary form is a subform of some other form, since the
  72 form number of B will be > than A's number and < A's next sibling's number iff
  73 B is a subform of A.
  74
  75 This should be quite useful for doing the source=>pc mapping in the debugger,
  76 since that problem reduces to finding the subset of the known locations that
  77 are for subforms of the specified form.
  78
  79
  80 Assume a byte vector with a standard variable-length integer format, something
  81 like this:
  82     0..253 => the integer
  83     254 => read next two bytes for integer
  84     255 => read next four bytes for integer
  85
  86 Then a compiled debug block is just a sequence of variable-length integers in a
  87 particular order, something like this:
  88     number of successors
  89     ...offsets of each successor in the function's blocks vector...
  90     first PC
  91     [offset of first top-level form (in forms) (only if not component default)]
  92     form number of first source form
  93     first live mask (length in bytes determined by number of VARIABLES)
  94     ...more <PC, top-level form offset, form-number, live-set> tuples...
  95
  96 We determine the number of locations recorded in a block by the finding the
  97 start of the next compiled debug block in the blocks vector.
  98
  99 [\#\#\# Actually, only need 2 bits for number of successors {0,1,2}.  We might
 100 want to use other bits in the first byte to indicate the kind of location.]
 101 [\#\#\# We could support local packing by having a general concept of "alternate
 102 locations" instead of just regular and save locations.  The location would have
 103 a bit indicating that there are alternate locations, in which case we read the
 104 number of alternate locations and then that many more SC-OFFSETs.  In the
 105 debug-block, we would have a second bit mask with bits set for TNs that are in
 106 an alternate location.  We then read a number for each such TN, with the value
 107 being interpreted as an index into the Location's alternate locations.]
 108
 109
 110
 111 It looks like using structures for the compiled-location-info is too bulky.
 112 Instead we need some packed binary representation.
 113
 114 First, let's represent a SC/offset pair with an "SC-Offset", which is an
 115 integer with the SC in the low 5 bits and the offset in the remaining bits:
 116     ----------------------------------------------------
 117     | Offset (as many bits as necessary) | SC (5 bits) |
 118     ----------------------------------------------------
 119 Probably the result should be constrained to fit in a fixnum, since it will be
 120 more efficient and gives more than enough possible offsets.
 121
 122 We can the represent a compiled location like this:
 123     single byte of boolean flags:
 124         uninterned name
 125         packaged name
 126         environment-live
 127         has distinct save location
 128         has ID (name not unique in this fun)
 129     name length in bytes (as var-length integer)
 130     ...name bytes...
 131     [if packaged, var-length integer that is package name length]
 132      ...package name bytes...]
 133     [If has ID, ID as var-length integer]
 134     SC-Offset of primary location (as var-length integer)
 135     [If has save SC, SC-Offset of save location (as var-length integer)]
 136
 137
 138 \f
 139
 140 But for a whizzy breakpoint facility, we would need a good source=>code map.
 141 Dumping a complete code=>source map might be as good a way as any to represent
 142 this, due to the one-to-many relationship between source and code locations.
 143
 144 We might be able to get away with just storing the source locations for the
 145 beginnings of blocks and maintaining a mapping from code ranges to blocks.
 146 This would be fine both for the profiler and for the "where am I running now"
 147 indication.  Users might also be convinced that it was most interesting to
 148 break at block starts, but I don't really know how easily people could develop
 149 an understanding of basic blocks.
 150
 151 It could also be a bit tricky to map an arbitrary user-designated source
 152 location to some "closest" source location actually in the debug info.
 153 This problem probably exists to some degree even with a full source map, since
 154 some forms will never appear as the source of any node.  It seems you might
 155 have to negotiate with the user.  He would mouse something, and then you would
 156 highlight some source form that has a common prefix (i.e. is a prefix of the
 157 user path, or vice-versa.)  If they aren't happy with the result, they could
 158 try something else.  In some cases, the designated path might be a prefix of
 159 several paths.  This ambiguity might be resolved by picking the shortest path
 160 or letting the user choose.
 161
 162 At the primitive level, I guess what this means is that the structure of source
 163 locations (i.e. source paths) must be known, and the source=>code operation
 164 should return a list of <source,code> pairs, rather than just a list of code
 165 locations.  This allows the debugger to resolve the ambiguity however it wants.
 166
 167 I guess the formal definition of which source paths we would return is:
 168     All source paths in the debug info that have a maximal common prefix with
 169     the specified path.  i.e. if several paths have the complete specified path
 170     as a prefix, we return them all.  Otherwise, all paths with an equally
 171     large common prefix are returned: if the path with the most in common
 172     matches only the first three elements, then we return all paths that match
 173     in the first three elements.  As a degenerate case (which probably
 174     shouldn't happen), if there is no path with anything in common, then we
 175     return *all* of the paths.
 176
 177
 178
 179 In the DEBUG-SOURCE structure we may ultimately want a vector of the start
 180 positions of each source form, since that would make it easier for the debugger
 181 to locate the source.  It could just open the file, FILE-POSITION to the form,
 182 do a READ, then loop down the source path.  Of course, it could read each form
 183 starting from the beginning, but that might be too slow.
 184
 185
 186 Do XEPs really need Debug-Functions?  The only time that we will commonly end
 187 up in the debugger on an XEP is when an argument type check fails.  But I
 188 suppose it would be nice to be able to print the arguments passed...
 189
 190
 191 Note that assembler-level code motion such as pipeline reorganization can cause
 192 problems with our PC maps.  The assembler needs to know that debug info markers
 193 are different from real labels anyway, so I suppose it could inhibit motion
 194 across debug markers conditional on policy.  It seems unworthwhile to remember
 195 the node for each individual instruction.
 196
 197
 198 For tracing block-compiled calls:
 199     Info about return value passing locations?
 200     Info about where all the returns are?
 201
 202 We definitely need the return-value passing locations for debug-return.  The
 203 question is what the interface should be.  We don't really want to have a
 204 visible debug-function-return-locations operation, since there are various
 205 value passing conventions, and we want to paper over the differences.
 206
 207
 208 Probably should be a compiler option to initialize stack frame to a special
 209 uninitialized object (some random immediate type).  This would aid debugging,
 210 and would also help GC problems.  For the latter reason especially, this should
 211 be locally-turn-onable (off of policy?  the new debug-info quality?).
 212
 213
 214 What about the interface between the evaluator and the debugger? (i.e. what
 215 happens on an error, etc.)  Compiler error handling should be integrated with
 216 run-time error handling.  Ideally the error messages should look the same.
 217 Practically, in some cases the run-time errors will have less information.  But
 218 the error should look the same to the debugger (or at least similar).
 219
 220
 221 \f
 222 ;;;; Debugger interface:
 223
 224 How does the debugger interface to the "evaluator" (where the evaluator means
 225 all of native code, byte-code and interpreted IR1)?  It seems that it would be
 226 much more straightforward to have a consistent user interface to debugging
 227 all code representations if there was a uniform debugger interface to the
 228 underlying stuff, and vice-versa.
 229
 230 Of course, some operations might not be supported by some representations, etc.
 231 For example, fine-control stepping might not be available in native code.
 232 In other cases, we might reduce an operation to the lowest common denominator,
 233 for example fetching lexical variables by string and admitting the possibility
 234 of ambiguous matches.  [Actually, it would probably be a good idea to store the
 235 package if we are going to allow variables to be closed over.]
 236
 237 Some objects we would need:
 238 Location:
 239         The constant information about the place where a value is stored,
 240         everything but which particular frame it is in.  Operations:
 241         location name, type, etc.
 242         location-value frame location (setf'able)
 243         monitor-location location function
 244             Function is called whenever location is set with the location,
 245             frame and old value.  If active values aren't supported, then we
 246             dummy the effect using breakpoints, in which case the change won't
 247             be noticed until the end of the block (and intermediate changes
 248             will be lost.)
 249 debug info:
 250         All the debug information for a component.
 251 Frame:
 252         frame-changed-locations frame => location*
 253             Return a list of the locations in frame that were changed since the
 254             last time this function was called.  Or something.  This is for
 255             displaying interesting state changes at breakpoints.
 256         save-frame-state frame => frame-state
 257         restore-frame-state frame frame-state
 258             These operations allow the debugger to back up evaluation, modulo
 259             side-effects and non-local control transfers.  This copies and
 260             restores all variables, temporaries, etc, local to the frame, and
 261             also the current PC and dynamic environment (current catch, etc.)
 262
 263             At the time of the save, the frame must be for the running function
 264             (not waiting for a call to return.)  When we restore, the frame
 265             becomes current again, effectively exiting from any frames on top.
 266             (Of course, frame must not already be exited.)
 267
 268 Thread:
 269         Representation of which stack to use, etc.
 270 Block:
 271         What successors the block has, what calls there are in the block.
 272         (Don't need to know where calls are as long as we know called function,
 273         since can breakpoint at the function.)  Whether code in this block is
 274         wildly out of order due to being the result of loop-invariant
 275         optimization, etc.  Operations:
 276         block-successors block => code-location*
 277         block-forms block => (source-location code-location)*
 278             Return the corresponding source locations and code locations for
 279             all forms (and form fragments) in the block.
 280
 281 \f
 282 Variable maps:
 283
 284 There are about five things that the debugger might want to know about a
 285 variable:
 286
 287     Name
 288         Although a lexical variable's name is "really" a symbol (package and
 289         all), in practice it doesn't seem worthwhile to require all the symbols
 290         for local variable names to be retained.  There is much less VM and GC
 291         overhead for a constant string than for a symbol.  (Also it is useful
 292         to be able to access gensyms in the debugger, even though they are
 293         theoretically ineffable).
 294
 295     ID
 296         Which variable with the specified name is this?  It is possible to have
 297         multiple variables with the same name in a given function.  The ID is
 298         something that makes Name unique, probably a small integer.  When
 299         variables aren't unique, we could make this be part of the name, e.g.
 300         "FOO\#1", "FOO\#2".  But there are advantages to keeping this separate,
 301         since in many cases lifetime information can be used to disambiguate,
 302         making qualification unnecessary.
 303
 304     SC
 305         When unboxed representations are in use, we must have type information
 306         to properly read and write a location.  We only need to know the
 307         SC for this, which would be amenable to a space-saving
 308         numeric encoding.
 309
 310     Location
 311         Simple: the offset in SC.  [Actually, we need the save location too.]
 312
 313     Lifetime
 314         In what parts of the program does this variable hold a meaningful
 315         value?  It seems prohibitive to record precise lifetime information,
 316         both in space and compiler effort, so we will have to settle for some
 317         sort of approximation.
 318
 319         The finest granularity at which it is easy to determine liveness is the
 320         the block: we can regard the variable lifetime as the set of blocks
 321         that the variable is live in.  Of course, the variable may be dead (and
 322         thus contain meaningless garbage) during arbitrarily large portions of
 323         the block.
 324
 325         Note that this subsumes the notion of which function a variable belongs
 326         to.  A given block is only in one function, so the function is
 327         implicit.
 328
 329
 330 The variable map should represent this information space-efficiently and with
 331 adequate computational efficiency.
 332
 333 The SC and ID can be represented as small integers.  Although the ID can in
 334 principle be arbitrarily large, it should be <100 in practice.  The location
 335 can be represented by just the offset (a moderately small integer), since the
 336 SB is implicit in the SC.
 337
 338 The lifetime info can be represented either as a bit-vector indexed by block
 339 numbers, or by a list of block numbers.  Which is more compact depends both on
 340 the size of the component and on the number of blocks the variable is live in.
 341 In the limit of large component size, the sparse representation will be more
 342 compact, but it isn't clear where this crossover occurs.  Of course, it would
 343 be possible to use both representations, choosing the more compact one on a
 344 per-variable basis.  Another interesting special case is when the variable is
 345 live in only one block: this may be common enough to be worth picking off,
 346 although it is probably rarer for named variables than for TNs in general.
 347
 348 If we dump the type, then a normal list-style type descriptor is fine: the
 349 space overhead is small, since the shareability is high.
 350
 351 We could probably save some space by cleverly representing the var-info as
 352 parallel vectors of different types, but this would be more painful in use.
 353 It seems better to just use a structure, encoding the unboxed fields in a
 354 fixnum.  This way, we can pass around the structure in the debugger, perhaps
 355 even exporting it from the the low-level debugger interface.
 356
 357 [\#\#\# We need the save location too.  This probably means that we need two slots
 358 of bits, since we need the save offset and save SC.  Actually, we could let the
 359 save SC be implied by the normal SC, since at least currently, we always choose
 360 the same save SC for a given SC.  But even so, we probably can't fit all that
 361 stuff in one fixnum without squeezing a lot, so we might as well split and
 362 record both SCs.
 363
 364 In a localized packing scheme, we would have to dump a different var-info
 365 whenever either the main location or the save location changes.  As a practical
 366 matter, the save location is less likely to change than the main location, and
 367 should never change without the main location changing.
 368
 369 One can conceive of localized packing schemes that do saving as a special case
 370 of localized packing.  If we did this, then the concept of a save location
 371 might be eliminated, but this would require major changes in the IR2
 372 representation for call and/or lifetime info.  Probably we will want saving to
 373 continue to be somewhat magical.]
 374
 375
 376 How about:
 377
 378 (defstruct var-info
 379   ;;
 380   ;; This variable's name. (symbol-name of the symbol)
 381   (name nil :type simple-string)
 382   ;;
 383   ;; The SC, ID and offset, encoded as bit-fields.
 384   (bits nil :type fixnum)
 385   ;;
 386   ;; The set of blocks this variable is live in.  If a bit-vector, then it has
 387   ;; a 1 when indexed by the number of a block that it is live in.  If an
 388   ;; I-vector, then it lists the live block numbers.  If a fixnum, then that is
 389   ;; the number of the sole live block.
 390   (lifetime nil :type (or vector fixnum))
 391   ;;
 392   ;; The variable's type, represented as list-style type descriptor.
 393   type)
 394
 395 Then the debug-info holds a simple-vector of all the var-info structures for
 396 that component.  We might as well make it sorted alphabetically by name, so
 397 that we can binary-search to find the variable corresponding to a particular
 398 name.
 399
 400 We need to be able to translate PCs to block numbers.  This can be done by an
 401 I-Vector in the component that contains the start location of each block.  The
 402 block number is the index at which we find the correct PC range.  This requires
 403 that we use an emit-order block numbering distinct from the IR2-Block-Number,
 404 but that isn't any big deal.  This seems space-expensive, but it isn't too bad,
 405 since it would only be a fraction of the code size if the average block length
 406 is a few words or more.
 407
 408 An advantage of our per-block lifetime representation is that it directly
 409 supports keeping a variable in different locations when in different blocks,
 410 i.e. multi-location packing.  We use a different var-info for each different
 411 packing, since the SC and offset are potentially different.  The Name and ID
 412 are the same, representing the fact that it is the same variable.  It is here
 413 that the ID is most significant, since the debugger could otherwise make
 414 same-name variables unique all by itself.
 415
 416
 417
 418 Stack parsing:
 419
 420 [\#\#\# Probably not worth trying to make the stack parseable from the bottom up.
 421 There are too many complications when we start having variable sized stuff on
 422 the stack.  It seems more profitable to work on making top-down parsing robust.
 423 Since we are now planning to wire the bottom-up linkage info, scanning from the
 424 bottom to find the top frame shouldn't be too inefficient, even when there was
 425 a runaway recursion.  If we somehow jump into hyperspace, then the debugger may
 426 get confused, but we can debug this sort of low-level system lossage using
 427 ADB.]
 428
 429
 430 There are currently three relevant context pointers:
 431   -- The PC.  The current PC is wired (implicit in the machine).  A saved
 432      PC (RETURN-PC) may be anywhere in the current frame.
 433   -- The current stack context (CONT).  The current CONT is wired.  A saved
 434      CONT (OLD-CONT) may be anywhere in the current frame.
 435   -- The current code object (ENV).  The current ENV is wired.  When saved,
 436      this is extra-difficult to locate, since it is saved by the caller, and is
 437      thus at an unknown offset in OLD-CONT, rather than anywhere in the current
 438      frame.
 439
 440 We must have all of these to parse the stack.
 441
 442 With the proposed Debug-Function, we parse the stack (starting at the top) like
 443 this:
 444  1] Use ENV to locate the current Debug-Info
 445  2] Use the Debug-Info and PC to determine the current Debug-Function.
 446  3] Use the Debug-Function to find the OLD-CONT and RETURN-PC.
 447  4] Find the old ENV by searching up the stack for a saved code object
 448     containing the RETURN-PC.
 449  5] Assign old ENV to ENV, OLD-CONT to CONT, RETURN-PC to PC and goto 1.
 450
 451 If we changed the function representation so that the code and environment were
 452 a single object, then the location of the old ENV would be simplified.  But we
 453 still need to represent ENV as separate from PC, since interrupts and errors
 454 can happen when the current PC isn't positioned at a valid return PC.
 455
 456 It seems like it might be a good idea to save OLD-CONT, RETURN-PC and ENV at
 457 the beginning of the frame (before any stack arguments).  Then we wouldn't have
 458 to search to locate ENV, and we also have a hope of parsing the stack even if
 459 it is damaged.  As long as we can locate the start of some frame, we can trace
 460 the stack above that frame.  We can recognize a probable frame start by
 461 scanning the stack for a code object (presumably a saved ENV).
 462
 463  Probably we want some fairly general
 464 mechanism for specifying that a TN should be considered to be live for the
 465 duration of a specified environment.  It would be somewhat easier to specify
 466 that the TN is live for all time, but this would become very space-inefficient
 467 in large block compilations.
 468
 469 This mechanism could be quite useful for other debugger-related things.  For
 470 example, when debuggability is important, we could make the TNs holding
 471 arguments live for the entire environment.  This would guarantee that a
 472 backtrace would always get the right value (modulo setqs).
 473
 474 Note that in this context, "environment" means the Environment structure (one
 475 per non-let function).  At least according to current plans, even when we do
 476 inter-routine register allocation, the different functions will have different
 477 environments: we just "equate" the environments.  So the number of live
 478 per-environment TNs is bounded by the size of a "function", and doesn't blow up
 479 in block compilation.
 480
 481 The implementation is simple: per-environment TNs are flagged by the
 482 :Environment kind.  :Environment TNs are treated the same as :Normal TNs by
 483 everyone except for lifetime/conflict analysis.  An environment's TNs are also
 484 stashed in a list in the IR2-Environment structure.  During during the conflict
 485 analysis post-pass, we look at each block's environment, and make all the
 486 environment's TNs always-live in that block.
 487
 488 We can implement the "fixed save location" concept needed for lazy frame
 489 creation by allocating the save TNs as wired TNs at IR2 conversion time.  We
 490 would use the new "environment lifetime" concept to specify the lifetimes of
 491 the save locations.  There isn't any run-time overhead if we never get around
 492 to using the save TNs.  [Pack would also have to notice TNs with pre-allocated
 493 save TNs, packing the original TN in the stack location if its FSC is the
 494 stack.]
 495
 496
 497 We want a standard (recognizable) format for an "escape" frame.  We must make
 498 an escape frame whenever we start running another function without the current
 499 function getting a chance to save its registers.  This may be due either to a
 500 truly asynchronous event such as a software interrupt, or due to an "escape"
 501 from a miscop.  An escape frame marks a brief conversion to a callee-saves
 502 convention.
 503
 504 Whenever a miscop saves registers, it should make an escape frame.  This
 505 ensures that the "current" register contents can always be located by the
 506 debugger.  In this case, it may be desirable to be able to indicate that only
 507 partial saving has been done.  For example, we don't want to have to save all
 508 the FP registers just so that we can use a couple extra general registers.
 509
 510 When when the debugger see an escape frame, it knows that register values are
 511 located in the escape frame's "register save" area, rather than in the normal
 512 save locations.
 513
 514 It would be nice if there was a better solution to this internal error concept.
 515 One problem is that it seems there is a substantial space penalty for emitting
 516 all that error code, especially now that we don't share error code between
 517 errors because we want to preserve the source context in the PC.  But this
 518 probably isn't really all that bad when considered as a fraction of the code.
 519 For example, the check part of a type check is 12 bytes, whereas the error part
 520 is usually only 6.  In this case, we could never reduce the space overhead for
 521 type checks by more than 1/3, thus the total code size reduction would be
 522 small.  This will be made even less important when we do type check
 523 optimizations to reduce the number of type checks.
 524
 525 Probably we should stick to the same general internal error mechanism, but make
 526 it interact with the debugger better by allocating linkage registers and
 527 allowing proceedable errors.  We could support shared error calls and
 528 non-proceedable errors when space is more important than debuggability, but
 529 this is probably more complexity than is worthwhile.
 530
 531 We jump or trap to a routine that saves the context (allocating at most the
 532 return PC register).  We then encode the error and context in the code
 533 immediately following the jump/trap.  (On the MIPS, the error code can be
 534 encoded in the trap itself.)  The error arguments would be encoded as
 535 SC-offsets relative to the saved context.  This could solve both the
 536 arg-trashing problem and save space, since we could encode the SC-offsets more
 537 tersely than the corresponding move instructions.