0.8.0.78.vector-nil-string.1:
As pfdietz and I noted with horror on #lisp,
vectors specialized on NIL are strings.
This patch implements (VECTOR NIL) as a subtype of STRING with no
regressions in either our regression test suite or pfdietz's test suite.
Nonetheless, there are a number of issues that need to
be resolved before this hits HEAD. (Why would it hit HEAD, you ask?
Well, it /is/ an ANSI issue, but in this case that would probably just
merit an entry in BUGS, were it not for the fact that a Unicode
implementation is likely to have several string representations, so most
of the issues that we're addressing here will have to be dealt with in
any case; the use of (ARRAY NIL) as a "poison pill" to investigate
string routines and the like is probably a good thing. Note that this
is only a half-way house; while STRING is no longer the same type as
BASE-STRING, which is one portion of the Unicode battle, CHARACTER
remains equivalent to BASE-CHAR.)
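
The ANSI reading above can be checked at the REPL. What follows is only
a sketch: the helper name is made up for illustration and is not part of
the patch, and the answers depend on how the implementation upgrades the
element type NIL, so no expected values are given.

```lisp
;; Hypothetical helper, not part of the patch: report how this
;; implementation treats vectors specialized on NIL.
(defun describe-nil-vector-support ()
  (list :upgraded-element-type (upgraded-array-element-type nil)
        ;; SUBTYPEP returns two values; keep both (answer, certainty).
        :subtype-of-string (multiple-value-list
                            (subtypep '(vector nil) 'string))
        ;; A zero-length vector can be created without touching any
        ;; element, so this is safe even for element type NIL.
        :empty-is-string (stringp (make-array 0 :element-type nil))))

(print (describe-nil-vector-support))
```

An implementation that follows the reading in this entry should report T
for both string tests; one that upgrades NIL to some other element type
will not.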
Brokennesses:
* STRING= and similar functions may work by accident for (VECTOR NIL 0),
but they're unlikely to work robustly;
* FFI and ALIEN: we need at the very least (a) to ensure that C-STRINGs
get turned into a useful string type, not (VECTOR NIL) and (b) to
install a conversion routine for the other direction, so that the Lisp
string #.(make-array 0 :element-type nil) becomes the C string "";
* Filesystem access and SB-UNIX are completely unaudited. This may be
similar to the above issue;
* SXHASH-SIMPLE-STRING tries to access string elements, and promptly
errors on a (VECTOR NIL) with non-zero length. This also breaks
TYPE-OF;
* INTERN currently takes only a BASE-STRING;
* [ probably others. Should examine Brian Spilsbury's Unicode patch for
some more gotchas. ]
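
As an illustration of point (b) in the FFI item, here is a minimal
sketch of a normalizing step that could run before a Lisp string is
handed to foreign code. The function name is hypothetical and this is
not the conversion routine the patch would install; it only shows the
idea that a zero-length (VECTOR NIL) should come out as an ordinary
empty string without any of its elements being read.

```lisp
;; Hypothetical helper: copy any string into a fresh simple string
;; whose element type is CHARACTER.
(defun normalize-string-for-foreign-call (string)
  (let ((result (make-string (length string))))
    (if (zerop (length string))
        ;; Nothing to copy, so no element of STRING is ever read.
        ;; This is the case that covers a zero-length (VECTOR NIL).
        result
        ;; For a (VECTOR NIL) of non-zero length this copy has to read
        ;; elements, which is expected to signal an error.
        (replace result string))))

(print (normalize-string-for-foreign-call
        (make-array 0 :element-type nil)))
(print (normalize-string-for-foreign-call "abc"))
```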
Suboptimalities:
* 10% slowdown in self-compilation, probably mostly caused by
CONCATENATE (not transformed away for general SIMPLE-STRINGs any more)
and HAIRY-DATA-VECTOR-{REF,SET} (type dispatch unavoidable for the
latter on STRING-typed objects). We can mitigate the latter for
STRING-like types that include (VECTOR NIL) by testing for (VECTOR NIL)
first, branching to an array-nil-accessed error clause if the test
succeeds, and otherwise retrying the hairy-data-vector optimization;
* throughout the codebase, string and base-string have been used
interchangeably for a number of years; we need to audit every such use
and fix it if necessary.
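
The mitigation proposed for HAIRY-DATA-VECTOR-{REF,SET} can be sketched
in portable Lisp. This is user-level code under a hypothetical name, not
the compiler transform itself: it tests for (VECTOR NIL) first, signals
an error if the test succeeds, and otherwise falls through to the
ordinary accessor, which is the point where the real transform would
retry its optimization on the remaining string types.

```lisp
;; Hypothetical accessor illustrating the proposed dispatch order.
(defun checked-char (string index)
  (declare (type string string))
  (if (typep string '(vector nil))
      ;; Stand-in for the array-nil-accessed error clause.
      (error "Attempt to access an element of an array with ~
              element type NIL: ~S" string)
      ;; STRING is now known not to be a (VECTOR NIL), so an ordinary
      ;; character access is safe.
      (char string index)))

(print (checked-char "abc" 1))
```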
36 files changed: