Copyright (C) 2022-2026 Andrea Monaco

Copying and distribution of this file, with or without modification, are
permitted in any medium without royalty provided the copyright notice and this
notice are preserved.  This file is offered as-is, without any warranty.




Some notes on design choices, portability and ANSI conformance of alisp.



__Index__

- Floating point arithmetic
- Type checking
- Stack overflows
- Builtin functions and macros
- alisp extensions
- Literal objects
- Backquote notation
- Setting an undefined variable
- Arrays
- Structure objects and classes
- Condition objects and classes
- Pathnames
- File operations
- Streams
- Language of implementation
- Loading cl.lisp
- Using ASDF
- Number bases
- Character encoding
- Garbage collection
- The ROOM function
- Profiling
- Environment objects
- Compilation
- Compiler macros
- Error reporting
- Debugging
- Watchpoints
- Known bugs and limitations
- Policy on use of Large Language Models
- Minor details



__Floating point arithmetic__

The four floating-point types of ANSI CL are all the same type in alisp and map
to native C double.  This is allowed by ANSI and was done as a quick
implementation strategy.  Of course, it may change in the future.
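
A quick way to see how an implementation maps the float types is to read one
literal of each kind and test it against DOUBLE-FLOAT.  This is only a sketch in
portable CL; the helper name ALL-DOUBLE-FLOATS-P is made up for illustration.
Given the mapping described above it should return T in alisp, while
implementations with distinct float types return NIL.

```lisp
;; Hypothetical helper: true when the four float literal markers
;; (s, f, d, l) all read as objects of type DOUBLE-FLOAT.
(defun all-double-floats-p ()
  (every (lambda (x) (typep x 'double-float))
         (list 1.0s0 1.0f0 1.0d0 1.0l0)))
```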



__Type checking__

Arguments to standard functions and macros (or the results of evaluating
arguments to macros, when applicable) are all type checked.

Values assigned to standard variables (like *PACKAGE*, *READ-BASE* and so on)
are not type checked, so assigning an object of the wrong type will likely crash
the interpreter.

Currently, these behaviors can't be changed.



__Stack overflows__

As a protection against stack overflows in lisp code, al has a limit on stack
depth set by the LISP_STACK_SIZE macro in C.  If that limit is reached, the
interpreter will signal a condition of type AL:MAXIMUM-STACK-DEPTH-EXCEEDED.



__Builtin functions and macros__

Some standard macros are implemented as special operators in al, i.e. they are
executed directly by the interpreter instead of being expanded; this is allowed
by ANSI, but the standard also requires a true macro implementation that is not
provided currently.  Note that SPECIAL-OPERATOR-P returns T for these special
operators.

al also provides additional non-standard special operators.  I think that this
is forbidden by the standard, understandably so that code walkers can rely on a
full knowledge of the language.  Such special operators will be removed in due
time.

You can freely redefine standard functions and macros.  If you redefine a
builtin with your lisp code though, there's no way to get the C definition back.
Be careful, you can easily make your image unusable this way.



__alisp extensions__

al has many non-standard builtins and variables; their names are exported from
the package ALISP, nicknamed AL; note that you cannot USE this package in
CL-USER due to some conflicts.
There are also some non-standard extensions of standard functions.

- *ARGC* (an integer) and *ARGV* (a vector of strings) let you access argc and
  argv of C, so they contain the number and the values of the command-line
  arguments passed to the interpreter, respectively

- GETENV takes a string and returns the value of the environment variable with
  that name, or NIL if such does not exist

- SYSTEM takes a string or NIL; in the first case it tries executing that
  command, in the second it tells whether an execution environment is actually
  available.  SYSTEM uses the 'system' function from C, so the precise working
  depends on your C environment

- the EXIT function takes an optional integer (defaulting to 0) and exits the
  interpreter with that return value

- the function LIST-DIRECTORY takes a single pathname designator and returns a
  list with the filenames (without any leading path) contained in that directory

- the function DIRECTORYP takes a single pathname designator, queries the
  filesystem and returns T if the path designates a directory, NIL otherwise

- the function GETCWD takes no arguments and returns the current working
  directory as a slash-terminated string

- the functions PRINT-NO-WARRANTY and PRINT-TERMS-AND-CONDITIONS print legal
  information

- the function STRING-INPUT-STREAM-STRING takes an input string stream and
  returns a string with the characters left to read

- the standard function MAKE-STRING-OUTPUT-STREAM takes an optional string
  argument; if provided, writing operations append to the string, also
  respecting the fill pointer if present

- the variable *COMPILE-WHEN-DEFINING* is described in "Compilation"

- BREAK is a condition class that represents encountering a breakpoint

- the *ENABLE-BREAKPOINTS* parameter lets you enable or disable breakpoints
  globally.  It defaults to T

- when you enter the debugger due to a condition, *DEBUGGING-CONDITION* is bound
  dynamically to that condition object; if you entered the debugger due to
  stepping, this variable is NIL

- the functions WATCH and UNWATCH control watchpoints, see "Watchpoints"

- the variable *PPRINT-DEPTH* is used by the pretty printer and contains the
  current level of indentation

- when the variable *PRINT-ALWAYS-TWO-COLONS* is non-nil, the printer always
  prefixes symbols with two colons when it prints the package name.  This is
  used by the file compiler

- the function PRINT-RESTARTS prints the available restarts and returns T

- the function PRINT-BACKTRACE takes an optional argument and prints the current
  backtrace of all called functions (including macros) and their arguments.  If
  the argument is non-nil, the function is verbose, meaning it also prints
  special forms and builtin macros as frames.  The function always returns T

- the function LIST-BACKTRACE takes an optional argument and returns a list with
  the argument lists of each call in the backtrace, in the same order as
  PRINT-BACKTRACE.  The argument has the same effect as that of PRINT-BACKTRACE

- the function DUMP-BINDINGS takes no arguments and returns a list with all the
  non-global live variable bindings, starting with the most recently
  established.  Each binding is a list containing the symbol, the value, the
  type of the binding (:LEXICAL or :SPECIAL); for lexical bindings, there's also
  a fourth element of T if the binding is in scope at that moment.  This
  function is intended to be called in the debugger, but works everywhere

- the function DUMP-FUNCTION-BINDINGS does the same thing as DUMP-BINDINGS, but
  with local function bindings

- the function DUMP-CAPTURED-ENV takes a function object and returns a list with
  all the lexical variable bindings that the function closed over.  The format
  of the list is the same as DUMP-BINDINGS, without the fourth field since it
  doesn't apply

- the function DUMP-METHODS takes a generic function and returns a list of its
  method objects

- the standard function FUNCTION-LAMBDA-EXPRESSION also accepts a method object.
  In that case, it returns the body of that method as a list

- each function object carries a name field for clarity and debugging purposes.
  The function FUNCTION-NAME takes a function and returns its name (or NIL for
  anonymous functions), while (SETF FUNCTION-NAME) lets you change the name

- the functions FUNCTION-BODY and (SETF FUNCTION-BODY) let you inspect and
  change the body of a function or method object

- each function or method object carries a set of attributes expressed as a list
  of keywords.  The functions FUNCTION-ATTRIBUTES and (SETF FUNCTION-ATTRIBUTES)
  let you inspect and change such list.  Currently the only recognised attribute
  is :COMPILED

- the function DUMP-FIELDS takes either a structure object, a standard object or
  a standard class and dumps the slots of that object or class as a list.  Each
  slot of an object is represented as a symbol with its name, if it is unbound,
  or as a name-value pair, if it is bound; for classes, only the name is present

- the function CLASS-PRECEDENCE-LIST takes a standard class object (not a
  structure class) and returns the class precedence list of that class, as a
  list of class objects

- the special operators LOOPY-DESTRUCTURING-BIND and LOOPY-SETQ are similar to
  DESTRUCTURING-BIND and SETQ respectively, but they allow the kind of
  destructuring that LOOP uses, which has laxer rules than DESTRUCTURING-BIND.
  In particular, you can provide more or fewer elements in the template than
  values, and you can put a NIL in the template to ignore the corresponding
  subtree

- the functions START-PROFILING, STOP-PROFILING, CLEAR-PROFILING and
  REPORT-PROFILING govern the profiler, see "Profiling"
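
The lax destructuring rules mentioned for LOOPY-DESTRUCTURING-BIND in the list
above are those of the standard LOOP macro, so they can be illustrated with
plain LOOP.  The two function names below are made up for this sketch; the
LOOPY operators themselves are not used here.

```lisp
;; A NIL in the template ignores the corresponding element;
;; the dotted tail C receives the rest of each sublist.
(defun heads-and-tails (lists)
  (loop for (a nil . c) in lists
        collect (list a c)))

;; A template longer than the data is allowed: missing values become NIL.
(defun padded-triples (lists)
  (loop for (a b c) in lists
        collect (list a b c)))

(heads-and-tails '((1 2 3) (4 5 6)))   ; => ((1 (3)) (4 (6)))
(padded-triples '((1 2)))              ; => ((1 2 NIL))
```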


alisp has the following non-standard types:

- FUNCTION-NAME is a function name as defined by ANSI, that is a symbol or a
  two-element list of CL:SETF and a symbol

- COMPILED-METHOD is a subtype of METHOD and represents a compiled method

- the types BACKQUOTE, COMMA, AT and DOT represent the respective elements in
  backquote notation.  You can call the NEXT function to reach the next element
  when traversing them


alisp defines these non-standard conditions:

- MAXIMUM-STACK-DEPTH-EXCEEDED derives from PROGRAM-ERROR and is raised when the
  stack depth of lisp code exceeds the limit.  The MAX-DEPTH slot holds that
  limit

- WRONG-NUMBER-OF-ARGUMENTS derives from PROGRAM-ERROR and is raised when a
  function gets the wrong number of arguments.  The minimum and maximum number
  of accepted arguments are in the slots MIN-ARGS and MAX-ARGS respectively

- UNKNOWN-KEYWORD-ARGUMENT derives from PROGRAM-ERROR and is raised when a
  function gets an unknown keyword argument

- ODD-NUMBER-OF-ARGUMENTS-IN-KEYWORD-PART-OF-FORM derives from PROGRAM-ERROR and
  is raised when a function with keyword arguments receives an odd number of
  arguments in the keyword part of the form

- INVALID-FORM derives from PROGRAM-ERROR and is raised when you attempt
  evaluating an invalid form; the form itself is stored in the FORM slot



__Literal objects__

In alisp, the reader always produces fresh objects.  Therefore modifying
literals, albeit undefined by ANSI, works as expected.

Of course, such undefined behavior should be avoided if you want best
portability.
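
For code that must also run elsewhere, the usual portable approach is to copy
the literal before modifying it.  A minimal sketch, with a made-up function
name:

```lisp
;; Return a fresh, modifiable copy of the literal on every call.
(defun fresh-greeting ()
  (copy-seq "hello"))

(let ((s (fresh-greeting)))
  (setf (char s 0) #\j)   ; modifies the copy only
  s)                      ; => "jello"

(fresh-greeting)          ; => "hello"
```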



__Backquote notation__

When evaluating a backquoted expression, alisp keeps as much as possible of the
source list structure.  Consider for example this form:

 `(0 1 ,(gensym) 2 3)

Each time you evaluate it, the first three conses of the result will get
allocated fresh, but the third cdr will point to the fourth cons of the source
form.

This may have unexpected results: if you later modify the fourth cons of the
returned list, you will modify the backquote form itself.

If you want to ensure that each evaluation produces an entirely fresh list
structure, you can prefix a comma and a tick to the last car, like this:

 `(0 1 ,(gensym) 2 ,'3)
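
Another option, which does not depend on how a given implementation expands
backquote, is to copy the result with COPY-LIST.  The sketch below uses made-up
function names.  TAILS-SHARED-P should return T in alisp given the behavior
described above, but ANSI allows either outcome.

```lisp
;; Build a list from the backquote template discussed above.
(defun template ()
  `(0 1 ,(gensym) 2 3))

;; True when two results share the tail that follows the unquoted element.
(defun tails-shared-p ()
  (eq (cdddr (template)) (cdddr (template))))

;; Copying the top-level conses guarantees a fresh list on every call.
(defun fresh-template ()
  (copy-list (template)))
```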



__Setting an undefined variable__

Doing SETQ or SETF on an undefined variable is undefined in ANSI, but it is
accepted by many implementations.  In al, this causes the variable to be set
globally and proclaimed special, so it is equivalent to a DEFPARAMETER.
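
One consequence worth keeping in mind is that a later LET on the same symbol
rebinds it dynamically.  The sketch below spells the effect out with an explicit
DEFPARAMETER so that it behaves the same in any implementation; the names
COUNTER and READ-COUNTER are made up.

```lisp
;; Portable spelling of what (setq counter 0) does in al
;; when COUNTER is not yet defined.
(defparameter counter 0)

(defun read-counter ()
  counter)

;; COUNTER is special, so this LET is a dynamic rebinding
;; and READ-COUNTER sees the new value.
(let ((counter 42))
  (read-counter))         ; => 42

(read-counter)            ; => 0
```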



__Arrays__

All arrays are adjustable in alisp.  As long as it's an interpreter, there's no
reason to do otherwise.
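
Code that should also work in other implementations can still request
adjustability explicitly and use the value returned by ADJUST-ARRAY.  A small
sketch, with a made-up helper name:

```lisp
;; Grow ARRAY to NEW-SIZE, filling the new elements with 0.
;; The return value is used because ANSI only guarantees in-place
;; adjustment for actually adjustable arrays.
(defun grow (array new-size)
  (adjust-array array new-size :initial-element 0))

(grow (make-array 3 :initial-element 1 :adjustable t) 5)
;; => #(1 1 1 0 0)
```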



__Structure objects and classes__

Structure constructors fill with NIL those slots that are not initialized with
keyword arguments and that don't have an initialization form.
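
As an illustration of the rule above, consider a made-up structure with one slot
lacking an initialization form.  ANSI leaves the initial content of such a slot
unspecified, so portable code should not rely on the last result.

```lisp
(defstruct point
  x          ; no initialization form
  (y 0))     ; initialization form 0

(point-y (make-point))        ; => 0
(point-x (make-point :x 3))   ; => 3
(point-x (make-point))        ; NIL in alisp, per the rule above
```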

The TYPE and READ-ONLY options for slots have no effect.

Redefining a structure type works fine, despite being undefined in ANSI.  If you
redefine a structure type, the existing structures of that type will stick to
their original definition.



__Condition objects and classes__

Condition objects and classes are fully implemented as CLOS objects and classes
respectively.



__Pathnames__

I don't like the filename API of Common Lisp very much.  I think it tries to be
so abstract as to accommodate every conceivable filesystem, while at the same
time giving so much leeway to the implementers that you can assume very little
about each implementation.

The syntax is also puzzling: if you want to represent the file "/home/foo/",
then why represent it as (:ABSOLUTE "home" "foo")?  The former is a simple
and recognizable string, while the latter means allocating three conses, a
symbol and two strings, which is quite inefficient.
The standard seemingly implies that the second syntax is better because it is
independent of path separator characters, but is that so?  If you port your
program to different or exotic systems, important files will likely be in
totally different places, so separator characters will be the least of your
concerns.

I'd go as far as recommending that you represent filenames as plain strings,
avoiding pathname objects entirely.  (I think that all pathname functions also
accept plain strings.)
(See also the file WHY-NO-PATHNAMES.)

That said, alisp represents a pathname object as a string internally; you can
access the underlying string with NAMESTRING.

If you really need to access the "components" of a path, those are extracted
according to the following syntax: in "/home/foo/bar.baz", the directory is
"/home/foo/", the name is "bar" and the type is "baz".

If you specify :WILD as a path component, that component becomes a single
asterisk.  The value :WILD-INFERIORS becomes two asterisks, but most POSIX
shells don't interpret those in a special way.  The value :UNSPECIFIC is never
allowed in any component of a pathname.
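
The name and type components of the "/home/foo/bar.baz" example above can be
extracted with the standard accessors, which accept plain strings.  The helper
name below is made up; the directory component is left out of this sketch
because its printed form differs between implementations.

```lisp
;; Return the name and type components of PATH as a two-element list.
(defun name-and-type (path)
  (list (pathname-name path) (pathname-type path)))

(name-and-type "/home/foo/bar.baz")   ; => ("bar" "baz")
```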

The truename of a file is just the file path, there's no resolution.

USER-HOMEDIR-PATHNAME tries reading the HOME environment variable, and nothing
else.

If you need to support a system other than POSIX, you have to change a few
things, but it's not hard.



__File operations__

PROBE-FILE does not work very well.  It tries to open the file for reading and
returns NIL if it can't, so it may return NIL merely for lack of permissions.
OPEN uses the same approach to determine if the file exists.  In the future I
will probably add optional use of the POSIX API for better file operations.



__Streams__

No stream is deemed interactive in alisp.



__Language of implementation__

The alisp codebase should be valid C89, to the best of my knowledge.
Unfortunately, alisp as a whole is not C89, since libgmp, which is a required
dependency, seemingly is not.  I will remove the dependency on libgmp at some
point.  I don't know about libreadline, but you can still build without it.



__Loading cl.lisp__

If you don't load cl.lisp, you still have a decent and self-sufficient subset of
Common Lisp.



__Using ASDF__

alisp ships with a modified version of ASDF that you can load.  Only
ASDF:LOAD-SYSTEM has been confirmed to work.



__Number bases__

*READ-BASE* works correctly, so you can read numbers in any base from 2 to 36.
*PRINT-BASE*, instead, only works with the bases 8, 10 and 16, due to a
limitation of libgmp.
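
A short sketch of both variables in use, in standard CL.  The helper names are
made up, and the printing example sticks to base 8, one of the bases listed
above.

```lisp
;; Read an integer from STRING with *READ-BASE* bound to BASE.
(defun parse-in-base (string base)
  (let ((*read-base* base))
    (values (read-from-string string))))

;; Print INTEGER in octal, without a radix prefix.
(defun octal-string (integer)
  (let ((*print-base* 8)
        (*print-radix* nil))
    (prin1-to-string integer)))

(parse-in-base "101" 2)    ; => 5
(parse-in-base "zz" 36)    ; => 1295
(octal-string 8)           ; => "10"
```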



__Character encoding__

al stores strings as arrays of bytes, so it is agnostic to their encoding as
long as it recognizes basic ASCII characters like parentheses, ticks, etc.
Better support for Unicode will come in the future.



__Garbage collection__

For garbage collection, alisp uses the algorithm described in "A cyclic
reference counting algorithm and its proof" by Pepels, van Eekelen, Plasmeijer
(1988).  This is a kind of enhanced reference counting that also collects loops.
The paper contains a proof of termination and correctness.  I don't know of
other implementations using this, so this is somewhat experimental.

Constants defined with DEFCONSTANT are skipped when doing traversals of the
reference graph, so defining constants has true performance benefits.

Package objects are also skipped in reference counting.



__The ROOM function__

Calling ROOM shows the number of living objects of various types.  T means all
living objects; FUNCTION also includes macros.



__Profiling__

al has a basic profiler.  You can start profiling with START-PROFILING, which
introduces some overhead, and stop it with STOP-PROFILING.

If you later start profiling again, al will keep adding to the previous data;
calling CLEAR-PROFILING clears all data.

REPORT-PROFILING returns a list with one element per function or macro.  Each
element is itself a list containing, in order: the name; the number of times
the function or macro has been called; the total time spent in it, including
the time spent in all its descendants in the call graph (in the same unit used
by clock () of your C library, often microseconds); and the average time spent
per call, that is the ratio between the second and first number.

For functions, the total time doesn't include evaluation of arguments.  For
macros, the time includes both expansion and evaluation of result.

Keep in mind that, when a function calls itself directly or indirectly, the time
spent in the inner invocation is counted twice, so the last number is often more
useful than the second one.
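
Since the average is usually the number to look at, it can be handy to sort a
report by it.  The helper below is only a sketch with a made-up name, and the
sample data is invented; it assumes each entry follows the four-element layout
described above, so the result of AL:REPORT-PROFILING could be passed to it in
the same way.

```lisp
;; Return a copy of REPORT sorted by decreasing average time per call,
;; which is the fourth element of each entry.
(defun sort-by-average (report)
  (sort (copy-list report) #'> :key #'fourth))

;; Invented sample entries: (name calls total-time average)
(sort-by-average '((foo 10 500 50) (bar 2 400 200) (baz 5 50 10)))
;; => ((BAR 2 400 200) (FOO 10 500 50) (BAZ 5 50 10))
```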



__Environment objects__

In alisp, an environment object is a list with a particular structure.  Since
environments are not used often and mostly at compile time, I didn't feel the
need for a more specialized and efficient representation.  As an added benefit,
using a list makes these objects very easy to inspect and modify.



__Compilation__

The variable AL:*COMPILE-WHEN-DEFINING* defaults to NIL; when it is non-nil,
DEFUN and DEFMACRO will compile the body of the function when they are
evaluated.  Also, each newly defined generic function will be marked as
compiled, meaning that each new method added to that function will get compiled.

You may want to keep the variable disabled for debugging, since stepping through
a macro-expanded function is less clear.

If a function is compiled, FUNCTION-LAMBDA-EXPRESSION will return the body of
the function macro-expanded.



__Compiler macros__

Definitions of compiler macros are stored in AL:*COMPILER-MACRO-REGISTRY*, an
EQUAL hash table indexed by macro name.

Compiler macros are expanded when compiling, only if AL:*EXPAND-COMPILER-MACROS*
is non-nil.



__Error reporting__

When I started writing alisp, the condition system was not in place, so I had to
devise a simpler scheme for reporting errors: each time the interpreter
encountered an abnormal situation, it would print an appropriate error message
and abort to top level.
This was not terribly useful, since you couldn't really handle those conditions
nor enter the debugger.
Then at some point I implemented a decent subset of the condition system, so I
started replacing the old system of reporting with the one that ANSI requires.
Many types of error situations now raise proper conditions; the old system is
used in some places, but I will gradually replace it.



__Debugging__

Stepping is always available in the debugger, no matter if you enter it with
BREAK or in any other way.  The following commands are available:

- N executes the next form and then breaks again

- X only steps over the macroexpansion of the next form, if it is a
  (non-builtin) unexpanded macro; otherwise it behaves like N

- S steps inside the next form and breaks; for function forms, it will first
  step in the argument forms; for unexpanded macro forms, it will first step in
  the macro expansion process, then in the resulting form

- C continues execution at normal speed

- E continues execution until the end of current function

- BT is equivalent to (AL:PRINT-BACKTRACE)

- H or ? display help.

When the debugger is entered due to stepping, the result of evaluating the last
form is displayed preceded by " -> ", then a blank line, then the next form is
shown.  If the next form is a non-builtin macro, then it is followed by
"(macro)".  If you input an empty line at the debugger prompt, the last form or
debugging command is executed again.



__Watchpoints__

You can watch standard objects and hash tables for modifications.  The function
WATCH takes a standard object or a hash table as argument and toggles watching
on it; it returns T if the object is one of those types, NIL otherwise.
UNWATCH turns off watching on its argument.

When a watched object is modified in any field or a watched hash table has an
object added or removed or is cleared, the debugger is entered.



__Known bugs and limitations__

- Definition and changing of reader macros is not yet implemented

- Package locks and deletion of packages are still missing

- Displaced arrays are not yet implemented

- Meeting an unexpected dotted list in some places will cause a crash

- The 'short form' of DEFSETF is not yet implemented



__Policy on use of Large Language Models__

I'm the only author of alisp and I never use Large Language Models, in
particular I don't use them for writing code or documentation or tests, nor for
finding or solving bugs.



__Minor details__

In standard functions that take both a :TEST and :TEST-NOT argument, the former
takes precedence.
