From: Olin Shivers <shivers@lambda.ai.mit.edu>
Subject: New library-search feature for scsh -- comments requested
Date: 1999/10/26
Message-ID: <qijg0yx3e5k.fsf@lambda.ai.mit.edu>
Organization: Artificial Intelligence Lab, MIT
Followup-To: comp.lang.scheme.scsh
Newsgroups: comp.lang.scheme.scsh,comp.lang.scheme
I'm adding a new feature to scsh to support scripting, and I'd really
appreciate feedback -- now is the time to get it right, before it's
released out into the world.
There's a long-standing problem with scripting in scsh that I'd like to fix
for scripts that need to load modules. In scsh, you can cause a script to load
in a file of code for a module by using the
-lm <module-file>
switch on the command line (on the #! trigger line). But this requires you to
specify *too much* information, actually: you have to give the *complete* file
name, which is simply handed to LOAD. This turns out, in practice, to be a
real pain, because the system libraries may live in different directories
on different systems.
What you'd like to do, really, is say something like "look around for the
module file geometric-utils.scm, and load it in." That is, you'd like a
search path for the loader to go find libraries in some standard set of
library directories... where the definition of that "standard set" might
vary from installation to installation, or user to user.
Now, a really solid solution to this issue is going to need a real
redesign of the module loading system, which has to come from the S48
folks. But I am going to propose an interim fix for scsh users.
I am planning to add some new switches to the command-line parser.
-ll <module-file-name>
Load library module into config package.
This is just like the -lm switch, except that it searches the
library-directory path list for the file to load.
Specifically, it means: search through the LIBRARY-DIRECTORIES list of
directories looking for a module file of the given name, and load it
in.
The LIBRARY-DIRECTORIES list defaults to an installation-specific
value, which is typically
("/usr/local/lib/scsh/modules/")
If the environment variable $SCSH_LIB_DIRS is set, it is used
to determine the library search path. The value of this environment
variable is treated as a sequence of s-expressions, which are "read"
from the string.
- A string is treated as a directory.
- #F is replaced with the default list
of directories.
A SCSH_LIB_DIRS assignment of this form
SCSH_LIB_DIRS='"." "/usr/contrib/lib/scsh/" #f "/home/shivers/lib/scsh"'
would produce this list of strings for the LIBRARY-DIRECTORIES list:
("." "/usr/contrib/lib/scsh/"
"/usr/local/lib/scsh/modules/"
"/home/shivers/lib/scsh")
Here is a sample bit of bash code that will add a subdir from
your home and project directories to the default list:
SCSH_LIB_DIRS="\"$HOME/lib/scsh\" \"/usr/project/lib/scsh\" #f"
Notice we had to backslash-escape each string's double-quotes for bash.
Let's add the scsh process' current working directory to that list:
SCSH_LIB_DIRS="\".\" $SCSH_LIB_DIRS"
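A minimal sketch of the expansion described above, written in Python purely for illustration (scsh itself would presumably use the Scheme reader). The names parse_lib_dirs and DEFAULT_DIRS are hypothetical, and the default list is just the "typical" value quoted earlier. The sketch accepts only double-quoted strings and #f, and raises ValueError where scsh would signal a startup error; it does not try to tell a bogus value from a genuine read error the way a real reader would.

```python
DEFAULT_DIRS = ["/usr/local/lib/scsh/modules/"]  # assumed installation default


def parse_lib_dirs(value, default=DEFAULT_DIRS):
    """Expand a $SCSH_LIB_DIRS-style string into a list of directories.

    Hypothetical helper: only "strings" and #f are accepted; anything
    else raises ValueError, standing in for scsh's startup error.
    """
    dirs = []
    i, n = 0, len(value)
    while i < n:
        ch = value[i]
        if ch.isspace():
            i += 1
        elif ch == '"':
            # Read a string literal, honouring backslash escapes.
            i += 1
            chars = []
            while i < n and value[i] != '"':
                if value[i] == "\\" and i + 1 < n:
                    i += 1
                chars.append(value[i])
                i += 1
            if i >= n:
                raise ValueError("read error: no close-quote")
            i += 1  # skip the closing quote
            dirs.append("".join(chars))
        elif value[i:i + 2].lower() == "#f" and (i + 2 == n or value[i + 2].isspace()):
            dirs.extend(default)  # #f splices in the default list
            i += 2
        else:
            raise ValueError("not a string or #f at offset %d" % i)
    return dirs
```

Feeding it the four-element assignment shown earlier gives ".", "/usr/contrib/lib/scsh/", the default directory, and "/home/shivers/lib/scsh", in that order.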
[One might consider allowing symbols as well as strings. The problem
is that this requires knowing how the reader deals with case when
reading symbols. Scsh's reader is case-preserving; R5RS's reader is
case-folding. I try to minimise scsh's *dependence* on case-preserving
symbol reading; the process notation is currently the only place where
it ever really matters.
The advantage of allowing symbols is that setting the environment variable in bash
.profile or other init files becomes easier. The above example
becomes
SCSH_LIB_DIRS="$HOME/lib/scsh /usr/project/lib/scsh #f"
The backslash-escaped double-quotes go away, which is nicer.]
When searching for a directory containing a given library module,
nonexistent or read-protected directories are silently ignored;
it is not an error to have them in the LIBRARY-DIRECTORIES list.
It *is* a startup error if reading the $SCSH_LIB_DIRS env var causes
a read error, or produces a value that isn't a string or #f.
E.g., these are values of $SCSH_LIB_DIRS that will blow up scsh:
    SCSH_LIB_DIRS="3.14 foo (3)"        [Bogus values]
    SCSH_LIB_DIRS="\"/usr/lib/scsh"     [read error -- no close-quote]
    SCSH_LIB_DIRS="."                   [read error -- illegal sexp]
Directory search can be recursive. A directory name that ends
with a slash is recursively searched. So this list
("/usr/local/lib/scsh/modules/" "/home/shivers/lib/scsh")
means
1. First search /usr/local/lib/scsh/modules and all its
subdirectories;
2. Then search /home/shivers/lib/scsh *non*-recursively.
Recursive search is depth-first, with directories ordered by the
Unix directory ordering, that is, the order in which ls(1) prints
out directories: ASCII alphabetical/lexicographical (which places
capitalised filenames before lower-case filenames), but dot files
sort before non-dot files.
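Here is a rough Python sketch of that search, for illustration only. The names find_module and _search are hypothetical. Two things are my own assumptions rather than part of the proposal: a directory's own files are checked before descending into its subdirectories, and symbolic links to directories are not followed (to avoid cycles). Python's sorted() orders names by code point, which puts capitals before lower case and "." before letters and digits, so it only approximates the ordering described above.

```python
import os


def find_module(name, library_directories):
    """Return the path of the first file called `name`, or None."""
    for entry in library_directories:
        # A trailing slash asks for a recursive search of that directory.
        found = _search(entry, name, recursive=entry.endswith("/"))
        if found is not None:
            return found
    return None


def _search(directory, name, recursive):
    try:
        names = sorted(os.listdir(directory))  # plain code-point order
    except OSError:
        return None  # nonexistent or unreadable directories are skipped
    candidate = os.path.join(directory, name)
    if name in names and os.path.isfile(candidate):
        return candidate
    if recursive:
        # Depth-first: finish each subdirectory before trying the next.
        for sub in names:
            path = os.path.join(directory, sub)
            if os.path.isdir(path) and not os.path.islink(path):
                found = _search(path, name, recursive=True)
                if found is not None:
                    return found
    return None
```

With the two-element list above, a module that lives only in a subdirectory of /home/shivers/lib/scsh would not be found, since that entry has no trailing slash.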
+lp <lib-dir>
lp+ <lib-dir>
Add directory <lib-dir> to the beginning or end of the
LIBRARY-DIRECTORIES path list, respectively.
<lib-dir> is a *single* directory. It is not split at colons or
otherwise processed.
+lpe
lpe+
As above, except that ~ home-directory syntax and environment
variables are expanded out.
-lp-clear
-lp-default
Set the LIBRARY-DIRECTORIES path list to the empty list and the system
default, respectively.
These two switches are useful if you would like to protect
your script from influence by the $SCSH_LIB_DIRS variable.
In these cases, the SCSH_LIB_DIRS environment variable is never
even parsed, so a bogus value will not affect the script's
execution at all.
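To make the intended behaviour of the path switches concrete, here is a small Python sketch; it is not scsh's command-line parser. apply_path_switches, initial_dirs and DEFAULT_DIRS are hypothetical names. The starting list is supplied as a callable so that the environment variable is only consulted when some switch actually needs the existing list. That laziness is my guess at how "never even parsed" could be achieved, and it only holds when the clear/default switch comes before any add switch. The ~ and variable expansion uses Python's own rules, which may differ from scsh's.

```python
import os

DEFAULT_DIRS = ["/usr/local/lib/scsh/modules/"]  # assumed installation default


def apply_path_switches(args, initial_dirs):
    """Apply the path switches in `args` (a flat list) left to right.

    `initial_dirs` is a zero-argument callable returning the starting
    list, e.g. one that parses the environment variable. Switches other
    than the six path switches are ignored here.
    """
    dirs = None  # None means "starting list not yet computed"

    def current():
        return list(initial_dirs()) if dirs is None else dirs

    i = 0
    while i < len(args):
        switch = args[i]
        if switch == "-lp-clear":
            dirs = []
        elif switch == "-lp-default":
            dirs = list(DEFAULT_DIRS)
        elif switch in ("+lp", "lp+", "+lpe", "lpe+"):
            i += 1
            lib_dir = args[i]  # a single directory, never split at colons
            if switch in ("+lpe", "lpe+"):
                lib_dir = os.path.expanduser(os.path.expandvars(lib_dir))
            if switch.startswith("+"):
                dirs = [lib_dir] + current()  # add at the beginning
            else:
                dirs = current() + [lib_dir]  # add at the end
        i += 1
    return current()
```

A script whose trigger line starts with -lp-clear followed by lp+ /some/dir would then search exactly one directory, whatever the user's environment says.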
These switches and the $SCSH_LIB_DIRS environment variable allow you to dump
useful modules into a central repository in, say, /usr/local or your home
directory; your scripts can easily access them from these places.
As a design note, one might consider encoding the $SCSH_LIB_DIRS variable in a
more traditional Unix way, as a colon-delimited path list. There are some
problems with this approach that would require us to "patch" the standard
Unix technique:
- Directories containing colons in them are not allowed -- which
is a problem, as /usr/foo:bar is a perfectly legal Unix file name.
We would need to make backslash a special quoting character to
handle this.
- Unix path lists usually use colon to *separate* file names,
not to *terminate* filenames. This is ambiguous in the empty-string
case. Is the empty string the empty list, or the singleton list
of the empty string -- i.e. does it parse as path list () or ("")?
We can patch this by going with a non-conventional colon-terminator
grammar for the path list encoding.
- We need a way to encode the system default path list.
We could adopt the TeX convention of using the empty string
to signify this. But this means we can't use empty string
as the root directory. We could either accept this limitation,
or change our slash-terminator-means-recursion convention to
a double-slash-terminator-means-recursion convention. Now we
can use "/" for the root directory, and empty string becomes
available.
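The first two problems are easy to see with a few lines of Python (illustration only; split_separator and split_terminator are hypothetical names). Under the separator reading, the empty string can only mean the one-element list (""), so there is no way to write the empty list; under the terminator reading both lists can be written. Neither version handles a colon inside a file name, which would need the backslash quoting mentioned above.

```python
def split_separator(path):
    """Colon as a separator: the usual Unix reading."""
    return path.split(":")


def split_terminator(path):
    """Colon as a terminator: every element must end with a colon."""
    if path and not path.endswith(":"):
        raise ValueError("missing final colon")
    return path.split(":")[:-1]


print(split_separator(""))               # [''] -- the empty list is inexpressible
print(split_terminator(""))              # []
print(split_terminator(":"))             # ['']
print(split_separator("/usr/foo:bar"))   # ['/usr/foo', 'bar'] -- one name, split in two
```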
With these three patches -- which give us a system rather different from
the standard (broken) Unix convention -- we have something workable. This
strikes me as being neither fish nor fowl. If you are going to break with
convention, you might as well do something that makes it clear you have
departed from the standard method.
But I'm interested to hear people's opinions.
I am going to hack this into the development sources. As I said, I'd like to
hear from people what they think of the particulars of this design -- comments
or suggestions or criticisms would be greatly appreciated.
Thanks!
-Olin