Chapter 3

System Calls

Scsh provides (almost) complete access to the basic Unix kernel services: processes, files, signals and so forth. These procedures comprise a Scheme binding for POSIX, with a few of the more standard extensions thrown in (e.g., symbolic links, fchown, fstat, sockets).

3.1 Errors

Scsh syscalls never return error codes, and do not use a global errno variable to report errors. Errors are consistently reported by raising exceptions. This frees up the procedures to return useful values, and allows the programmer to assume that if a syscall returns, it succeeded. This greatly simplifies the flow of the code from the programmer's point of view.

Since Scheme does not yet have a standard exception system, the scsh definition remains somewhat vague on the actual form of exceptions and exception handlers. When a standard exception system is defined, scsh will move to it. For now, scsh uses the Scheme 48 exception system, with a simple sugaring on top to hide the details in the common case.

System call error exceptions contain the Unix errno code reported by the system call. Unlike C, the errno value is a part of the exception packet, it is not accessed through a global variable.

For reference purposes, the Unix errno numbers are bound to the variables errno/perm, errno/noent, etc. System calls never return error/intr -- they automatically retry.

(errno-error errno syscall . data) ---> no return value (procedure)

Raises a Unix error exception for Unix error number errno. The syscall and data arguments are packaged up in the exception packet passed to the exception handler.

(with-errno-handler* handler thunk) ---> value(s) of thunk (procedure)

(with-errno-handler handler-spec . body) ---> value of body (syntax)

Unix syscalls raise error exceptions by calling errno-error. Programs can use with-errno-handler* to establish handlers for these exceptions.
If a Unix error arises while thunk is executing, handler is called on two arguments like this:

(handler errno packet)
packet is a list of the form
packet = (errno-msg syscall . data),
where errno-msg is the standard Unix error message for the error, syscall is the procedure that generated the error, and data is a list of information generated by the error, which varies from syscall to syscall.

If handler returns, the handler search continues upwards. Handler can acquire the exception by invoking a saved continuation. This procedure can be sugared over with the following syntax:

(with-errno-handler ((errno packet) clause ...) body1 body2 ...)
This form executes the body forms with a particular errno handler installed. When an errno error is raised, the handler search machinery will bind variable errno to the error's integer code, and variable packet to the error's auxiliary data packet. Then, the clauses will be checked for a match. The first clause that matches is executed, and its value is the value of the entire with-errno-handler form. If no clause matches, the handler search continues.
Error clauses have two forms

((errno ...) body ...) (else body ...)
In the first type of clause, the errno forms are integer expressions. They are evaluated and compared to the error's errno value. An else clause matches any errno value. Note that the errno and data variables are lexically visible to the error clauses.
Example:

(with-errno-handler ((errno packet) ; Only handle 3 particular errors. ((errno/wouldblock errno/again) (loop)) ((errno/acces) (format #t "Not allowed access!") #f)) (foo frobbotz) (blatz garglemumph))
It is not defined what dynamic context the handler executes in, so fluid variables cannot reliably be referenced.
Note that Scsh system calls always retry when interrupted, so that the errno/intr exception is never raised. If the programmer wishes to abort a system call on an interrupt, he should have the interrupt handler explicitly raise an exception or invoke a stored continuation to throw out of the system call.

3.1.1 Interactive mode and error handling

Scsh runs in two modes: interactive and script mode. It starts up in interactive mode if the scsh interpreter is started up with no script argument. Otherwise, scsh starts up in script mode. The mode determines whether scsh prints prompts in between reading and evaluating forms, and it affects the default error handler. In interactive mode, the default error handler will report the error, and generate an interactive breakpoint so that the user can interact with the system to examine, fix, or dismiss from the error. In script mode, the default error handler causes the scsh process to exit.

When scsh forks a child with (fork), the child resets to script mode. This can be overridden if the programmer wishes.

3.2 I/O

3.2.1 Standard R5RS I/O procedures

In scsh, most standard R5RS I/O operations (such as display or read-char) work on both integer file descriptors and Scheme ports. When doing I/O with a file descriptor, the I/O operation is done directly on the file, bypassing any buffered data that may have accumulated in an associated port. Note that character-at-a-time operations such as read-char are likely to be quite slow when performed directly upon file descriptors.

The standard R5RS procedures read-char, char-ready?, write, display, newline, and write-char are all generic, accepting integer file descriptor arguments as well as ports. Scsh also mandates the availability of format, and further requires format to accept file descriptor arguments as well as ports.

The procedures peek-char and read do not accept file descriptor arguments, since these functions require the ability to read ahead in the input stream, a feature not supported by Unix I/O.

3.2.2 Port manipulation and standard ports

(close-after port consumer) ---> value(s) of consumer (procedure)

Returns (consumer port), but closes the port on return. No dynamic-wind magic.
Remark: Is there a less-awkward name?

(error-output-port) ---> port (procedure)

This procedure is analogous to current-output-port, but produces a port used for error messages -- the scsh equivalent of stderr.

(with-current-input-port* port thunk) ---> value(s) of thunk (procedure)

(with-current-output-port* port thunk) ---> value(s) of thunk (procedure)

(with-error-output-port* port thunk) ---> value(s) of thunk (procedure)

These procedures install port as the current input, current output, and error output port, respectively, for the duration of a call to thunk.

(with-current-input-port port . body) ---> value(s) of body (syntax)

(with-current-output-port port . body) ---> value(s) of body (syntax)

(with-error-output-port port . body) ---> value(s) of body (syntax)

These special forms are simply syntactic sugar for the with-current-input-port* procedure and friends.

(set-current-input-port! port) ---> undefined (procedure)

(set-current-output-port! port) ---> undefined (procedure)

(set-error-output-port! port) ---> undefined (procedure)

These procedures alter the dynamic binding of the current I/O port procedures to new values.

(close fd/port) ---> boolean (procedure)

Close the port or file descriptor.
If fd/port is a file descriptor, and it has a port allocated to it, the port is shifted to a new file descriptor created with (dup fd/port) before closing fd/port. The port then has its revealed count set to zero. This reflects the design criteria that ports are not associated with file descriptors, but with open files.

To close a file descriptor, and any associated port it might have, you must instead say one of (as appropriate):

(close (fdes->inport fd)) (close (fdes->outport fd))
The procedure returns true if it closed an open port. If the port was already closed, it returns false; this is not an error.

(stdports->stdio) ---> undefined (procedure)

(stdio->stdports) ---> undefined (procedure)

These two procedures are used to synchronise Unix' standard I/O file descriptors and Scheme's current I/O ports.
(stdports->stdio) causes the standard I/O file descriptors (0, 1, and 2) to take their values from the current I/O ports. It is exactly equivalent to the series of redirections:³

(dup (current-input-port) 0) (dup (current-output-port) 1) (dup (error-output-port) 2)
stdio->stdports causes the bindings of the current I/O ports to be changed to ports constructed over the standard I/O file descriptors. It is exactly equivalent to the series of assignments
(set-current-input-port! (fdes->inport 0)) (set-current-output-port! (fdes->outport 1)) (set-error-output-port! (fdes->outport 2))
However, you are more likely to find the dynamic-extent variant, with-stdio-ports*, below, to be of use in general programming.

(with-stdio-ports* thunk) ---> value(s) of thunk (procedure)

(with-stdio-ports body ...) ---> value(s) of body (syntax)

with-stdio-ports* binds the standard ports (current-input-port), (current-output-port), and (error-output-port) to be ports on file descriptors 0, 1, 2, and then calls thunk. It is equivalent to:
(with-current-input-port (fdes->inport 0) (with-current-output-port (fdes->inport 1) (with-error-output-port (fdes->outport 2) (thunk))))
The with-stdio-ports special form is merely syntactic sugar.

3.2.3 String ports

Scheme 48 has string ports, which you can use. Scsh has not committed to the particular interface or names that Scheme 48 uses, so be warned that the interface described herein may be liable to change.

(make-string-input-port string) ---> port (procedure)

Returns a port that reads characters from the supplied string.

(make-string-output-port) ---> port (procedure)

(string-output-port-output port) ---> string (procedure)

A string output port is a port that collects the characters given to it into a string. The accumulated string is retrieved by applying string-output-port-output to the port.

(call-with-string-output-port procedure) ---> string (procedure)

The procedure value is called on a port. When it returns, call-with-string-output-port returns a string containing the characters that were written to that port during the execution of procedure.

3.2.4 Revealed ports and file descriptors

The material in this section and the following one is not critical for most applications. You may safely skim or completely skip this section on a first reading.

Dealing with Unix file descriptors in a Scheme environment is difficult. In Unix, open files are part of the process environment, and are referenced by small integers called file descriptors. Open file descriptors are the fundamental way I/O redirections are passed to subprocesses, since file descriptors are preserved across fork's and exec's.

Scheme, on the other hand, uses ports for specifying I/O sources. Ports are garbage-collected Scheme objects, not integers. Ports can be garbage collected; when a port is collected, it is also closed. Because file descriptors are just integers, it's impossible to garbage collect them -- you wouldn't be able to close file descriptor 3 unless there were no 3's in the system, and you could further prove that your program would never again compute a 3. This is difficult at best.

If a Scheme program only used Scheme ports, and never actually used file descriptors, this would not be a problem. But Scheme code must descend to the file descriptor level in at least two circumstances:

when interfacing to foreign code
when interfacing to a subprocess.

This causes a problem. Suppose we have a Scheme port constructed on top of file descriptor 2. We intend to fork off a program that will inherit this file descriptor. If we drop references to the port, the garbage collector may prematurely close file 2 before we fork the subprocess. The interface described below is intended to fix this and other problems arising from the mismatch between ports and file descriptors.

The Scheme kernel maintains a port table that maps a file descriptor to the Scheme port allocated for it (or, #f if there is no port allocated for this file descriptor). This is used to ensure that there is at most one open port for each open file descriptor.

The port data structure for file ports has two fields besides the descriptor: revealed and closed?. When a file port is closed with (close port), the port's file descriptor is closed, its entry in the port table is cleared, and the port's closed? field is set to true.

When a file descriptor is closed with (close fdes), any associated port is shifted to a new file descriptor created with (dup fdes). The port has its revealed count reset to zero (and hence becomes eligible for closing on GC). See discussion below. To really put a stake through a descriptor's heart without waiting for associated ports to be GC'd, you must say one of

(close (fdes->inport fdes)) (close (fdes->output fdes))

The revealed field is an aid to garbage collection. It is an integer semaphore. If it is zero, the port's file descriptor can be closed when the port is collected. Essentially, the revealed field reflects whether or not the port's file descriptor has escaped to the Scheme user. If the Scheme user doesn't know what file descriptor is associated with a given port, then he can't possibly retain an ``integer handle'' on the port after dropping pointers to the port itself, so the garbage collector is free to close the file.

Ports allocated with open-output-file and open-input-file are unrevealed ports -- i.e., revealed is initialised to 0. No one knows the port's file descriptor, so the file descriptor can be closed when the port is collected.

The functions fdes->output-port, fdes->input-port, port->fdes are used to shift back and forth between file descriptors and ports. When port->fdes reveals a port's file descriptor, it increments the port's revealed field. When the user is through with the file descriptor, he can call (release-port-handle port), which decrements the count. The function (call/fdes fd/port proc) automates this protocol. call/fdes uses dynamic-wind to enforce the protocol. If proc throws out of the call/fdes application, the unwind handler releases the descriptor handle; if the user subsequently tries to throw back into proc's context, the wind handler raises an error. When the user maps a file descriptor to a port with fdes->outport or fdes->inport, the port has its revealed field incremented.

Not all file descriptors are created by requests to make ports. Some are inherited on process invocation via exec(2), and are simply part of the global environment. Subprocesses may depend upon them, so if a port is later allocated for these file descriptors, is should be considered as a revealed port. For example, when the Scheme shell's process starts up, it opens ports on file descriptors 0, 1, and 2 for the initial values of (current-input-port), (current-output-port), and (error-output-port). These ports are initialised with revealed set to 1, so that stdin, stdout, and stderr are not closed even if the user drops the port.

Unrevealed file ports have the nice property that they can be closed when all pointers to the port are dropped. This can happen during gc, or at an exec() -- since all memory is dropped at an exec(). No one knows the file descriptor associated with the port, so the exec'd process certainly can't refer to it.

This facility preserves the transparent close-on-collect property for file ports that are used in straightforward ways, yet allows access to the underlying Unix substrate without interference from the garbage collector. This is critical, since shell programming absolutely requires access to the Unix file descriptors, as their numerical values are a critical part of the process interface.

A port's underlying file descriptor can be shifted around with dup(2) when convenient. That is, the actual file descriptor on top of which a port is constructed can be shifted around underneath the port by the scsh kernel when necessary. This is important, because when the user is setting up file descriptors prior to a exec(2), he may explicitly use a file descriptor that has already been allocated to some port. In this case, the scsh kernel just shifts the port's file descriptor to some new location with dup, freeing up its old descriptor. This prevents errors from happening in the following scenario. Suppose we have a file open on port f. Now we want to run a program that reads input on file 0, writes output to file 1, errors to file 2, and logs execution information on file 3. We want to run this program with input from f. So we write:

(run (/usr/shivers/bin/prog) (> 1 output.txt) (> 2 error.log) (> 3 trace.log) (= 0 ,f))

Now, suppose by ill chance that, unbeknownst to us, when the operating system opened f's file, it allocated descriptor 3 for it. If we blindly redirect trace.log into file descriptor 3, we'll clobber f! However, the port-shuffling machinery saves us: when the run form tries to dup trace.log's file descriptor to 3, dup will notice that file descriptor 3 is already associated with an unrevealed port (i.e., f). So, it will first move f to some other file descriptor. This keeps f alive and well so that it can subsequently be dup'd into descriptor 0 for prog's stdin.

The port-shifting machinery makes the following guarantee: a port is only moved when the underlying file descriptor is closed, either by a close() or a dup2() operation. Otherwise a port/file-descriptor association is stable.

Under normal circumstances, all this machinery just works behind the scenes to keep things straightened out. The only time the user has to think about it is when he starts accessing file descriptors from ports, which he should almost never have to do. If a user starts asking what file descriptors have been allocated to what ports, he has to take responsibility for managing this information.

3.2.5 Port-mapping machinery

The procedures provided in this section are almost never needed. You may safely skim or completely skip this section on a first reading.

Here are the routines for manipulating ports in scsh. The important points to remember are:

A file port is associated with an open file, not a particular file descriptor.
The association between a file port and a particular file descriptor is never changed except when the file descriptor is explicitly closed. ``Closing'' includes being used as the target of a dup2, so the set of procedures below that close their targets are close, two-argument dup, and move->fdes. If the target file descriptor of one of these routines has an allocated port, the port will be shifted to another freshly-allocated file descriptor, and marked as unrevealed, thus preserving the port but freeing its old file descriptor.

These rules are what is necessary to ``make things work out'' with no surprises in the general case.

(fdes->inport fd) ---> port (procedure)

(fdes->outport fd) ---> port (procedure)

(port->fdes port) ---> fixnum (procedure)

These increment the port's revealed count.

(port-revealed port) ---> integer or #f (procedure)

Return the port's revealed count if positive, otherwise #f.

(release-port-handle port) ---> undefined (procedure)

Decrement the port's revealed count.

(call/fdes fd/port consumer) ---> value(s) of consumer (procedure)

Calls consumer on a file descriptor; takes care of revealed bookkeeping. If fd/port is a file descriptor, this is just (consumer fd/port). If fd/port is a port, calls consumer on its underlying file descriptor. While consumer is running, the port's revealed count is incremented.
When call/fdes is called with port argument, you are not allowed to throw into consumer with a stored continuation, as that would violate the revealed-count bookkeeping.

(move->fdes fd/port target-fd) ---> port or fdes (procedure)

Maps fd-->fd and port-->port.
If fd/port is a file-descriptor not equal to target-fd, dup it to target-fd and close it. Returns target-fd.

If fd/port is a port, it is shifted to target-fd, by duping its underlying file-descriptor if necessary. Fd/port's original file descriptor is closed (if it was different from target-fd). Returns the port. This operation resets fd/port's revealed count to 1.

In all cases when fd/port is actually shifted, if there is a port already using target-fd, it is first relocated to some other file descriptor.

3.2.6 Unix I/O

(dup fd/port [newfd]) ---> fd/port (procedure)

(dup->inport fd/port [newfd]) ---> port (procedure)

(dup->outport fd/port [newfd]) ---> port (procedure)

(dup->fdes fd/port [newfd]) ---> fd (procedure)

These procedures provide the functionality of C's dup() and dup2(). The different routines return different types of values: dup->inport, dup->outport, and dup->fdes return input ports, output ports, and integer file descriptors, respectively. dup's return value depends on on the type of fd/port -- it maps fd-->fd and port-->port.
These procedures use the Unix dup() syscall to replicate the file descriptor or file port fd/port. If a newfd file descriptor is given, it is used as the target of the dup operation, i.e., the operation is a dup2(). In this case, procedures that return a port (such as dup->inport) will return one with the revealed count set to one. For example, (dup (current-input-port) 5) produces a new port with underlying file descriptor 5, whose revealed count is 1. If newfd is not specified, then the operating system chooses the file descriptor, and any returned port is marked as unrevealed.

If the newfd target is given, and some port is already using that file descriptor, the port is first quietly shifted (with another dup) to some other file descriptor (zeroing its revealed count).

Since Scheme doesn't provide read/write ports, dup->inport and dup->outport can be useful for getting an output version of an input port, or vice versa. For example, if p is an input port open on a tty, and we would like to do output to that tty, we can simply use (dup->outport p) to produce an equivalent output port for the tty. Be sure to open the file with the open/read+write flag for this.

(seek fd/port offset [whence]) ---> integer (procedure)

Reposition the I/O cursor for a file descriptor or port. whence is one of {seek/set, seek/delta, seek/end}, and defaults to seek/set. If seek/set, then offset is an absolute index into the file; if seek/delta, then offset is a relative offset from the current I/O cursor; if seek/end, then offset is a relative offset from the end of file. The fd/port argument may be a port or an integer file descriptor. Not all such values are seekable; this is dependent on the OS implementation. The return value is the resulting position of the I/O cursor in the I/O stream.
Oops: The current implementation doesn't handle offset arguments that are not immediate integers (i.e., representable in 30 bits).

Oops: The current implementation doesn't handle buffered ports.

(tell fd/port) ---> integer (procedure)

Returns the position of the I/O cursor in the the I/O stream. Not all file descriptors or ports support cursor-reporting; this is dependent on the OS implementation.

(open-file fname flags [perms]) ---> port (procedure)

Perms defaults to #o666. Flags is an integer bitmask, composed by or'ing together constants listed in table 1 (page 4). You must use exactly one of the open/read, open/write, or open/read+write flags. The returned port is an input port if the flags permit it, otherwise an output port. R5RS/Scheme 48/scsh do not have input/output ports, so it's one or the other. This should be fixed. (You can hack simultaneous I/O on a file by opening it r/w, taking the result input port, and duping it to an output port with dup->outport.)

(open-input-file fname [flags]) ---> port (procedure)

(open-output-file fname [flags perms]) ---> port (procedure)

These are equivalent to open-file, after first setting the read/write bits of the flags argument to open/read or open/write, respectively. Flags defaults to zero for open-input-file, and
(bitwise-ior open/create open/truncate)
for open-output-file. These defaults make the procedures backwards-compatible with their unary R5RS definitions.

(open-fdes fname flags [perms]) ---> integer (procedure)

Returns a file descriptor.

(fdes-flags fd/port) ---> integer (procedure)

(set-fdes-flags fd/port integer) ---> undefined (procedure)

These procedures allow reading and writing of an open file's flags. The only such flag defined by POSIX is fdflags/close-on-exec; your Unix implementation may provide others.
These procedures should not be particularly useful to the programmer, as the scsh runtime already provides automatic control of the close-on-exec property. Unrevealed ports always have their file descriptors marked close-on-exec, as they can be closed when the scsh process execs a new program. Whenever the user reveals or unreveals a port's file descriptor, the runtime automatically sets or clears the flag for the programmer. Programmers that manipulate this flag should be aware of these extra, automatic operations.

(fdes-status fd/port) ---> integer (procedure)

(set-fdes-status fd/port integer) ---> undefined (procedure)

These procedures allow reading and writing of an open file's status flags (table 1).

Allowed operations Status flag

Open+Get+Set These flags can be used in open-file, fdes-status, and set-fdes-status calls.
open/append

open/non-blocking

open/async (Non-POSIX)

open/fsync (Non-POSIX)

Open+Get These flags can be used in open-file and fdes-status calls, but are ignored by set-fdes-status.
open/read

open/write

open/read+write

open/access-mask

Open These flags are only relevant in open-file calls; they are ignored by fdes-status and set-fdes-status calls.
open/create

open/exclusive

open/no-control-tty

open/truncate

Table 1: Status flags for open-file, fdes-status and set-fdes-status. Only POSIX flags are guaranteed to be present; your operating system may define others. The open/access-mask value is not an actual flag, but a bit mask used to select the field for the open/read, open/write and open/read+write bits.

Note that this file-descriptor state is shared between file descriptors created by dup -- if you create port b by applying dup to port a, and change b's status flags, you will also have changed a's status flags.

(pipe) ---> [rport wport] (procedure)

Returns two ports, the read and write end-points of a Unix pipe.

(read-string nbytes [fd/port]) ---> string or #f (procedure)

(read-string! str [fd/port start end]) ---> nread or #f (procedure)

These calls read exactly as much data as you requested, unless there is not enough data (eof). read-string! reads the data into string str at the indices in the half-open interval [start,end); the default interval is the whole string: start = 0 and end = (string-length string). They will persistently retry on partial reads and when interrupted until (1) error, (2) eof, or (3) the input request is completely satisfied. Partial reads can occur when reading from an intermittent source, such as a pipe or tty.
read-string returns the string read; read-string! returns the number of characters read. They both return false at eof. A request to read zero bytes returns immediately, with no eof check.

The values of start and end must specify a well-defined interval in str, i.e., 0 < start < end < (string-length str).

Any partially-read data is included in the error exception packet. Error returns on non-blocking input are considered an error.

(read-string/partial nbytes [fd/port]) ---> string or #f (procedure)

(read-string!/partial str [fd/port start end]) ---> nread or #f (procedure)

These are atomic best-effort/forward-progress calls. Best effort: they may read less than you request if there is a lesser amount of data immediately available (e.g., because you are reading from a pipe or a tty). Forward progress: if no data is immediately available (e.g., empty pipe), they will block. Therefore, if you request an n>0 byte read, while you may not get everything you asked for, you will always get something (barring eof).
There is one case in which the forward-progress guarantee is cancelled: when the programmer explicitly sets the port to non-blocking I/O. In this case, if no data is immediately available, the procedure will not block, but will immediately return a zero-byte read.

read-string/partial reads the data into a freshly allocated string, which it returns as its value. read-string!/partial reads the data into string str at the indices in the half-open interval [start,end); the default interval is the whole string: start = 0 and end = (string-length string). The values of start and end must specify a well-defined interval in str, i.e., 0 < start < end < (string-length str). It returns the number of bytes read.

A request to read zero bytes returns immediatedly, with no eof check.

In sum, there are only three ways you can get a zero-byte read: (1) you request one, (2) you turn on non-blocking I/O, or (3) you try to read at eof.

These are the routines to use for non-blocking input. They are also useful when you wish to efficiently process data in large blocks, and your algorithm is insensitive to the block size of any particular read operation.

(select rvec wvec evec [timeout]) ---> [rvec' wvec' evec'] (procedure)

(select! rvec wvec evec [timeout]) ---> [nr nw ne] (procedure)

The select procedure allows a process to block and wait for events on multiple I/O channels. The rvec and evec arguments are vectors of input ports and integer file descriptors; wvec is a vector of output ports and integer file descriptors. The procedure returns three vectors whose elements are subsets of the corresponding arguments. Every element of rvec' is ready for input; every element of wvec' is ready for output; every element of evec' has an exceptional condition pending.
The select call will block until at least one of the I/O channels passed to it is ready for operation. For an input port this means that it either has data sitting its buffer or that the underlying file descriptor has data waiting. For an output port this means that it either has space available in the associated buffer or that the underlying file descriptor can accept output. For file descriptors, no buffers are checked, even if they have associated ports.

The timeout value can be used to force the call to time-out after a given number of seconds. It defaults to the special value #f, meaning wait indefinitely. A zero value can be used to poll the I/O channels.

If an I/O channel appears more than once in a given vector -- perhaps occuring once as a Scheme port, and once as the port's underlying integer file descriptor -- only one of these two references may appear in the returned vector. Buffered I/O ports are handled specially -- if an input port's buffer is not empty, or an output port's buffer is not yet full, then these ports are immediately considered eligible for I/O without using the actual, primitive select system call to check the underlying file descriptor. This works pretty well for buffered input ports, but is a little problematic for buffered output ports.

The select! procedure is similar, but indicates the subset of active I/O channels by side-effecting the argument vectors. Non-active I/O channels in the argument vectors are overwritten with #f values. The call returns the number of active elements remaining in each vector. As a convenience, the vectors passed in to select! are allowed to contain #f values as well as integers and ports.

Remark: Select and select! do not call their POSIX counterparts directly -- there is a POSIX select sitting at the very heart of the Scheme 48/scsh I/O system, so all multiplexed I/O is really select-based. Therefore, you cannot expect a performance increase from writing a single-threaded program using select and select! instead of writing a multi-threaded program where each thread handles one I/O connection.
The moral of this story is that select and select! make sense in only two situations: legacy code written for an older version of scsh, and programs which make inherent use of select/select! which do not benefit from multiple threads. Examples are network clients that send requests to multiple alternate servers and discard all but one of them.

In any case, the select-ports and select-port-channels procedures described below are usually a preferable alternative to select/select!: they are much simpler to use, and also have a slightly more efficient implementation.

(select-ports timeout port ...) ---> ready-ports (procedure)

The select-ports call will block until at least one of the ports passed to it is ready for operation or until the timeout has expired. For an input port this means that it either has data sitting its buffer or that the underlying file descriptor has data waiting. For an output port this means that it either has space available in the associated buffer or that the underlying file descriptor can accept output.
The timeout value can be used to force the call to time out after a given number of seconds. A value of #f means to wait indefinitely. A zero value can be used to poll the ports.

Select-ports returns a list of the ports ready for operation. Note that this list may be empty if the timeout expired before any ports became ready.

(select-port-channels timeout port ...) ---> ready-ports (procedure)

Select-port-channels is like select-ports, except that it only looks at the operating system objects the ports refer to, ignoring any buffering performed by the ports.

Remark: Select-port-channels should be used with care: for example, if an input port has data in the buffer but no data available on the underlying file descriptor, select-port-channels will block, even though a read operation on the port would be able to complete without blocking.
Select-port-channels is intended for situations where the program is not checking for available data, but rather for waiting until a port has established a connection -- for example, to a network port.

(write-string string [fd/port start end]) ---> undefined (procedure)

This procedure writes all the data requested. If the procedure cannot perform the write with a single kernel call (due to interrupts or partial writes), it will perform multiple write operations until all the data is written or an error has occurred. A non-blocking I/O error is considered an error. (Error exception packets for this syscall include the amount of data partially transferred before the error occurred.)
The data written are the characters of string in the half-open interval [start,end). The default interval is the whole string: start = 0 and end = (string-length string). The values of start and end must specify a well-defined interval in str, i.e., 0 < start < end < (string-length str). A zero-byte write returns immediately, with no error.

Output to buffered ports: write-string's efforts end as soon as all the data has been placed in the output buffer. Errors and true output may not happen until a later time, of course.

(write-string/partial string [fd/port start end]) ---> nwritten (procedure)

This routine is the atomic best-effort/forward-progress analog to write-string. It returns the number of bytes written, which may be less than you asked for. Partial writes can occur when (1) we write off the physical end of the media, (2) the write is interrrupted, or (3) the file descriptor is set for non-blocking I/O.
If the file descriptor is not set up for non-blocking I/O, then a successful return from these procedures makes a forward progress guarantee -- that is, a partial write took place of at least one byte:

If we are at the end of physical media, and no write takes place, an error exception is raised. So a return implies we wrote something.

If the call is interrupted after a partial transfer, it returns immediately. But if the call is interrupted before any data transfer, then the write is retried.

If we request a zero-byte write, then the call immediately returns 0. If the file descriptor is set for non-blocking I/O, then the call may return 0 if it was unable to immediately write anything (e.g., full pipe). Barring these two cases, a write either returns nwritten > 0, or raises an error exception.

Non-blocking I/O is only available on file descriptors and unbuffered ports. Doing non-blocking I/O to a buffered port is not well-defined, and is an error (the problem is the subsequent flush operation).

Oops: write-string/partial is currently not implemented. Consider using threads to achive the same functionality.

3.2.7 Buffered I/O

Scheme 48 ports use buffered I/O -- data is transferred to or from the OS in blocks. Scsh provides control of this mechanism: the programmer may force saved-up output data to be transferred to the OS when he chooses, and may also choose which I/O buffering policy to employ for a given port (or turn buffering off completely).

It can be useful to turn I/O buffering off in some cases, for example when an I/O stream is to be shared by multiple subprocesses. For this reason, scsh allocates an unbuffered port for file descriptor 0 at start-up time. Because shells frequently share stdin with subprocesses, if the shell does buffered reads, it might ``steal'' input intended for a subprocess. For this reason, all shells, including sh, csh, and scsh, read stdin unbuffered. Applications that can tolerate buffered input on stdin can reset (current-input-port) to block buffering for higher performance.

{Note So support peek-char a Scheme implementation has to maintain a buffer for all input ports. In scsh, for ``unbuffered'' input ports the buffer size is one. As you cannot request less then one character there is no unrequested reading so this can still be called ``unbuffered input''.}

(set-port-buffering port policy [size]) ---> undefined (procedure)

This procedure allows the programmer to assign a particular I/O buffering policy to a port, and to choose the size of the associated buffer. It may only be used on new ports, i.e., before I/O is performed on the port. There are three buffering policies that may be chosen:

bufpol/block General block buffering (general default)

bufpol/line Line buffering (tty default)

bufpol/none Direct I/O -- no buffering ⁴

The line buffering policy flushes output whenever a newline is output; whenever the buffer is full; or whenever an input is read from stdin. Line buffering is the default for ports open on terminal devices.
Oops: The current implementation doesn't support bufpol/line.
The size argument requests an I/O buffer of size bytes. For output ports, size must be non-negative, for input ports size must be positve. If not given, a reasonable default is used. For output ports, if given and zero, buffering is turned off (i.e., size = 0 for any policy is equivalent to policy = bufpol/none). For input ports, setting the size to one corresponds to unbuffered input as defined above. If given, size must be zero respectively one for bufpol/none.

(force-output [fd/port]) ---> undefined (procedure)

This procedure does nothing when applied to an integer file descriptor or unbuffered port. It flushes buffered output when applied to a buffered port, and raises a write-error exception on error. Returns no value.

(flush-all-ports) ---> undefined (procedure)

This procedure flushes all open output ports with buffered data.

3.2.8 File locking

Scsh provides POSIX advisory file locking. Advisory locks are locks that can be checked by user code, but do not affect other I/O operations. For example, if a process has an exclusive lock on a region of a file, other processes will not be able to obtain locks on that region of the file, but they will still be able to read and write the file with no hindrance. Using advisory locks requires cooperation amongst the agents accessing the shared resource.

Remark: Unfortunately, POSIX file locks are associated with actual files, not with associated open file descriptors. Once a process locks a file, using some file descriptor fd, the next time any file descriptor referencing that file is closed, all associated locks are released. This severely limits the utility of POSIX advisory file locks, and we'd recommend caution when using them. It is not without reason that the FreeBSD man pages refer to POSIX file locking as ``completely stupid.''
Scsh moves Scheme ports from file descriptor to file descriptor with dup() and close() as required by the runtime, so it is impossible to keep file locks open across one of these shifts. Hence we can only offer POSIX advisory file locking directly on raw integer file descriptors; regrettably, there are no facilities for locking Scheme ports.

Note that once a Scheme port is revealed in scsh, the runtime will not shift the port around with dup() and close(). This means the file-locking procedures can then be applied to the port's associated file descriptor.

POSIX allows the user to lock a region of a file with either an exclusive or shared lock. Locked regions are described by the lock-region record:

(define-record lock-region exclusive? start len whence proc)

The exclusive? field is true if the lock is exclusive; false if it is shared.
The whence field is one of the values from the seek call: seek/set, seek/delta, or seek/end, and determines the interpretation of the start field:
- If seek/set, the start value is simply an absolute index into the file.
- If seek/delta, the start value is an offset from the file descriptor's current position in the file.
- If seek/end, the start value is an offset from the end of the file.
The region of the file being locked is given by the start and len fields; if len is zero, it means ``infinity,'' that is, the region extends from the starting point through the end of the file, even as the file is extended by subsequent write operations.
The proc field gives the process object for the process holding the region lock, when relevant (see get-lock-region below).

(make-lock-region exclusive? start len [whence]) ---> lock-region (procedure)

This procedure makes a lock-region record. The whence field defaults to seek/set.

(lock-region fdes lock) ---> undefined (procedure)

(lock-region/no-block fdes lock) ---> boolean (procedure)

These procedures lock a region of the file referenced by file descriptor fdes. The lock-region procedure blocks until the lock is granted; the non-blocking variant returns a boolean indicating whether or not the lock was granted. To take an exclusive (write) lock, you must have the file descriptor open with write access; to take a shared (read) lock, you must have the file descriptor open with read access.

(get-lock-region fdes lock) ---> lock-region or #f (procedure)

Return the first lock region on fdes that would conflict with lock region lock. If there is no such lock region, return false. This procedure fills out the proc field of the returned lock region, and is the only procedure that has anything to do with this field. (See section 3.4.1 for a description of process objects.) Note that if you apply this procedure to a file system that is shared across multiple operating systems (i.e., an NFS file system), the proc field may be ambiguous. We note, again, that POSIX advisory file locking is not a terribly useful or well-designed facility.

(unlock-region fdes lock) ---> undefined (procedure)

Release a lock from a file.

(with-region-lock* fdes lock thunk) ---> value(s) of thunk (procedure)

(with-region-lock fdes lock body ...) ---> value(s) of body (syntax)

This procedure obtains the requested lock, and then calls (thunk). When thunk returns, the lock is released. A non-local exit (e.g., throwing to a saved continuation or raising an exception) also causes the lock to be released.
After a normal return from thunk, its return values are returned by with-region-lock*. The with-region-lock special form is equivalent syntactic sugar.

3.3 File system

Besides the following procedures, which allow access to the computer's file system, scsh also provides a set of procedures which manipulate file names. These string-processing procedures are documented in section 5.1.

(create-directory fname [perms override?]) ---> undefined (procedure)

(create-fifo fname [perms override?]) ---> undefined (procedure)

(create-hard-link oldname newname [override?]) ---> undefined (procedure)

(create-symlink old-name new-name [override?]) ---> undefined (procedure)

These procedures create objects of various kinds in the file system.

The override? argument controls the action if there is already an object in the file system with the new name:

#f signal an error (default)

'query prompt the user

other delete the old object (with delete-file or delete-directory, as appropriate) before creating the new object.

Perms defaults to #o777 (but is masked by the current umask).

Remark: Currently, if you try to create a hard or symbolic link from a file to itself, you will error out with override? false, and simply delete your file with override? true. Catching this will require some sort of true-name procedure, which I currently do not have.

(delete-directory fname) ---> undefined (procedure)

(delete-file fname) ---> undefined (procedure)

(delete-filesys-object fname) ---> undefined (procedure)

These procedures delete objects from the file system. The delete-filesys-object procedure will delete an object of any type from the file system: files, (empty) directories, symlinks, fifos, etc..
If the object being deleted doesn't exist, delete-directory and delete-file raise an error, while delete-filesys-object simply returns.

(read-symlink fname) ---> string (procedure)

Return the filename referenced by symbolic link fname.

(rename-file old-fname new-fname [override?]) ---> undefined (procedure)

If you override an existing object, then old-fname and new-fname must type-match -- either both directories, or both non-directories. This is required by the semantics of Unix rename().

Remark: There is an unfortunate atomicity problem with the rename-file procedure: if you specify no-override, but create file new-fname sometime between rename-file's existence check and the actual rename operation, your file will be clobbered with old-fname. There is no way to fix this problem, given the semantics of Unix rename(); at least it is highly unlikely to occur in practice.

(set-file-mode fname/fd/port mode) ---> undefined (procedure)

(set-file-owner fname/fd/port uid) ---> undefined (procedure)

(set-file-group fname/fd/port gid) ---> undefined (procedure)

These procedures set the permission bits, owner id, and group id of a file, respectively. The file can be specified by giving the file name, or either an integer file descriptor or a port open on the file. Setting file user ownership usually requires root privileges.

(set-file-times fname [access-time mod-time]) ---> undefined (procedure)

This procedure sets the access and modified times for the file fname to the supplied values (see section 3.10 for the scsh representation of time). If neither time argument is supplied, they are both taken to be the current time. You must provide both times or neither. If the procedure completes successfully, the file's time of last status-change (ctime) is set to the current time.

(sync-file fd/port) ---> undefined (procedure)

(sync-file-system) ---> undefined (procedure)

Calling sync-file causes Unix to update the disk data structures for a given file. If fd/port is a port, any buffered data it may have is first flushed. Calling sync-file-system synchronises the kernel's entire file system with the disk.
These procedures are not POSIX. Interestingly enough, sync-file-system doesn't actually do what it is claimed to do. We just threw it in for humor value. See the sync(2) man page for Unix enlightenment.

(truncate-file fname/fd/port len) ---> undefined (procedure)

The specified file is truncated to len bytes in length.

(file-info fname/fd/port [chase?]) ---> file-info-record (procedure)

The file-info procedure returns a record structure containing everything there is to know about a file. If the chase? flag is true (the default), then the procedure chases symlinks and reports on the files to which they refer. If chase? is false, then the procedure checks the actual file itself, even if it's a symlink. The chase? flag is ignored if the file argument is a file descriptor or port.
The value returned is a file-info record, defined to have the following structure:

(define-record file-info type ; {block-special, char-special, directory, ; fifo, regular, socket, symlink} device ; Device file resides on. inode ; File's inode. mode ; File's mode bits: permissions, setuid, setgid nlinks ; Number of hard links to this file. uid ; Owner of file. gid ; File's group id. size ; Size of file, in bytes. atime ; Time of last access. mtime ; Time of last mod. ctime) ; Time of last status change.
The uid field of a file-info record is accessed with the procedure
(file-info:uid x)
and similarly for the other fields. The type field is a symbol; all other fields are integers. A file-info record is discriminated with the file-info? predicate.
The following procedures all return selected information about a file; they are built on top of file-info, and are called with the same arguments that are passed to it.

Procedure returns

file-type type

file-inode inode

file-mode mode

file-nlinks nlinks

file-owner uid

file-group gid

file-size size

file-last-access atime

file-last-mod mtime

file-last-status-change ctime

Example:
;; All my files in /usr/tmp: (filter (lambda (f) (= (file-owner f) (user-uid))) (directory-files "/usr/tmp")))

Remark: file-info was named file-attributes in releases of scsh prior to release 0.4. We changed the name to file-info for consistency with the other information-retrieval procedures in scsh: user-info, group-info, host-info, network-info , service-info, and protocol-info.
The file-attributes binding is still supported in the current release of scsh, but is deprecated, and may go away in a future release.

(file-directory? fname/fd/port [chase?]) ---> boolean (procedure)

(file-fifo? fname/fd/port [chase?]) ---> boolean (procedure)

(file-regular? fname/fd/port [chase?]) ---> boolean (procedure)

(file-socket? fname/fd/port [chase?]) ---> boolean (procedure)

(file-special? fname/fd/port [chase?]) ---> boolean (procedure)

(file-symlink? fname/fd/port) ---> boolean (procedure)

These procedures are file-type predicates that test the type of a given file. They are applied to the same arguments to which file-info is applied; the sole exception is file-symlink?, which does not take the optional chase? second argument.

For example,
(file-directory? "/usr/dalbertz") ===> #t

There are variants of these procedures which work directly on file-info records:

(file-info-directory? file-info) ---> boolean (procedure)

(file-info-fifo? file-info) ---> boolean (procedure)

(file-info-regular? file-info) ---> boolean (procedure)

(file-info-socket? file-info) ---> boolean (procedure)

(file-info-special? file-info) ---> boolean (procedure)

(file-info-symlink? file-info) ---> boolean (procedure)

The following set of procedures are a convenient means to work on the permission bits of a file:

(file-not-readable? fname/fd/port) ---> boolean (procedure)

(file-not-writable? fname/fd/port) ---> boolean (procedure)

(file-not-executable? fname/fd/port) ---> boolean (procedure)

Returns:
Value meaning

#f Access permitted

'search-denied
Can't stat -- a protected directory

is blocking access.

'permission Permission denied.

'no-directory Some directory doesn't exist.

'nonexistent File doesn't exist.
A file is considered writeable if either (1) it exists and is writeable or (2) it doesn't exist and the directory is writeable. Since symlink permission bits are ignored by the filesystem, these calls do not take a chase? flag.
Note that these procedures use the process' effective user and group ids for permission checking. POSIX defines an access() function that uses the process' real uid and gids. This is handy for setuid programs that would like to find out if the actual user has specific rights; scsh ought to provide this functionality (but doesn't at the current time).

There are several problems with these procedures. First, there's an atomicity issue. In between checking permissions for a file and then trying an operation on the file, another process could change the permissions, so a return value from these functions guarantees nothing. Second, the code special-cases permission checking when the uid is root -- if the file exists, root is assumed to have the requested permission. However, not even root can write a file that is on a read-only file system, such as a CD ROM. In this case, file-not-writable? will lie, saying that root has write access, when in fact the opening the file for write access will fail. Finally, write permission confounds write access and create access. These should be disentangled.

Some of these problems could be avoided if POSIX had a real-uid variant of the access() call we could use, but the atomicity issue is still a problem. In the final analysis, the only way to find out if you have the right to perform an operation on a file is to try and open it for the desired operation. These permission-checking functions are mostly intended for script-writing, where loose guarantees are tolerated.

(file-readable? fname/fd/port) ---> boolean (procedure)

(file-writable? fname/fd/port) ---> boolean (procedure)

(file-executable? fname/fd/port) ---> boolean (procedure)

These procedures are the logical negation of the preceding file-not-...? procedures. Refer to them for a discussion of their problems and limitations.

(file-info-not-readable? file-info) ---> boolean (procedure)

(file-info-not-writable? file-info) ---> boolean (procedure)

(file-info-not-executable? file-info) ---> boolean (procedure)

(file-info-readable? file-info) ---> boolean (procedure)

(file-info-writable? file-info) ---> boolean (procedure)

(file-info-executable? file-info) ---> boolean (procedure)

There are variants which work directly on file-info records.

(file-not-exists? fname/fd/port [chase?]) ---> object (procedure)

Returns:
#f Exists.

#t Doesn't exist.

'search-denied Some protected directory is blocking the search.

(file-exists? fname/fd/port [chase?]) ---> boolean (procedure)

This is simply (not (file-not-exists? fname [chase?]))

(directory-files [dir dotfiles?]) ---> string list (procedure)

Return the list of files in directory dir, which defaults to the current working directory. The dotfiles? flag (default #f) causes dot files to be included in the list. Regardless of the value of dotfiles?, the two files . and .. are never returned.
The directory dir is not prepended to each file name in the result list. That is,

(directory-files "/etc")
returns
("chown" "exports" "fstab" ...)
not
("/etc/chown" "/etc/exports" "/etc/fstab" ...)
To use the files in returned list, the programmer can either manually prepend the directory:
(map (lambda (f) (string-append dir "/" f)) files)
or cd to the directory before using the file names:
(with-cwd dir (for-each delete-file (directory-files)))
or use the glob procedure, defined below.
A directory list can be generated by (run/strings (ls)), but this is unreliable, as filenames with whitespace in their names will be split into separate entries. Using directory-files is reliable.

(open-directory-stream dir) ---> directory-stream-record (procedure)

(read-directory-stream directory-stream-record) ---> string or #f (procedure)

(close-directory-stream directory-stream-record) ---> undefined (procedure)

These functions implement a direct interface to the opendir()/ readdir()/ closedir() family of functions for processing directory streams. (open-directory-stream dir) creates a stream of files in the directory dir. (read-directory-stream directory-stream) returns the next file in the stream or #fif no such file exists. Finally, (close-directory-stream directory-stream) closes the stream.

(glob pat₁ ...) ---> string list (procedure)

Glob each pattern against the filesystem and return the sorted list. Duplicates are not removed. Patterns matching nothing are not included literally.⁵ C shell {a,b,c} patterns are expanded. Backslash quotes characters, turning off the special meaning of {, }, *, [, ], and ?.
Note that the rules of backslash for Scheme strings and glob patterns work together to require four backslashes in a row to specify a single literal backslash. Fortunately, it is very rare that a backslash occurs in a Unix file name.

A glob subpattern will not match against dot files unless the first character of the subpattern is a literal ``.''. Further, a dot subpattern will not match the files . or .. unless it is a constant pattern, as in (glob "../*/*.c"). So a directory's dot files can be reliably generated with the simple glob pattern ".*".

Some examples:
(glob "*.c" "*.h")
    ;; All the C and #include files in my directory.

(glob "*.c" "*/*.c")
    ;; All the C files in this directory and 
    ;; its immediate subdirectories.

(glob "lexer/*.c" "parser/*.c")
(glob "{lexer,parser}/*.c")
    ;; All the C files in the lexer and parser dirs.

(glob "\\{lexer,parser\\}/*.c")
    ;; All the C files in the strange 
    ;; directory "{lexer,parser}".

(glob "*\\*")
    ;; All the files ending in "*", e.g.
    ;; ("foo*" "bar*")         

(glob "*lexer*")
    ("mylexer.c" "lexer1.notes") 
    ;; All files containing the string "lexer".

(glob "lexer")
    ;; Either ("lexer") or ().
If the first character of the pattern (after expanding braces) is a slash, the search begins at root; otherwise, the search begins in the current working directory.
If the last character of the pattern (after expanding braces) is a slash, then the result matches must be directories, e.g.,

(glob "/usr/man/man?/") ==> ("/usr/man/man1/" "/usr/man/man2/" ...)
Globbing can sometimes be useful when we need a list of a directory's files where each element in the list includes the pathname for the file. Compare:

(directory-files "../include") ==> ("cig.h" "decls.h" ...) (glob "../include/*") ==> ("../include/cig.h" "../include/decls.h" ...)

(glob-quote str) ---> string (procedure)

Returns a constant glob pattern that exactly matches str. All wild-card characters in str are quoted with a backslash.
(glob-quote "Any *.c files?") ==> "Any \*.c files\?"

(file-match root dot-files? pat₁ pat₂ ... pat_n) ---> string list (procedure)

{Note This procedure is deprecated, and will probably either go away or be substantially altered in a future release. New code should not call this procedure. The problem is that it relies upon Posix-notation regular expressions; the rest of scsh has been converted over to the new SRE notation.}
file-match provides a more powerful file-matching service, at the expense of a less convenient notation. It is intermediate in power between most shell matching machinery and recursive find(1).

Each pattern is a regexp. The procedure searches from root, matching the first-level files against pattern pat₁, the second-level files against pat₂, and so forth. The list of files matching the whole path pattern is returned, in sorted order. The matcher uses Spencer's regular expression package.

The files . and .. are never matched. Other dot files are only matched if the dot-files? argument is #t.

A given pat_i pattern is matched as a regexp, so it is not forced to match the entire file name. E.g., pattern "t" matches any file containing a ``t'' in its name, while pattern "^t$" matches only a file whose entire name is ``t''.

The pat_i patterns can be more general than stated above.

A single pattern can specify multiple levels of the path by embedding / characters within the pattern. For example, the pattern "a/b/c" gives a match equivalent to the list of patterns "a" "b" "c".

A pat_i pattern can be a procedure, which is used as a match predicate. It will be repeatedly called with a candidate file-name to test. The file-name will be the entire path accumulated. If the procedure raises an error condition, file-match will catch the error and treat it as a failed match. This keeps file-match from being blown out of the water by applying tests to dangling symlinks and other similar situations.

Some examples:

(file-match "/usr/lib" #f "m$" "^tab") ==> ("/usr/lib/term/tab300" "/usr/lib/term/tab300-12" ...) [0] (file-match "." #f "^lex|parse|codegen$" "\\.c$") ==> ("lex/lex.c" "lex/lexinit.c" "lex/test.c" "parse/actions.c" "parse/error.c" parse/test.c" "codegen/io.c" "codegen/walk.c") [0] (file-match "." #f "^lex|parse|codegen$/\\.c$") ;; The same. [0] (file-match "." #f file-directory?) ;; Return all subdirs of the current directory. [0] (file-match "/" #f file-directory?) ==> ("/bin" "/dev" "/etc" "/tmp" "/usr") ;; All subdirs of root. [0] (file-match "." #f "\\.c") ;; All the C files in my directory. [0] (define (ext extension) (lambda (fn) (string-suffix? fn extension))) [0] (define (true . x) #t) [0] (file-match "." #f "./\\.c") (file-match "." #f "" "\\.c") (file-match "." #f true "\\.c") (file-match "." #f true (ext "c")) ;; All the C files of all my immediate subdirs. [0] (file-match "." #f "lexer") ==> ("mylexer.c" "lexer.notes") ;; Compare with (glob "lexer"), above.

Note that when root is the current working directory ("."), when it is converted to directory form, it becomes "", and doesn't show up in the result file-names.

It is regrettable that the regexp wild card char, ``.'', is such an important file name literal, as dot-file prefix and extension delimiter.

(create-temp-file [prefix]) ---> string (procedure)

Create-temp-file creates a new temporary file and return its name. The optional argument specifies the filename prefix to use, and defaults to the value of "$TMPDIR/pid" if $TMPDIR is set and to "/var/tmp/pid" otherwise, where pid is the current process' id. The procedure generates a sequence of filenames that have prefix as a common prefix, looking for a filename that doesn't already exist in the file system. When it finds one, it creates it, with permission #o600 and returns the filename. (The file permission can be changed to a more permissive permission with set-file-mode after being created).
This file is guaranteed to be brand new. No other process will have it open. This procedure does not simply return a filename that is very likely to be unused. It returns a filename that definitely did not exist at the moment create-temp-file created it.

It is not necessary for the process' pid to be a part of the filename for the uniqueness guarantees to hold. The pid component of the default prefix simply serves to scatter the name searches into sparse regions, so that collisions are less likely to occur. This speeds things up, but does not affect correctness.

Security note: doing I/O to files created this way in /var/tmp/ is not necessarily secure. General users have write access to /var/tmp/, so even if an attacker cannot access the new temp file, he can delete it and replace it with one of his own. A subsequent open of this filename will then give you his file, to which he has access rights. There are several ways to defeat this attack,

Use temp-file-iterate, below, to return the file descriptor allocated when the file is opened. This will work if the file only needs to be opened once.

If the file needs to be opened twice or more, create it in a protected directory, e.g., $HOME.

Ensure that /var/tmp has its sticky bit set. This requires system administrator privileges.

The actual default prefix used is controlled by the dynamic variable *temp-file-template*, and can be overridden for increased security. See temp-file-iterate.

(temp-file-iterate maker [template]) ---> object⁺ (procedure)

*temp-file-template* string

This procedure can be used to perform certain atomic transactions on the file system involving filenames. Some examples:

Linking a file to a fresh backup temp name.

Creating and opening an unused, secure temp file.

Creating an unused temporary directory.

This procedure uses template to generate a series of trial file names. Template is a format control string, and defaults to

"$TMPDIR/pid.~a"
if $TMPDIR is set and
"/var/tmp/pid.~a"
otherwise where pid is the current process' process id. File names are generated by calling format to instantiate the template's ~a field with a varying string.
Maker is a procedure which is serially called on each file name generated. It must return at least one value; it may return multiple values. If the first return value is #f or if maker raises the errno/exist errno exception, temp-file-iterate will loop, generating a new file name and calling maker again. If the first return value is true, the loop is terminated, returning whatever value(s) maker returned.

After a number of unsuccessful trials, temp-file-iterate may give up and signal an error.

Thus, if we ignore its optional prefix argument, create-temp-file could be defined as:

(define (create-temp-file)   (let ((flags (bitwise-ior open/create open/exclusive)))     (temp-file-iterate         (lambda (f)           (close (open-output-file f flags #o600))           f))))
To rename a file to a temporary name:

(temp-file-iterate (lambda (backup)                      (create-hard-link old-file backup)                      backup)                    ".#temp.~a") ; Keep link in cwd. (delete-file old-file)
Recall that scsh reports syscall failure by raising an error exception, not by returning an error code. This is critical to to this example -- the programmer can assume that if the temp-file-iterate call returns, it returns successully. So the following delete-file call can be reliably invoked, safe in the knowledge that the backup link has definitely been established.
To create a unique temporary directory:

(temp-file-iterate (lambda (dir) (create-directory dir) dir)                    "/var/tmp/tempdir.~a")
Similar operations can be used to generate unique symlinks and fifos, or to return values other than the new filename (e.g., an open file descriptor or port).
The default template is in fact taken from the value of the dynamic variable *temp-file-template*, which itself defaults to "$TMPDIR/pid.~a" if $TMPDIR is set and "/usr/tmp/pid.~a" otherwise, where pid is the scsh process' pid. For increased security, a user may wish to change the template to use a directory not allowing world write access (e.g., his home directory).

(temp-file-channel) ---> [inp outp] (procedure)

This procedure can be used to provide an interprocess communications channel with arbitrary-sized buffering. It returns two values, an input port and an output port, both open on a new temp file. The temp file itself is deleted from the Unix file tree before temp-file-channel returns, so the file is essentially unnamed, and its disk storage is reclaimed as soon as the two ports are closed.
Temp-file-channel is analogous to port-pipe with two exceptions:

If the writer process gets ahead of the reader process, it will not hang waiting for some small pipe buffer to drain. It will simply buffer the data on disk. This is good.

If the reader process gets ahead of the writer process, it will also not hang waiting for data from the writer process. It will simply see and report an end of file. This is bad.

In order to ensure that an end-of-file returned to the reader is legitimate, the reader and writer must serialise their I/O. The simplest way to do this is for the reader to delay doing input until the writer has completely finished doing output, or exited.

3.4 Processes

(exec prog arg₁ ...arg_n) ---> no return value (procedure)

(exec-path prog arg₁ ...arg_n) ---> no return value (procedure)

(exec/env prog env arg₁ ...arg_n) ---> no return value (procedure)

(exec-path/env prog env arg₁ ...arg_n) ---> no return value (procedure)

The .../env variants take an environment specified as a string-->string alist. An environment of #t is taken to mean the current process' environment (i.e., the value of the external char **environ).

[Rationale: #f is a more convenient marker for the current environment than #t, but would cause an ambiguity on Schemes that identify #f and ().]

The path-searching variants search the directories in the list exec-path-list for the program. A path-search is not performed if the program name contains a slash character -- it is used directly. So a program with a name like "bin/prog" always executes the program bin/prog in the current working directory. See $path and exec-path-list, below.

Note that there is no analog to the C function execv(). To get the effect just do

(apply exec prog arglist)
All of these procedures flush buffered output and close unrevealed ports before executing the new binary. To avoid flushing buffered output, see %exec below.

Note that the C exec() procedure allows the zeroth element of the argument vector to be different from the file being executed, e.g.
char *argv[] = {"-", "-f", 0};
exec("/bin/csh", argv, envp);
The scsh exec, exec-path, exec/env, and exec-path/env procedures do not give this functionality -- element 0 of the arg vector is always identical to the prog argument. In the rare case the user wishes to differentiate these two items, he can use the low-level %exec and exec-path-search procedures. These procedures never return under any circumstances. As with any other system call, if there is an error, they raise an exception.

(%exec prog arglist env) ---> undefined (procedure)

(exec-path-search fname pathlist) ---> string or #f (procedure)

The %exec procedure is the low-level interface to the system call. The arglist parameter is a list of arguments; env is either a string-->string alist or #t. The new program's argv[0] will be taken from (car arglist), not from prog. An environment of #t means the current process' environment. %exec does not flush buffered output (see flush-all-ports).
All exec procedures, including %exec, coerce the prog and arg values to strings using the usual conversion rules: numbers are converted to decimal numerals, and symbols converted to their print-names.

exec-path-search searches the directories of pathlist looking for an occurrence of file fname. If no executable file is found, it returns #f. If fname contains a slash character, the path search is short-circuited, but the procedure still checks to ensure that the file exists and is executable -- if not, it still returns #f. Users of this procedure should be aware that it invites a potential race condition: between checking the file with exec-path-search and executing it with %exec, the file's status might change. The only atomic way to do the search is to loop over the candidate file names, exec'ing each one and looping when the exec operation fails.

See $path and exec-path-list, below.

(exit [status]) ---> no return value (procedure)

(%exit [status]) ---> no return value (procedure)

These procedures terminate the current process with a given exit status. The default exit status is 0. The low-level %exit procedure immediately terminates the process without flushing buffered output.

(call-terminally thunk) ---> no return value (procedure)

call-terminally calls its thunk. When the thunk returns, the process exits. Although call-terminally could be implemented as
(lambda (thunk) (thunk) (exit 0))
an implementation can take advantage of the fact that this procedure never returns. For example, the runtime can start with a fresh stack and also start with a fresh dynamic environment, where shadowed bindings are discarded. This can allow the old stack and dynamic environment to be collected (assuming this data is not reachable through some live continuation).

(suspend) ---> undefined (procedure)

Suspend the current process with a SIGSTOP signal.

(fork [thunk or #f] [continue-threads?]) ---> proc or #f (procedure)

(%fork [thunk or #f] [continue-threads?]) ---> proc or #f (procedure)

fork with no arguments or #f instead of a thunk is like C fork(). In the parent process, it returns the child's process object (see below for more information on process objects). In the child process, it returns #f.
fork with an argument only returns in the parent process, returning the child's process object. The child process calls thunk and then exits.

fork flushes buffered output before forking, and sets the child process to non-interactive. %fork does not perform this bookkeeping; it simply forks.

The optional boolean argument continue-threads? specifies whether the currently active threads continue to run in the child or not. The default is #f.

(fork/pipe [thunk] [continue-threads?]) ---> proc or #f (procedure)

(%fork/pipe [thunk] [continue-threads?]) ---> proc or #f (procedure)

Like fork and %fork, but the parent and child communicate via a pipe connecting the parent's stdin to the child's stdout. These procedures side-effect the parent by changing his stdin.
In effect, fork/pipe splices a process into the data stream immediately upstream of the current process. This is the basic function for creating pipelines. Long pipelines are built by performing a sequence of fork/pipe calls. For example, to create a background two-process pipe a | b, we write:

(fork (lambda () (fork/pipe a) (b)))
which returns the process object for b's process.
To create a background three-process pipe a | b | c, we write:

(fork (lambda () (fork/pipe a) (fork/pipe b) (c)))
which returns the process object for c's process.
Note that these procedures affect file descriptors, not ports. That is, the pipe is allocated connecting the child's file descriptor 1 to the parent's file descriptor 0. Any previous Scheme port built over these affected file descriptors is shifted to a new, unused file descriptor with dup before allocating the I/O pipe. This means, for example, that the ports bound to (current-input-port) and (current-output-port) in either process are not affected -- they still refer to the same I/O sources and sinks as before. Remember the simple scsh rule: Scheme ports are bound to I/O sources and sinks, not particular file descriptors.

If the child process wishes to rebind the current output port to the pipe on file descriptor 1, it can do this using with-current-output-port or a related form. Similarly, if the parent wishes to change the current input port to the pipe on file descriptor 0, it can do this using set-current-input-port! or a related form. Here is an example showing how to set up the I/O ports on both sides of the pipe:

(fork/pipe (lambda () (with-current-output-port (fdes->outport 1) (display "Hello, world.\n")))) (set-current-input-port! (fdes->inport 0)) (read-line) ; Read the string output by the child.
None of this is necessary when the I/O is performed by an exec'd program in the child or parent process, only when the pipe will be referenced by Scheme code through one of the default current I/O ports.

(fork/pipe+ conns [thunk] [continue-threads?]) ---> proc or #f (procedure)

(%fork/pipe+ conns [thunk] [continue-threads?]) ---> proc or #f (procedure)

Like fork/pipe, but the pipe connections between the child and parent are specified by the connection list conns. See the
(|+ conns pf₁ ... pf_n)
process form for a description of connection lists.

3.4.1 Process objects and process reaping

Scsh uses process objects to represent Unix processes. They are created by the fork procedure, and have the following exposed structure:

(define-record proc pid)

The only exposed slot in a proc record is the process' pid, the integer id assigned by Unix to the process. The only exported primitive procedures for manipulating process objects are proc? and proc:pid. Process objects are created with the fork procedure.

(pid->proc pid [probe?]) ---> proc (procedure)

This procedure maps integer Unix process ids to scsh process objects. It is intended for use in interactive and debugging code, and is deprecated for use in production code. If there is no process object in the system indexed by the given pid, pid->proc's action is determined by the probe? parameter (default #f):

probe? Return

#f signal error condition.

'create Create new proc object.

True value #f

Sometime after a child process terminates, scsh will perform a wait system call on the child in background, caching the process' exit status in the child's proc object. This is called ``reaping'' the process. Once the child has been waited, the Unix kernel can free the storage allocated for the dead process' exit information, so process reaping prevents the process table from becoming cluttered with un-waited dead child processes (a.k.a. ``zombies''). This can be especially severe if the scsh process never waits on child processes at all; if the process table overflows with forgotten zombies, the OS may be unable to fork further processes.

Reaping a child process moves its exit status information from the kernel into the scsh process, where it is cached inside the child's process object. If the scsh user drops all pointers to the process object, it will simply be garbage collected. On the other hand, if the scsh program retains a pointer to the process object, it can use scsh's wait system call to synchronise with the child and retrieve its exit status multiple times (this is not possible with simple Unix integer pids in C -- the programmer can only wait on a pid once).

Thus, process objects allow scsh programmer to do two things not allowed in other programming environments:

Subprocesses that are never waited on are still removed from the process table, and their associated exit status data is eventually automatically garbage collected.
Subprocesses can be waited on multiple times.

However, note that once a child has exited, if the scsh programmer drops all pointers to the child's proc object, the child's exit status will be reaped and thrown away. This is the intended behaviour, and it means that integer pids are not enough to cause a process's exit status to be retained by the scsh runtime. (This is because it is clearly impossible to GC data referenced by integers.)

As a convenience for interactive use and debugging, all procedures that take process objects will also accept integer Unix pids as arguments, coercing them to the corresponding process objects. Since integer process ids are not reliable ways to keep a child's exit status from being reaped and garbage collected, programmers are encouraged to use process objects in production code.

(autoreap-policy [policy]) ---> old-policy (procedure)

The scsh programmer can choose different policies for automatic process reaping. The policy is determined by applying this procedure to one of the values 'early, 'late, or #f (i.e., no autoreap).

early
The child is reaped from the Unix kernel's process table into scsh as soon as it dies. This is done by having a signal handler for the SIGCHLD signal reap the process.

late
The child is not autoreaped until it dies and the scsh program drops all pointers to its process object. That is, the process table is cleaned out during garbage collection.

#f
If autoreaping is turned off, process reaping is completely under control of the programmer, who can force outstanding zombies to be reaped by manually calling the reap-zombies procedure (see below).
Note that under any of the autoreap policies, a particular process p can be manually reaped into scsh by simply calling (wait p). All zombies can be manually reaped with reap-zombies.

The autoreap-policy procedure returns the policy's previous value. Calling autoreap-policy with no arguments returns the current policy without no change.

(reap-zombies) ---> boolean (procedure)

This procedure reaps all outstanding exited child processes into scsh. It returns true if there are no more child processes to wait on, and false if there are outstanding processes still running or suspended.

3.4.1.1 Issues with process reaping

Reaping a process does not reveal its process group at the time of death; this information is lost when the process reaped. This means that a dead, reaped process is not eligible as a return value for a future wait-process-group call. This is not likely to be a problem for most code, as programs almost never wait on exited processes by process group. Process group waiting is usually applied to stopped processes, which are never reaped. So it is unlikely that this will be a problem for most programs.

Automatic process reaping is a useful programming convenience. However, if a program is careful to wait for all children, and does not wish automatic reaping to happen, the programmer can simply turn process autoreaping off.

Programs that do not wish to use automatic process reaping should be aware that some scsh routines create subprocesses but do not return the child's pid: run/port*, and its related procedures and special forms (run/strings, et al.). Automatic process reaping will clean the child processes created by these procedures out of the kernel's process table. If a program doesn't use process reaping, it should either avoid these forms, or use wait-any to wait for the children to exit.

3.4.2 Process waiting

(wait proc/pid [flags]) ---> status (procedure)

This procedure waits until a child process exits, and returns its exit code. The proc/pid argument is either a process object (section 3.4.1) or an integer process id. Wait returns the child's exit status code (or suspension code, if the wait/stopped-children option is used, see below). Status values can be queried with the procedures in section 3.4.3.
The flags argument is an integer whose bits specify additional options. It is composed by or'ing together the following flags:

Flag Meaning

wait/poll Return #f immediately if child still active.

wait/stopped-children Wait for suspend as well as exit.

(wait-any [flags]) ---> [proc status] (procedure)

The optional flags argument is as for wait. This procedure waits for any child process to exit (or stop, if the wait/stopped-children flag is used) It returns the process' process object and status code. If there are no children left for which to wait, the two values [#f #t] are returned. If the wait/poll flag is used, and none of the children are immediately eligble for waiting, then the values [#f #f] are returned:

[#f #f] Poll, none ready

[#f #t] No children

Wait-any will not return a process that has been previously waited by any other process-wait procedure (wait, wait-any, and wait-process-group). It will return reaped processes that haven't yet been waited.

The use of wait-any is deprecated.

(wait-process-group proc/pid [flags]) ---> [proc status] (procedure)

This procedure waits for any child whose process group is proc/pid (either a process object or a pid). The flags argument is as for wait.
Note that if the programmer wishes to wait for exited processes by process group, the program should take care not to use process reaping (section 3.4.1), as this loses process group information. However, most process-group waiting is for stopped processes (to implement job control), so this is rarely an issue, as stopped processes are not subject to reaping.

3.4.3 Analysing process status codes

When a child process dies (or is suspended), its parent can call the wait procedure to recover the exit (or suspension) status of the child. The exit status is a small integer that encodes information describing how the child terminated. The bit-level format of the exit status is not defined by POSIX; you must use the following three functions to decode one. However, if a child terminates normally with exit code 0, POSIX does require wait to return an exit status that is exactly zero. So (zero? status) is a correct way to test for non-error, normal termination, e.g.,

(if (zero? (run (rcp scsh.tar.gz lambda.csd.hku.hk:))) (delete-file "scsh.tar.gz"))

(status:exit-val status) ---> integer or #f (procedure)

(status:stop-sig status) ---> integer or #f (procedure)

(status:term-sig status) ---> integer or #f (procedure)

For a given status value produced by calling wait, exactly one of these routines will return a true value.
If the child process exited normally, status:exit-val returns the exit code for the child process (i.e., the value the child passed to exit or returned from main). Otherwise, this function returns false.

If the child process was suspended by a signal, status:stop-sig returns the signal that suspended the child. Otherwise, this function returns false.

If the child process terminated abnormally, status:term-sig returns the signal that terminated the child. Otherwise, this function returns false.

3.5 Process state

(umask) ---> fixnum (procedure)

(set-umask perms) ---> undefined (procedure)

(with-umask* perms thunk) ---> value(s) of thunk (procedure)

(with-umask perms . body) ---> value(s) of body (syntax)

The process' current umask is retrieved with umask, and set with (set-umask perms). Calling with-umask* changes the umask to perms for the duration of the call to thunk. If the program throws out of thunk by invoking a continuation, the umask is reset to its external value. If the program throws back into thunk by calling a stored continuation, the umask is restored to the perms value. The special form with-umask is equivalent in effect to the procedure with-umask*, but does not require the programmer to explicitly wrap a (lambda () ...) around the body of the code to be executed.

(chdir [fname]) ---> undefined (procedure)

(cwd) ---> string (procedure)

(with-cwd* fname thunk) ---> value(s) of thunk (procedure)

(with-cwd fname . body) ---> value(s) of body (syntax)

These forms manipulate the current working directory. The cwd can be changed with chdir (although in most cases, with-cwd is preferrable). If chdir is called with no arguments, it changes the cwd to the user's home directory. The with-cwd* procedure calls thunk with the cwd temporarily set to fname; when thunk returns, or is exited in a non-local fashion (e.g., by raising an exception or by invoking a continuation), the cwd is returned to its original value. The special form with-cwd is simply syntactic sugar for with-cwd*.

(pid) ---> fixnum (procedure)

(parent-pid) ---> fixnum (procedure)

(process-group) ---> fixnum (procedure)

(set-process-group [proc/pid] pgrp) ---> undefined (procedure)

(pid) and (parent-pid) retrieve the process id for the current process and its parent. (process-group) returns the process group of the current process. A process' process-group can be set with set-process-group; the value proc/pid specifies the affected process. It may be either a process object or an integer process id, and defaults to the current process.

(set-priority which who priority) ---> undefined (procedure)

(priority which who) ---> fixnum (procedure)

(nice [proc/pid delta]) ---> undefined (procedure)

These procedures set and access the priority of processes. I can't remember how set-priority and priority work, so no documentation, and besides, they aren't implemented yet, anyway.

(user-login-name) ---> string (procedure)

(user-uid) ---> fixnum (procedure)

(user-gid) ---> fixnum (procedure)

(user-supplementary-gids) ---> fixnum list (procedure)

(set-uid uid) ---> undefined (procedure)

(set-gid gid) ---> undefined (procedure)

These routines get and set the effective and real user and group ids. The set-uid and set-gid routines correspond to the POSIX setuid() and setgid() procedures.

(user-effective-uid) ---> fixnum (procedure)

(set-user-effective-uid fixnum) ---> undefined (procedure)

(with-user-effective-uid* fixnum thunk) ---> value(s) of thunk (procedure)

(with-user-effective-uid fixnum . body) ---> value(s) of body (syntax)

(user-effective-gid) ---> fixnum (procedure)

(set-user-effective-gid fixnum) ---> undefined (procedure)

(with-user-effective-gid* fixnum thunk) ---> value(s) of thunk (procedure)

(with-user-effective-gid fixnum . body) ---> value(s) of body (syntax)

These forms manipulate the effective user/group IDs. Possible values for setting this resource are either the real user/group ID or the saved set-user/group-ID. The with-... forms perform the ususal temprary assignment during the execution of the second argument. The effective user and group IDs are thread-local.

(process-times) ---> [fixnum fixnum fixnum fixnum] (procedure)

Returns four values:

user CPU time in clock-ticks
system CPU time in clock-ticks
user CPU time of all descendant processes
system CPU time of all descendant processes

Note that CPU time clock resolution is not the same as the real-time clock resolution provided by time+ticks. That's Unix.

(cpu-ticks/sec) ---> integer (procedure)

Returns the resolution of the CPU timer in clock ticks per second. This can be used to convert the times reported by process-times to seconds.

3.6 User and group database access

These procedures are used to access the user and group databases (e.g., the ones traditionally stored in /etc/passwd and /etc/group.)

(user-info uid/name) ---> record (procedure)

Return a user-info record giving the recorded information for a particular user:
(define-record user-info name uid gid home-dir shell)
The uid/name argument is either an integer uid or a string user-name.

(->uid uid/name) ---> fixnum (procedure)

(->username uid/name) ---> string (procedure)

These two procedures coerce integer uid's and user names to a particular form.

(group-info gid/name) ---> record (procedure)

Return a group-info record giving the recorded information for a particular group:
(define-record group-info name gid members)
The gid/name argument is either an integer gid or a string group-name.

3.7 Accessing command-line arguments

command-line-arguments string list

(command-line) ---> string list (procedure)

The list of strings command-line-arguments contains the arguments passed to the scsh process on the command line. Calling (command-line) returns the complete argv string list, including the program. So if we run a scsh program
/usr/shivers/bin/myls -CF src
then command-line-arguments is
("-CF" "src")
and (command-line) returns
("/usr/shivers/bin/myls" "-CF" "src")
command-line returns a fresh list each time it is called. In this way, the programmer can get a fresh copy of the original argument list if command-line-arguments has been modified or is lexically shadowed.

(arg arglist n [default]) ---> string (procedure)

(arg* arglist n [default-thunk]) ---> string (procedure)

(argv n [default]) ---> string (procedure)

These procedures are useful for accessing arguments from argument lists. arg returns the n^th element of arglist. The index is 1-based. If n is too large, default is returned; if no default, then an error is signaled.
arg* is similar, except that the default-thunk is called to generate the default value.

(argv n) is simply (arg (command-line) (+ n 1)). The +1 offset ensures that the two forms

(arg command-line-arguments n) (argv n)
return the same argument (assuming the user has not rebound or modified command-line-arguments).
Example:

(if (null? command-line-arguments) (& (xterm -n ,host -title ,host -name ,(string-append "xterm_" host))) (let* ((progname (file-name-nondirectory (argv 1))) (title (string-append host ":" progname))) (& (xterm -n ,title -title ,title -e ,@command-line-arguments))))
A subtlety: when the scsh interpreter is used to execute a scsh program, the program name reported in the head of the (command-line) list is the scsh program, not the interpreter. For example, if we have a shell script in file fullecho:
#!/usr/local/bin/scsh -s !# (for-each (lambda (arg) (display arg) (display " ")) (command-line))
and we run the program
fullecho hello world
the program will print out
fullecho hello world
not
/usr/local/bin/scsh -s fullecho hello world
This argument line processing ensures that if a scsh program is subsequently compiled into a standalone executable or byte-compiled to a heap-image executable by the Scheme 48 virtual machine, its semantics will be unchanged -- the arglist processing is invariant. In effect, the

/usr/local/bin/scsh -s
is not part of the program; it's a specification for the machine to execute the program on, so it is not properly part of the program's argument list.

3.8 System parameters

(system-name) ---> string (procedure)

Returns the name of the host on which we are executing. This may be a local name, such as ``solar,'' as opposed to a fully-qualified domain name such as ``solar.csie.ntu.edu.tw.''

(uname) ---> uname-record (procedure)

Returns a uname-record of the following structure:
(define-record uname os-name node-name release version machine)
Each of the fields contains a string.

Be aware that POSIX limits the length of all entries to 32 characters, and that the node name does not necessarily correspond to the fully-qualified domain name.

3.9 Signal system

Signal numbers are bound to the variables signal/hup, signal/int, .... See tables 3.9.0.1 and 3 for the full list.

(signal-process proc sig) ---> undefined (procedure)

(signal-process-group prgrp sig) ---> undefined (procedure)

These two procedures send signals to a specific process, and all the processes in a specific process group, respectively. The proc and prgrp arguments are either processes or integer process ids.

(itimer secs) ---> undefined (procedure)

Schedules a timer interrupt in secs seconds.

{Note As the thread system needs the timer interrupt for its own purpose, itimer works by spawning a thread which calls the interrupt handler for interrupt/alrm after the specified time.}

(process-sleep secs) ---> undefined (procedure)

(process-sleep-until time) ---> undefined (procedure)

The sleep procedure causes the process to sleep for secs seconds. The sleep-until procedure causes the process to sleep until time (see section 3.10).
{Note The use of these procedures is deprecated as they suspend all running threads, including the ones Scsh uses for administrtive purposes. Consider using the sleep procedure from the thread package.}

3.9.0.1 Interrupt handlers

Scsh interrupt handlers are complicated by the fact that scsh is implemented on top of the Scheme 48 virtual machine, which has its own interrupt system, independent of the Unix signal system. This means that Unix signals are delivered in two stages: first, Unix delivers the signal to the Scheme 48 virtual machine, then the Scheme 48 virtual machine delivers the signal to the executing Scheme program as a Scheme 48 interrupt. This ensures that signal delivery happens between two VM instructions, keeping individual instructions atomic.

The Scheme 48 machine has its own set of interrupts, which includes the asynchronous Unix signals (table 3.9.0.1).

Interrupt	Unix signal	OS Variant
`interrupt/alrm`⁶	`signal/alrm`	POSIX
`interrupt/int`⁷	`signal/int`	POSIX
`interrupt/memory-shortage`	N/A
`interrupt/chld`	`signal/chld`	POSIX
`interrupt/cont`	`signal/cont`	POSIX
`interrupt/hup`	`signal/hup`	POSIX
`interrupt/quit`	`signal/quit`	POSIX
`interrupt/term`	`signal/term`	POSIX
`interrupt/tstp`	`signal/tstp`	POSIX
`interrupt/usr1`	`signal/usr1`	POSIX
`interrupt/usr2`	`signal/usr2`	POSIX

`interrupt/info`	`signal/info`	BSD only
`interrupt/io`	`signal/io`	BSD + SVR4
`interrupt/poll`	`signal/poll`	SVR4 only
`interrupt/prof`	`signal/prof`	BSD + SVR4
`interrupt/pwr`	`signal/pwr`	SVR4 only
`interrupt/urg`	`signal/urg`	BSD + SVR4
`interrupt/vtalrm`	`signal/vtalrm`	BSD + SVR4
`interrupt/winch`	`signal/winch`	BSD + SVR4
`interrupt/xcpu`	`signal/xcpu`	BSD + SVR4
`interrupt/xfsz`	`signal/xfsz`	BSD + SVR4

Table 2: Scheme 48 virtual-machine interrupts and related Unix signals. Only the POSIX signals are guaranteed to be defined; however, your implementation and OS may define other signals and interrupts not listed here.

Unix signal	Type	OS Variant
`signal/stop`	Uncatchable	POSIX
`signal/kill`	Uncatchable	POSIX

`signal/abrt`	Synchronous	POSIX
`signal/fpe`	Synchronous	POSIX
`signal/ill`	Synchronous	POSIX
`signal/pipe`	Synchronous	POSIX
`signal/segv`	Synchronous	POSIX
`signal/ttin`	Synchronous	POSIX
`signal/ttou`	Synchronous	POSIX

`signal/bus`	Synchronous	BSD + SVR4
`signal/emt`	Synchronous	BSD + SVR4
`signal/iot`	Synchronous	BSD + SVR4
`signal/sys`	Synchronous	BSD + SVR4
`signal/trap`	Synchronous	BSD + SVR4

Table 3: Uncatchable and synchronous Unix signals. While these signals may be sent with signal-process or signal-process-group, there are no corresponding scsh interrupt handlers. Only the POSIX signals are guaranteed to be defined; however, your implementation and OS may define other signals not listed here.

(signal->interrupt integer) ---> integer (procedure)

The programmer maps from Unix signals to Scheme 48 interrupts with the signal->interrupt procedure. If the signal does not have a defined Scheme 48 interrupt, an error is signaled.

(interrupt-set integer₁ ...integer_n) ---> integer (procedure)

This procedure builds interrupt sets from its interrupt arguments. A set is represented as an integer using a two's-complement representation of the bit set.

(enabled-interrupts) ---> interrupt-set (procedure)

(set-enabled-interrupts interrupt-set) ---> interrupt-set (procedure)

Get and set the value of the enabled-interrupt set. Only interrupts in this set have their handlers called when delivered. When a disabled interrupt is delivered to the Scheme 48 machine, it is held pending until it becomes enabled, at which time its handler is invoked.
Interrupt sets are represented as integer bit sets (constructed with the interrupt-set function). The set-enabled-interrupts procedure returns the previous value of the enabled-interrupt set.

(with-enabled-interrupts interrupt-set . body) ---> value(s) of body (syntax)

(with-enabled-interrupts* interrupt-set thunk) ---> value(s) of thunk (procedure)

Run code with a given set of interrupts enabled. Note that ``enabling'' an interrupt means enabling delivery from the Scheme 48 vm to the scsh program. Using the Scheme 48 interrupt system is fairly lightweight, and does not involve actually making a system call. Note that enabling an interrupt means that the assigned interrupt handler is allowed to run when the interrupt is delivered. Interrupts not enabled are held pending when delivered.
Interrupt sets are represented as integer bit sets (constructed with the interrupt-set function).

(set-interrupt-handler interrupt handler) ---> old-handler (procedure)

Assigns a handler for a given interrupt, and returns the interrupt's old handler. The handler argument is #f (ignore), #t (default), or a procedure taking an integer argument; the return value follows the same conventions. Note that the interrupt argument is an interrupt value, not a signal value. An interrupt is delivered to the Scheme 48 machine by (1) blocking all interrupts, and (2) applying the handler procedure to the set of interrupts that were enabled prior to the interrupt delivery. If the procedure returns normally (i.e., it doesn't throw to a continuation), the set of enabled interrupts will be returned to its previous value. (To restore the enabled-interrupt set before throwing out of an interrupt handler, see set-enabled-interrupts)
{Note If you set a handler for the interrupt/chld interrupt, you may break scsh's autoreaping process machinery. See the discussion of autoreaping in section 3.4.1.}

{Note We recommend you avoid using interrupt handlers unless you absolutely have to; Section 9.4 describes a better interface to handling signals.}

(interrupt-handler interrupt) ---> handler (procedure)

Return the handler for a given interrupt. Note that the argument is an interrupt value, not a signal value. A handler is either #f (ignore), #t (default), or a procedure taking an integer argument.

Note that scsh does not support interrupt handlers for ``synchronous'' Unix signals, such as signal/ill or signal/pipe (see table 3). Synchronous occurrences of these signals are better handled by raising a Scheme exception. There are, however, some rare situtations where it is necessary to ignore the occurrence of a synchronous signal. For this case, the following procedures exist:

(ignore-signal integer) ---> undefined (procedure)

(handle-signal-default integer) ---> undefined (procedure)

The procedure ignore-signal tells the process to ignore the given signal. The procedure handle-signal-default resets the signal handler to the default handler.
These procedures manipulate the raw signal handler of the scsh process and therfore undermine the signal handling facility of the VM. They are intended to be used for igoring synchronous signals if system calls cannot succeed otherwise. Do not use these procedures for asynchronous signals!

3.10 Time

Scsh's time system is fairly sophisticated, particularly with respect to its careful treatment of time zones. However, casual users shouldn't be intimidated; all of the complexity is optional, and defaulting all the optional arguments reduces the system to a simple interface.

3.10.1 Terminology

``UTC'' and ``UCT'' stand for ``universal coordinated time,'' which is the official name for what is colloquially referred to as ``Greenwich Mean Time.''

POSIX allows a single time zone to specify two different offsets from UTC: one standard one, and one for ``summer time.'' Summer time is frequently some sort of daylight savings time.

The scsh time package consistently uses this terminology: we never say ``gmt'' or ``dst;'' we always say ``utc'' and ``summer time.''

3.10.2 Basic data types

We have two types: time and date.

A time specifies an instant in the history of the universe. It is location and time-zone independent.⁸ A time is a real value giving the number of elapsed seconds since the Unix ``epoch'' (Midnight, January 1, 1970 UTC). Time values provide arbitrary time resolution, limited only by the number system of the underlying Scheme system.

A date is a name for an instant in time that is specified relative to some location/time-zone in the world, e.g.:

Friday October 31, 1994 3:47:21 pm EST.

Dates provide one-second resolution, and are expressed with the following record type:

(define-record date ; A Posix tm struct seconds ; Seconds after the minute [0-59] minute ; Minutes after the hour [0-59] hour ; Hours since midnight [0-23] month-day ; Day of the month [1-31] month ; Months since January [0-11] year ; Years since 1900 tz-name ; Time-zone name: #f or a string. tz-secs ; Time-zone offset: #f or an integer. summer? ; Summer (Daylight Savings) time in effect? week-day ; Days since Sunday [0-6] year-day) ; Days since Jan. 1 [0-365]

If the tz-secs field is given, it specifies the time-zone's offset from UTC in seconds. If it is specified, the tz-name and summer? fields are ignored when using the date structure to determine a specific instant in time.

If the tz-name field is given, it is a time-zone string such as "EST" or "HKT" understood by the OS. Since POSIX time-zone strings can specify dual standard/summer time-zones (e.g., "EST5EDT" specifies U.S. Eastern Standard/Eastern Daylight Time), the value of the summer? field is used to resolve the amiguous boundary cases. For example, on the morning of the Fall daylight savings change-over, 1:00am-2:00am happens twice. Hence the date 1:30 am on this morning can specify two different seconds; the summer? flag says which one.

A date with tz-name = tz-secs = #f is a date that is specified in terms of the system's current time zone.

There is redundancy in the date data structure. For example, the year-day field is redundant with the month-day and month fields. Either of these implies the values of the week-day field. The summer? and tz-name fields are redundant with the tz-secs field in terms of specifying an instant in time. This redundancy is provided because consumers of dates may want it broken out in different ways. The scsh procedures that produce date records fill them out completely. However, when date records produced by the programmer are passed to scsh procedures, the redundancy is resolved by ignoring some of the secondary fields. This is described for each procedure below.

(make-date s min h mday mon y [tzn tzs summ? wday yday]) ---> date (procedure)

When making a date record, the last five elements of the record are optional, and default to #f, #f, #f, 0, and 0 respectively. This is useful when creating a date record to pass as an argument to time. Other procedures, however, may refuse to work with these incomplete date records.

3.10.3 Time zones

Several time procedures take time zones as arguments. When optional, the time zone defaults to local time zone. Otherwise the time zone can be one of:

#f Local time

Integer Seconds of offset from UTC. For example, New York City is -18000 (-5 hours), San Francisco is -28800 (-8 hours).

String A POSIX time zone string understood by the OS (i.e.., the sort of time zone assigned to the $TZ environment variable).

An integer time zone gives the number of seconds you must add to UTC to get time in that zone. It is not ``seconds west'' of UTC -- that flips the sign.

To get UTC time, use a time zone of either 0 or "UCT0".

3.10.4 Procedures

(time+ticks) ---> [secs ticks] (procedure)

(ticks/sec) ---> real (procedure)

The current time, with sub-second resolution. Sub-second resolution is not provided by POSIX, but is available on many systems. The time is returned as elapsed seconds since the Unix epoch, plus a number of sub-second ``ticks.'' The length of a tick may vary from implementation to implementation; it can be determined from (ticks/sec).
The system clock is not required to report time at the full resolution given by (ticks/sec). For example, on BSD, time is reported at 1µs resolution, so (ticks/sec) is 1,000,000. That doesn't mean the system clock has micro-second resolution.

If the OS does not support sub-second resolution, the ticks value is always 0, and (ticks/sec) returns 1.

Remark: I chose to represent system clock resolution as ticks/sec instead of sec/tick to increase the odds that the value could be represented as an exact integer, increasing efficiency and making it easier for Scheme implementations that don't have sophisticated numeric support to deal with the quantity.
You can convert seconds and ticks to seconds with the expression

(+ secs (/ ticks (ticks/sec)))
Given that, why not have the fine-grain time procedure just return a non-integer real for time? Following Common Lisp, I chose to allow the system clock to report sub-second time in its own units to lower the overhead of determining the time. This would be important for a system that wanted to precisely time the duration of some event. Time stamps could be collected with little overhead, deferring the overhead of precisely calculating with them until after collection.
This is all a bit academic for the Scheme 48 implementation, where we determine time with a heavyweight system call, but it's nice to plan for the future.

(date) ---> date-record (procedure)

(date [time tz]) ---> date-record (procedure)

Simple (date) returns the current date, in the local time zone.
With the optional arguments, date converts the time to the date as specified by the time zone tz. Time defaults to the current time; tz defaults to local time, and is as described in the time-zone section.

If the tz argument is an integer, the date's tz-name field is a POSIX time zone of the form ``UTC+hh:mm:ss''; the trailing :mm:ss portion is deleted if it is zeroes.

Oops: The Posix facility for converting dates to times, mktime(), has a broken design: it indicates an error by returning -1, which is also a legal return value (for date 23:59:59 UCT, 12/31/1969). Scsh resolves the ambiguity in a paranoid fashion: it always reports an error if the underlying Unix facility returns -1. We feel your pain.

(time) ---> integer (procedure)

(time [date]) ---> integer (procedure)

Simple (time) returns the current time.
With the optional date argument, time converts a date to a time. Date defaults to the current date.

Note that the input date record is overconstrained. time ignores date's week-day and year-day fields. If the date's tz-secs field is set, the tz-name and summer? fields are ignored.

If the tz-secs field is #f, then the time-zone is taken from the tz-name field. A false tz-name means the system's current time zone. When calculating with time-zones, the date's summer? field is used to resolve ambiguities:

#f Resolve an ambiguous time in favor of non-summer time.

true Resolve an ambiguous time in favor of summer time.

This is useful in boundary cases during the change-over. For example, in the Fall, when US daylight savings time changes over at 2:00 am, 1:30 am happens twice -- it names two instants in time, an hour apart.
Outside of these boundary cases, the summer? flag is ignored. For example, if the standard/summer change-overs happen in the Fall and the Spring, then the value of summer? is ignored for a January or July date. A January date would be resolved with standard time, and a July date with summer time, regardless of the summer? value.

The summer? flag is also ignored if the time zone doesn't have a summer time -- for example, simple UTC.

(date->string date) ---> string (procedure)

(format-date fmt date) ---> string (procedure)

Date->string formats the date as a 24-character string of the form:
Sun Sep 16 01:03:52 1973
Format-date formats the date according to the format string fmt. The format string is copied verbatim, except that tilde characters indicate conversion specifiers that are replaced by fields from the date record. Figure 1 gives the full set of conversion specifiers supported by format-date.

~~ Converted to the ~ character.

~a abbreviated weekday name

~A full weekday name

~b abbreviated month name

~B full month name

~c time and date using the time and date representation for the locale (~X ~x)

~d day of the month as a decimal number (01-31)

~H hour based on a 24-hour clock as a decimal number (00-23)

~I hour based on a 12-hour clock as a decimal number (01-12)

~j day of the year as a decimal number (001-366)

~m month as a decimal number (01-12)

~M minute as a decimal number (00-59)

~p AM/PM designation associated with a 12-hour clock

~S second as a decimal number (00-61)

~U week number of the year; Sunday is first day of week (00-53)

~w weekday as a decimal number (0-6), where Sunday is 0

~W week number of the year; Monday is first day of week (00-53)

~x date using the date representation for the locale

~X time using the time representation for the locale

~y year without century (00-99)

~Y year with century (e.g.1990)

~Z time zone name or abbreviation, or no characters if no time zone is determinable

Figure 1: format-date conversion specifiers

(fill-in-date! date) ---> date (procedure)

This procedure fills in missing, redundant slots in a date record. In decreasing order of priority:

year, month, month-day ==> year-day
If the year, month, and month-day fields are all defined (are all integers), the year-day field is set to the corresponding value.

year, year-day ==> month, month-day
If the month and month-day fields aren't set, but the year and year-day fields are set, then month and month-day are calculated.

year, month, month-day, year-day ==> week-day
If either of the above rules is able to determine what day it is, the week-day field is then set.

tz-secs ==> tz-name
If tz-secs is defined, but tz-name is not, it is assigned a time-zone name of the form ``UTC+hh:mm:ss''; the trailing :mm:ss portion is deleted if it is zeroes.

tz-name, date, summer? ==> tz-secs, summer?
If the date information is provided up to second resolution, tz-name is also provided, and tz-secs is not set, then tz-secs and summer? are set to their correct values. Summer-time ambiguities are resolved using the original value of summer?. If the time zone doesn't have a summer time variant, then summer? is set to #f.

local time, date, summer? ==> tz-name, tz-secs, summer?
If the date information is provided up to second resolution, but no time zone information is provided (both tz-name and tz-secs aren't set), then we proceed as in the above case, except the system's current time zone is used.

These rules allow one particular ambiguity to escape: if both tz-name and tz-secs are set, they are not brought into agreement. It isn't clear how to do this, nor is it clear which one should take precedence.

Oops: fill-in-date! isn't implemented yet.

3.11 Environment variables

(setenv var val) ---> undefined (procedure)

(getenv var) ---> string (procedure)

These functions get and set the process environment, stored in the external C variable char **environ. An environment variable var is a string. If an environment variable is set to a string val, then the process' global environment structure is altered with an entry of the form "var=val". If val is #f, then any entry for var is deleted.

(env->alist) ---> string-->string alist (procedure)

The env->alist procedure converts the entire environment into an alist, e.g.,
(("TERM" . "vt100") ("SHELL" . "/usr/local/bin/scsh") ("PATH" . "/sbin:/usr/sbin:/bin:/usr/bin") ("EDITOR" . "emacs") ...)

(alist->env alist) ---> undefined (procedure)

Alist must be an alist whose keys are all strings, and whose values are all either strings or string lists. String lists are converted to colon lists (see below). The alist is installed as the current Unix environment (i.e., converted to a null-terminated C vector of "var=val" strings which is assigned to the global char **environ).

;;; Note $PATH entry is converted ;;; to /sbin:/usr/sbin:/bin:/usr/bin. (alist->env '(("TERM" . "vt100") ("PATH" "/sbin" "/usr/sbin" "/bin") ("SHELL" . "/usr/local/bin/scsh")))
Note that env->alist and alist->env are not exact inverses -- alist->env will convert a list value into a single colon-separated string, but env->alist will not parse colon-separated values into lists. (See the $PATH element in the examples given for each procedure.)

The following three functions help the programmer manipulate alist tables in some generally useful ways. They are all defined using equal? for key comparison.

(alist-delete key alist) ---> alist (procedure)

Delete any entry labelled by value key.

(alist-update key val alist) ---> alist (procedure)

Delete key from alist, then cons on a (key . val) entry.

(alist-compress alist) ---> alist (procedure)

Compresses alist by removing shadowed entries. Example:
;;; Shadowed (1 . c) entry removed. (alist-compress '( (1 . a) (2 . b) (1 . c) (3 . d) )) ==> ((1 . a) (2 . b) (3 . d))

(with-env* env-alist-delta thunk) ---> value(s) of thunk (procedure)

(with-total-env* env-alist thunk) ---> value(s) of thunk (procedure)

These procedures call thunk in the context of an altered environment. They return whatever values thunk returns. Non-local returns restore the environment to its outer value; throwing back into the thunk by invoking a stored continuation restores the environment back to its inner value.
The env-alist-delta argument specifies a modification to the current environment -- thunk's environment is the original environment overridden with the bindings specified by the alist delta.

The env-alist argument specifies a complete environment that is installed for thunk.

(with-env env-alist-delta . body) ---> value(s) of body (syntax)

(with-total-env env-alist . body) ---> value(s) of body (syntax)

These special forms provide syntactic sugar for with-env* and with-total-env*. The env alists are not evaluated positions, but are implicitly backquoted. In this way, they tend to resemble binding lists for let and let* forms.

Example: These four pieces of code all run the mailer with special $TERM and $EDITOR values.

(with-env (("TERM" . "xterm") ("EDITOR" . ,my-editor)) (run (mail shivers@lcs.mit.edu))) [0] (with-env* `(("TERM" . "xterm") ("EDITOR" . ,my-editor)) (lambda () (run (mail shivers@csd.hku.hk)))) [0] (run (begin (setenv "TERM" "xterm") ; Env mutation happens (setenv "EDITOR" my-editor) ; in the subshell. (exec-epf (mail shivers@research.att.com)))) [0] ;; In this example, we compute an alternate environment ENV2 ;; as an alist, and install it with an explicit call to the ;; EXEC-PATH/ENV procedure. (let* ((env (env->alist)) ; Get the current environment, (env1 (alist-update env "TERM" "xterm")) ; and compute (env2 (alist-update env1 "EDITOR" my-editor))) ; the new env. (run (begin (exec-path/env "mail" env2 "shivers@cs.cmu.edu"))))

3.11.1 Path lists and colon lists

When environment variables such as $PATH need to encode a list of strings (such as a list of directories to be searched), the common Unix convention is to separate the list elements with colon delimiters.⁹ To convert between the colon-separated string encoding and the list-of-strings representation, see the infix-splitter function (section 8.1.2) and the string library's string-join function. For example,

(define split (infix-splitter (rx ":"))) (split "/sbin:/bin::/usr/bin") ==> '("/sbin" "/bin" "" "/usr/bin") (string-join ":" '("/sbin" "/bin" "" "/usr/bin")) ==> "/sbin:/bin::/usr/bin"

The following two functions are useful for manipulating these ordered lists, once they have been parsed from their colon-separated form.

(add-before elt before list) ---> list (procedure)

(add-after elt after list) ---> list (procedure)

These functions are for modifying search-path lists, where element order is significant.
add-before adds elt to the list immediately before the first occurrence of before in the list. If before is not in the list, elt is added to the end of the list.

add-after is similar: elt is added after the last occurrence of after. If after is not found, elt is added to the beginning of the list.

Neither function destructively alters the original path-list. The result may share structure with the original list. Both functions use equal? for comparing elements.

3.11.2 `$USER`, `$HOME`, and `$PATH`

Like sh and unlike csh, scsh has no interactive dependencies on environment variables. It does, however, initialise certain internal values at startup time from the initial process environment, in particular $HOME and $PATH. Scsh never uses $USER at all. It computes (user-login-name) from the system call (user-uid).

home-directory string

exec-path-list string list thread-fluid

Scsh accesses $HOME at start-up time, and stores the value in the global variable home-directory. It uses this value for ~ lookups and for returning to home on (chdir).
Scsh accesses $PATH at start-up time, colon-splits the path list, and stores the value in the thread fluid exec-path-list. This list is used for exec-path and exec-path/env searches.

To access, rebind or side-effect thread-fluid cells, you must open the thread-fluids package.

3.12 Terminal device control

Scsh provides a complete set of routines for manipulating terminal devices -- putting them in ``raw'' mode, changing and querying their special characters, modifying their I/O speeds, and so forth. The scsh interface is designed both for generality and portability across different Unix platforms, so you don't have to rewrite your program each time you move to a new system. We've also made an effort to use reasonable, Scheme-like names for the multitudinous named constants involved, so when you are reading code, you'll have less likelihood of getting lost in a bewildering maze of obfuscatory constants named ICRNL, INPCK, IUCLC, and ONOCR.

This section can only lay out the basic functionality of the terminal device interface. For further details, see the termios(3) man page on your system, or consult one of the standard Unix texts.

3.12.1 Portability across OS variants

Terminal-control software is inescapably complex, ugly, and low-level. Unix variants each provide their own way of controlling terminal devices, making it difficult to provide interfaces that are portable across different Unix systems. Scsh's terminal support is based primarily upon the POSIX termios interface. Programs that can be written using only the POSIX interface are likely to be widely portable.

The bulk of the documentation that follows consists of several pages worth of tables defining different named constants that enable and disable different features of the terminal driver. Some of these flags are POSIX; others are taken from the two common branches of Unix development, SVR4 and 4.3+ Berkeley. Scsh guarantees that the non-POSIX constants will be bound identifiers.

If your OS supports a particular non-POSIX flag, its named constant will be bound to the flag's value.
If your OS doesn't support the flag, its named constant will be present, but bound to #f.

This means that if you want to use SVR4 or Berkeley features in a program, your program can portably test the values of the flags before using them -- the flags can reliably be referenced without producing OS-dependent ``unbound variable'' errors.

Finally, note that although POSIX, SVR4, and Berkeley cover the lion's share of the terminal-driver functionality, each operating system inevitably has non-standard extensions. While a particular scsh implementation may provide these extensions, they are not portable, and so are not documented here.

3.12.2 Miscellaneous procedures

(tty? fd/port) ---> boolean (procedure)

Return true if the argument is a tty.

(tty-file-name fd/port) ---> string (procedure)

The argument fd/port must be a file descriptor or port open on a tty. Return the file-name of the tty.

3.12.3 The tty-info record type

The primary data-structure that describes a terminal's mode is a tty-info record, defined as follows:

(define-record tty-info control-chars ; String: Magic input chars input-flags ; Int: Input processing output-flags ; Int: Output processing control-flags ; Int: Serial-line control local-flags ; Int: Line-editting UI input-speed ; Int: Code for input speed output-speed ; Int: Code for output speed min ; Int: Raw-mode input policy time) ; Int: Raw-mode input policy

3.12.3.1 The control-characters string

The control-chars field is a character string; its characters may be indexed by integer values taken from table 4.

As discussed above, only the POSIX entries in table 4 are guaranteed to be legal, integer indices. A program can reliably test the OS to see if the non-POSIX characters are supported by checking the index constants. If the control-character function is supported by the terminal driver, then the corresponding index will be bound to an integer; if it is not supported, the index will be bound to #f.

To disable a given control-character function, set its corresponding entry in the tty-info:control-chars string to the special character disable-tty-char (and then use the (set-tty-info fd/port info) procedure to update the terminal's state).

3.12.3.2 The flag fields

The tty-info record's input-flags, output-flags, control-flags, and local-flags fields are all bit sets represented as two's-complement integers. Their values are composed by or'ing together values taken from the named constants listed in tables 5 through 9.

As discussed above, only the POSIX entries listed in these tables are guaranteed to be legal, integer flag values. A program can reliably test the OS to see if the non-POSIX flags are supported by checking the named constants. If the feature is supported by the terminal driver, then the corresponding flag will be bound to an integer; if it is not supported, the flag will be bound to #f.

3.12.3.3 The speed fields

The input-speed and output-speed fields determine the I/O rate of the terminal's line. The value of these fields is an integer giving the speed in bits-per-second. The following speeds are supported by POSIX:

0	134	600	4800
50	150	1200	9600
75	200	1800	19200
110	300	2400	38400

Your OS may accept others; it may also allow the special symbols 'exta and 'extb.

3.12.3.4 The min and time fields

The integer min and time fields determine input blocking behaviour during non-canonical (raw) input; otherwise, they are ignored. See the termios(3) man page for further details.

Be warned that POSIX allows the base system call's representation of the tty-info record to share storage for the min field and the ttychar/eof element of the control-characters string, and for the time field and the ttychar/eol element of the control-characters string. Many implementations in fact do this.

To stay out of trouble, set the min and time fields only if you are putting the terminal into raw mode; set the eof and eol control-characters only if you are putting the terminal into canonical mode. It's ugly, but it's Unix.

3.12.4 Using tty-info records

(make-tty-info if of cf lf ispeed ospeed min time) ---> tty-info-record (procedure)

(copy-tty-info tty-info-record) ---> tty-info-record (procedure)

These procedures make it possible to create new tty-info records. The typical method for creating a new record is to copy one retrieved by a call to the tty-info procedure, then modify the copy as desired. Note that the make-tty-info procedure does not take a parameter to define the new record's control characters.¹⁰ Instead, it simply returns a tty-info record whose control-character string has all elements initialised to ASCII nul. You may then install the special characters by assigning to the string. Similarly, the control-character string in the record produced by copy-tty-info does not share structure with the string in the record being copied, so you may mutate it freely.

(tty-info [fd/port/fname]) ---> tty-info-record (procedure)

The fd/port/fname parameter is an integer file descriptor or Scheme I/O port opened on a terminal device, or a file-name for a terminal device; it defaults to the current input port. This procedure returns a tty-info record describing the terminal's current mode.

(set-tty-info/now fd/port/fname info) ---> no-value (procedure)

(set-tty-info/drain fd/port/fname info) ---> no-value (procedure)

(set-tty-info/flush fd/port/fname info) ---> no-value (procedure)

The fd/port/fname parameter is an integer file descriptor or Scheme I/O port opened on a terminal device, or a file-name for a terminal device. The procedure chosen determines when and how the terminal's mode is altered:

Procedure Meaning

set-tty-info/now Make change immediately.

set-tty-info/drain Drain output, then change.

set-tty-info/flush Drain output, flush input, then change.

Oops: If I had defined these with the parameters in the reverse order, I could have made fd/port/fname optional. Too late now.

3.12.5 Other terminal-device procedures

(send-tty-break [fd/port/fname duration]) ---> no-value (procedure)

The fd/port/fname parameter is an integer file descriptor or Scheme I/O port opened on a terminal device, or a file-name for a terminal device; it defaults to the current output port. Send a break signal to the designated terminal. A break signal is a sequence of continuous zeros on the terminal's transmission line.
The duration argument determines the length of the break signal. A zero value (the default) causes a break of between 0.25 and 0.5 seconds to be sent; other values determine a period in a manner that will depend upon local community standards.

(drain-tty [fd/port/fname]) ---> no-value (procedure)

The fd/port/fname parameter is an integer file descriptor or Scheme I/O port opened on a terminal device, or a file-name for a terminal device; it defaults to the current output port.
This procedure waits until all the output written to the terminal device has been transmitted to the device. If fd/port/fname is an output port with buffered I/O enabled, then the port's buffered characters are flushed before waiting for the device to drain.

(flush-tty/input [fd/port/fname]) ---> no-value (procedure)

(flush-tty/output [fd/port/fname]) ---> no-value (procedure)

(flush-tty/both [fd/port/fname]) ---> no-value (procedure)

The fd/port/fname parameter is an integer file descriptor or Scheme I/O port opened on a terminal device, or a file-name for a terminal device; it defaults to the current input port (flush-tty/input and flush-tty/both), or output port (flush-tty/output).
These procedures discard the unread input chars or unwritten output chars in the tty's kernel buffers.

(start-tty-output [fd/port/fname]) ---> no-value (procedure)

(stop-tty-output [fd/port/fname]) ---> no-value (procedure)

(start-tty-input [fd/port/fname]) ---> no-value (procedure)

(stop-tty-input [fd/port/fname]) ---> no-value (procedure)

These procedures can be used to control a terminal's input and output flow. The fd/port/fname parameter is an integer file descriptor or Scheme I/O port opened on a terminal device, or a file-name for a terminal device; it defaults to the current input or output port.
The stop-tty-output and start-tty-output procedures suspend and resume output from a terminal device. The stop-tty-input and start-tty-input procedures transmit the special STOP and START characters to the terminal with the intention of stopping and starting terminal input flow.

3.12.6 Control terminals, sessions, and terminal process groups

(open-control-tty tty-name [flags]) ---> port (procedure)

This procedure opens terminal device tty-name as the process' control terminal (see the termios man page for more information on control terminals). The tty-name argument is a file-name such as /dev/ttya. The flags argument is a value suitable as the second argument to the open-file call; it defaults to open/read+write, causing the terminal to be opened for both input and output.
The port returned is an input port if the flags permit it, otherwise an output port. R5RS/Scheme 48/scsh do not have input/output ports, so it's one or the other. However, you can get both read and write ports open on a terminal by opening it read/write, taking the result input port, and duping it to an output port with dup->outport.

This procedure guarantees to make the opened terminal the process' control terminal only if the process does not have an assigned control terminal at the time of the call. If the scsh process already has a control terminal, the results are undefined.

To arrange for the process to have no control terminal prior to calling this procedure, use the become-session-leader procedure.

(become-session-leader) ---> integer (procedure)

This is the C setsid() call. POSIX job-control has a three-level hierarchy: session/process-group/process. Every session has an associated control terminal. This procedure places the current process into a brand new session, and disassociates the process from any previous control terminal. You may subsequently use open-control-tty to open a new control terminal.
It is an error to call this procedure if the current process is already a process-group leader. One way to guarantee this is not the case is only to call this procedure after forking.

(tty-process-group fd/port/fname) ---> integer (procedure)

(set-tty-process-group fd/port/fname pgrp) ---> undefined (procedure)

This pair of procedures gets and sets the process group of a given terminal.

(control-tty-file-name) ---> string (procedure)

Return the file-name of the process' control tty. On every version of Unix of which we are aware, this is just the string "/dev/tty". However, this procedure uses the official Posix interface, so it is more portable than simply using a constant string.

3.12.7 Pseudo-terminals

Scsh implements an interface to Berkeley-style pseudo-terminals.

(fork-pty-session thunk) ---> [process pty-in pty-out tty-name] (procedure)

This procedure gives a convenient high-level interface to pseudo-terminals. It first allocates a pty/tty pair of devices, and then forks a child to execute procedure thunk. In the child process

Stdio and the current I/O ports are bound to the terminal device.

The child is placed in its own, new session (see become-session-leader).

The terminal device becomes the new session's controlling terminal (see open-control-tty).

The (error-output-port) is unbuffered.

The fork-pty-session procedure returns four values: the child's process object, two ports open on the controlling pty device, and the name of the child's corresponding terminal device.

(open-pty) ---> pty-inport tty-name (procedure)

This procedure finds a free pty/tty pair, and opens the pty device with read/write access. It returns a port on the pty, and the name of the corresponding terminal device.
The port returned is an input port -- Scheme doesn't allow input/output ports. However, you can easily use (dup->outport pty-inport) to produce a matching output port. You may wish to turn off I/O buffering for this output port.

(pty-name->tty-name pty-name) ---> tty-name (procedure)

(tty-name->pty-name tty-name) ---> pty-name (procedure)

These two procedures map between corresponding terminal and pty controller names. For example,
(pty-name->tty-name "/dev/ptyq3") ==> "/dev/ttyq3" (tty-name->pty-name "/dev/ttyrc") ==> "/dev/ptyrc"

Remark: This is rather Berkeley-specific. SVR4 ptys are rare enough that I've no real idea if it generalises across the Unix gap. Experts are invited to advise. Users feel free to not worry -- the predominance of current popular Unix systems use Berkeley ptys.

(make-pty-generator) ---> procedure (procedure)

This procedure returns a generator of candidate pty names. Each time the returned procedure is called, it produces a new candidate. Software that wishes to search through the set of available ptys can use a pty generator to iterate over them. After producing all the possible ptys, a generator returns #f every time it is called. Example:
(define pg (make-pty-generator)) (pg) ==> "/dev/ptyp0" (pg) ==> "/dev/ptyp1" (pg) ==> "/dev/ptyqe" (pg) ==> "/dev/ptyqf" (Last one) (pg) ==> #f (pg) ==> #f

Scsh	C	Typical char
POSIX
`ttychar/delete-char`	`ERASE`	del
`ttychar/delete-line`	`KILL`	`^U`
`ttychar/eof`	`EOF`	`^D`
`ttychar/eol`	`EOL`
`ttychar/interrupt`	`INTR`	`^C`
`ttychar/quit`	`QUIT`	`^\`
`ttychar/suspend`	`SUSP`	`^Z`
`ttychar/start`	`START`	`^Q`
`ttychar/stop`	`STOP`	`^S`
SVR4 and 4.3+BSD
`ttychar/delayed-suspend`	`DSUSP`	`^Y`
`ttychar/delete-word`	`WERASE`	`^W`
`ttychar/discard`	`DISCARD`	`^O`
`ttychar/eol2`	`EOL2`
`ttychar/literal-next`	`LNEXT`	`^V`
`ttychar/reprint`	`REPRINT`	`^R`
4.3+BSD
`ttychar/status`	`STATUS`	`^T`

Table 4: Indices into the tty-info record's control-chars string, and the character traditionally found at each index. Only the indices for the POSIX entries are guaranteed to be non-#f.

Scsh	C	Meaning
POSIX
`ttyin/check-parity`	`INPCK`	Check parity.
`ttyin/ignore-bad-parity-chars`	`IGNPAR`	Ignore chars with parity errors.
`ttyin/mark-parity-errors`	`PARMRK`	Insert chars to mark parity errors.
`ttyin/ignore-break`	`IGNBRK`	Ignore breaks.
`ttyin/interrupt-on-break`	`BRKINT`	Signal on breaks.
`ttyin/7bits`	`ISTRIP`	Strip char to seven bits.
`ttyin/cr->nl`	`ICRNL`	Map carriage-return to newline.
`ttyin/ignore-cr`	`IGNCR`	Ignore carriage-returns.
`ttyin/nl->cr`	`INLCR`	Map newline to carriage-return.
`ttyin/input-flow-ctl`	`IXOFF`	Enable input flow control.
`ttyin/output-flow-ctl`	`IXON`	Enable output flow control.
SVR4 and 4.3+BSD
`ttyin/xon-any`	`IXANY`	Any char restarts after stop.
`ttyin/beep-on-overflow`	`IMAXBEL`	Ring bell when queue full.
SVR4
`ttyin/lowercase`	`IUCLC`	Map upper case to lower case.

Table 5: Input-flags. These are the named flags for the tty-info record's input-flags field. These flags generally control the processing of input chars. Only the POSIX entries are guaranteed to be non-#f.

Scsh	C	Meaning
3\|l\|POSIX
`ttyout/enable`	`OPOST`	Enable output processing.
3\|l\|SVR4 and 4.3+BSD
`ttyout/nl->crnl`	`ONLCR`	Map nl to cr-nl.
3\|l\|4.3+BSD
`ttyout/discard-eot`	`ONOEOT`	Discard EOT chars.
`ttyout/expand-tabs`	`OXTABS`¹¹	Expand tabs.
3\|l\|SVR4
`ttyout/cr->nl`	`OCRNL`	Map cr to nl.
`ttyout/nl-does-cr`	`ONLRET`	Nl performs cr as well.
`ttyout/no-col0-cr`	`ONOCR`	No cr output in column 0.
`ttyout/delay-w/fill-char`	`OFILL`	Send fill char to delay.
`ttyout/fill-w/del`	`OFDEL`	Fill char is ASCII DEL.
`ttyout/uppercase`	`OLCUC`	Map lower to upper case.

Table 6: Output-flags. These are the named flags for the tty-info record's output-flags field. These flags generally control the processing of output chars. Only the POSIX entries are guaranteed to be non-#f.

	Value	Comment
Backspace delay	`ttyout/bs-delay`	Bit-field mask
	`ttyout/bs-delay0`
	`ttyout/bs-delay1`
Carriage-return delay	`ttyout/cr-delay`	Bit-field mask
	`ttyout/cr-delay0`
	`ttyout/cr-delay1`
	`ttyout/cr-delay2`
	`ttyout/cr-delay3`
Form-feed delay	`ttyout/ff-delay`	Bit-field mask
	`ttyout/ff-delay0`
	`ttyout/ff-delay1`
Horizontal-tab delay	`ttyout/tab-delay`	Bit-field mask
	`ttyout/tab-delay0`
	`ttyout/tab-delay1`
	`ttyout/tab-delay2`
	`ttyout/tab-delayx`	Expand tabs
Newline delay	`ttyout/nl-delay`	Bit-field mask
	`ttyout/nl-delay0`
	`ttyout/nl-delay1`
Vertical tab delay	`ttyout/vtab-delay`	Bit-field mask
	`ttyout/vtab-delay0`
	`ttyout/vtab-delay1`
All	`ttyout/all-delay`	Total bit-field mask

Table 7: Delay constants. These are the named flags for the tty-info record's output-flags field. These flags control the output delays associated with printing special characters. They are non-POSIX, and have non-#f values only on SVR4 systems.

Scsh	C	Meaning
3\|l\|POSIX
`ttyc/char-size`	`CSIZE`	Character size mask
`ttyc/char-size5`	`CS5`	5 bits
`ttyc/char-size6`	`CS6`	6 bits
`ttyc/char-size7`	`CS7`	7 bits
`ttyc/char-size8`	`CS8`	8 bits
`ttyc/enable-parity`	`PARENB`	Generate and detect parity.
`ttyc/odd-parity`	`PARODD`	Odd parity.
`ttyc/enable-read`	`CREAD`	Enable reception of chars.
`ttyc/hup-on-close`	`HUPCL`	Hang up on last close.
`ttyc/no-modem-sync`	`LOCAL`	Ignore modem lines.
`ttyc/2-stop-bits`	`CSTOPB`	Send two stop bits.
3\|l\|4.3+BSD
`ttyc/ignore-flags`	`CIGNORE`	Ignore control flags.
`ttyc/CTS-output-flow-ctl`	`CCTS_OFLOW`	CTS flow control of output
`ttyc/RTS-input-flow-ctl`	`CRTS_IFLOW`	RTS flow control of input
`ttyc/carrier-flow-ctl`	`MDMBUF`

Table 8: Control-flags. These are the named flags for the tty-info record's control-flags field. These flags generally control the details of the terminal's serial line. Only the POSIX entries are guaranteed to be non-#f.

Scsh	C	Meaning
3\|l\|POSIX
`ttyl/canonical`	`ICANON`	Canonical input processing.
`ttyl/echo`	`ECHO`	Enable echoing.
`ttyl/echo-delete-line`	`ECHOK`	Echo newline after line kill.
`ttyl/echo-nl`	`ECHONL`	Echo newline even if echo is off.
`ttyl/visual-delete`	`ECHOE`	Visually erase chars.
`ttyl/enable-signals`	`ISIG`	Enable `^`C, `^`Z signalling.
`ttyl/extended`	`IEXTEN`	Enable extensions.
`ttyl/no-flush-on-interrupt`	`NOFLSH`	Don't flush after interrupt.
`ttyl/ttou-signal`	`ITOSTOP`	`SIGTTOU` on background output.
3\|l\|SVR4 and 4.3+BSD
`ttyl/echo-ctl`	`ECHOCTL`	Echo control chars as ```^X`''.
`ttyl/flush-output`	`FLUSHO`	Output is being flushed.
`ttyl/hardcopy-delete`	`ECHOPRT`	Visual erase for hardcopy.
`ttyl/reprint-unread-chars`	`PENDIN`	Retype pending input.
`ttyl/visual-delete-line`	`ECHOKE`	Visually erase a line-kill.
3\|l\|4.3+BSD
`ttyl/alt-delete-word`	`ALTWERASE`	Alternate word erase algorithm
`ttyl/no-kernel-status`	`NOKERNINFO`	No kernel status on `^T`.
3\|l\|SVR4
`ttyl/case-map`	`XCASE`	Canonical case presentation

Table 9: Local-flags. These are the named flags for the tty-info record's local-flags field. These flags generally control the details of the line-editting user interface. Only the POSIX entries are guaranteed to be non-#f.

³ Why not move->fdes? Because the current output port and error port might be the same port.

⁴ But see the note above

⁵ Why bother to mention such a silly possibility? Because that is what sh does.

⁶ Also bound to Scheme 48 interrupt interrupt/alarm.

⁷ Also bound to Scheme 48 interrupt interrupt/keyboard.

⁸ Physics pedants please note: The scsh authors live in a Newtonian universe. We disclaim responsibility for calculations performed in non-ANSI standard light-cones.

⁹ ...and hope the individual list elements don't contain colons themselves.

¹⁰ Why? Because the length of the string varies from Unix to Unix. For example, the word-erase control character (typically control-w) is provided by most Unixes, but not part of the POSIX spec.

¹¹ Note this is distinct from the SVR4-equivalent ttyout/tab-delayx flag defined in table 7.

`bufpol/block`	General block buffering (general default)
`bufpol/line`	Line buffering (tty default)
`bufpol/none`	Direct I/O -- no buffering ⁴

`#f`	signal an error (default)
'query	prompt the user
other	delete the old object (with `delete-file` or `delete-directory,` as appropriate) before creating the new object.

Procedure	returns
`file-type`	type
`file-inode`	inode
`file-mode`	mode
`file-nlinks`	nlinks
`file-owner`	uid
`file-group`	gid
`file-size`	size
`file-last-access`	atime
`file-last-mod`	mtime
`file-last-status-change`	ctime

`#f`	Exists.
`#t`	Doesn't exist.
'search-denied	Some protected directory is blocking the search.

`#f`	Local time
Integer	Seconds of offset from UTC. For example, New York City is -18000 (-5 hours), San Francisco is -28800 (-8 hours).
String	A POSIX time zone string understood by the OS (i.e.., the sort of time zone assigned to the `$TZ` environment variable).

`#f`	Resolve an ambiguous time in favor of non-summer time.
true	Resolve an ambiguous time in favor of summer time.