Reading delimited strings

Scsh provides a set of procedures that read delimited strings from input ports. There are procedures to read a single line of text (terminated by a newline character), a single paragraph (terminated by a blank line), and general delimited strings (terminated by a character belonging to an arbitrary character set).

These procedures can be applied to any Scheme input port. However, the scsh virtual machine has native-code support for performing delimited reads on Unix ports, so these input operations should be particularly fast, much faster than doing the equivalent character-at-a-time operation from Scheme code.

All of the delimited input operations described below take a handle-delim parameter, which determines what the procedure does with the terminating delimiter character. There are four possible choices for a handle-delim parameter:

handle-delim    Meaning
'trim           Ignore delimiter character.
'peek           Leave delimiter character in input stream.
'concat         Append delimiter character to returned value.
'split          Return delimiter as second value.

The first case, 'trim, is the standard default for all the routines described in this section. The last three cases allow the programmer to distinguish between strings that are terminated by a delimiter character, and strings that are terminated by an end-of-file.
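The four choices can be compared side by side with a small sketch. It uses read-line (described next) on a string port; the helper names sample-port and demo-* are hypothetical, and the sketch assumes make-string-input-port, the receive form, and scsh's C-style string escapes ("\n") are available at top level.

```scheme
;; Hypothetical helper: a fresh two-line string port for each demo.
(define (sample-port) (make-string-input-port "hello\nworld"))

;; 'trim: the newline is consumed and discarded.
(define (demo-trim) (read-line (sample-port) 'trim))

;; 'concat: the newline is appended to the returned string.
(define (demo-concat) (read-line (sample-port) 'concat))

;; 'peek: the newline stays in the stream, so the next read-char sees it.
(define (demo-peek)
  (let* ((p (sample-port))
         (s (read-line p 'peek)))
    (list s (read-char p))))

;; 'split: the delimiter comes back as a second return value.
(define (demo-split)
  (receive (s delim) (read-line (sample-port) 'split)
    (list s delim)))
```

Under the semantics in the table, demo-trim should return the string hello, demo-concat the same string with a trailing newline, and both demo-peek and demo-split a list of that string and the newline character.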

(read-line [port handle-newline])     --->     string or eof-object         (procedure) 
Reads and returns one line of text; on eof, returns the eof object. A line is terminated by newline or eof.

handle-newline determines what read-line does with the newline or eof that terminates the line; it takes the set of values described for the general handle-delim case above, and defaults to 'trim (discard the newline). Using this argument allows one to tell whether or not the last line of input in a file is newline-terminated.
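As a rough sketch of that last point, the following hypothetical helper (last-line-terminated? is not an scsh procedure) reads every line with 'split and inspects the second value. It assumes that at end of input read-line with 'split returns the eof object for both values, and that receive and make-string-input-port are available.

```scheme
;; Hypothetical helper: #t if the final line on PORT ends in a newline
;; (or if PORT is empty), #f if the final line was cut off by eof.
(define (last-line-terminated? port)
  (let loop ()
    (receive (line delim) (read-line port 'split)
      (cond ((eof-object? line) #t)    ; nothing left; every line seen had a newline
            ((eof-object? delim) #f)   ; this line was terminated by eof
            (else (loop))))))          ; terminated by newline; keep reading
```

For example, a port over the text "a\nb\n" would be expected to yield #t, while a port over "a\nb" would yield #f.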

(read-paragraph [port handle-delim])     --->     string or eof         (procedure) 
This procedure skips blank lines, then reads text from a port until a blank line or eof is found. A "blank line" is a (possibly empty) line composed only of white space. The handle-delim parameter determines how the terminating blank line is handled. It is described above, and defaults to 'trim. The 'peek option is not available.
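A minimal sketch of paragraph-at-a-time reading follows. The helper read-all-paragraphs is hypothetical; it simply calls read-paragraph with the default 'trim until the eof object comes back.

```scheme
;; Hypothetical helper: collect every paragraph on PORT into a list of strings.
(define (read-all-paragraphs port)
  (let loop ((acc '()))
    (let ((para (read-paragraph port)))
      (if (eof-object? para)
          (reverse acc)                 ; no more paragraphs
          (loop (cons para acc))))))

;; Two paragraphs, preceded and separated by blank lines.
(define sample-text "\n\nfirst line\nsecond line\n\n   \nnext paragraph\n")
```

Applied to a string port over sample-text, this should return a list of two strings: the leading blank lines are skipped, and the whitespace-only line counts as blank.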

The following procedures read in strings from ports delimited by characters belonging to a specific set. See section 5.5 for information on character set manipulation.

(read-delimited char-set [port handle-delim])     --->     string or eof         (procedure) 
Reads characters until it encounters one of the chars in char-set or reaches eof. The handle-delim parameter determines how the terminating character is handled. It is described above, and defaults to 'trim.

The char-set argument may be a charset, a string, or a character; it is coerced to a charset.
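As an illustration, a colon-separated record can be split into fields by calling read-delimited repeatedly. The helper read-fields is hypothetical, and the sketch relies on the string-to-charset coercion described above.

```scheme
;; Hypothetical helper: read colon-delimited fields from PORT until eof.
(define (read-fields port)
  (let loop ((fields '()))
    (let ((field (read-delimited ":" port)))   ; the string ":" is coerced to a charset
      (if (eof-object? field)
          (reverse fields)
          (loop (cons field fields))))))
```

Given a string port over "root:x:0:0", this should produce the list ("root" "x" "0" "0"). Note that with the default 'trim, a trailing delimiter immediately followed by eof does not yield a final empty field; use 'split or 'concat when that distinction matters.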

(read-delimited! char-set buf [port handle-delim start end])     --->     nchars or eof or #f         (procedure) 
A side-effecting variant of read-delimited.

The data is written into the string buf at the indices in the half-open interval [start,end); the default interval is the whole string: start = 0 and end = (string-length buf). The values of start and end must specify a well-defined interval in buf, i.e., 0 <= start <= end <= (string-length buf).

It returns nchars, the number of characters read. If the buffer filled up without a delimiter character being found, #f is returned. If the port is at eof when the read starts, the eof object is returned.

If an integer is returned (i.e., the read is successfully terminated by reading a delimiter character), then the handle-delim parameter determines how the terminating character is handled. It is described above, and defaults to 'trim.
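The three kinds of return value can be seen in a short sketch. The helper fill-demo is hypothetical; it reads comma-delimited data into a deliberately small buffer so that the second read overflows.

```scheme
;; Hypothetical demo: returns the results of three successive reads
;; into an 8-character buffer, using the default 'trim.
(define (fill-demo)
  (let* ((buf  (make-string 8 #\-))
         (port (make-string-input-port "abc,defghijklmnop,"))
         (r1   (read-delimited! "," buf port))   ; delimiter found after 3 chars
         (r2   (read-delimited! "," buf port)))  ; buffer fills before a delimiter
    (list r1 r2)))

;; Reading from an exhausted port gives the eof object.
(define (eof-demo)
  (read-delimited! "," (make-string 8) (make-string-input-port "")))
```

Under the description above, fill-demo should return (3 #f) and eof-demo the eof object. A caller that receives #f would typically keep the buffer contents and continue reading into a fresh or larger buffer.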

(%read-delimited! char-set buf gobble? [port start end])     --->     [char-or-eof-or-#f integer]         (procedure) 
This low-level delimited reader uses an alternate interface. It returns two values: terminator and num-read.

terminator
    A value describing why the read was terminated:
    Character or eof-object    Read terminated by this value.
    #f                         Filled buffer without finding a delimiter.

num-read
    Number of characters read into buf.

If the read is successfully terminated by reading a delimiter character, then the gobble? parameter determines what to do with the terminating character. If true, the character is removed from the input stream; if false, the character is left in the input stream where a subsequent read operation will retrieve it. In either case, the character is also the first value returned by the procedure call.
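The effect of gobble? can be sketched as follows. The helper key-demo is hypothetical; it passes an explicit character set built with char-set (assumed available from the character-set library) rather than relying on coercion, which the description above does not promise for this procedure.

```scheme
;; Hypothetical demo: read up to "=" and report what is left in the stream.
;; Returns (terminator num-read text next-char).
(define (key-demo gobble?)
  (let ((buf  (make-string 16))
        (port (make-string-input-port "key=value")))
    (receive (terminator num-read)
             (%read-delimited! (char-set #\=) buf gobble? port)
      (list terminator
            num-read
            (substring buf 0 num-read)   ; only the first num-read chars are valid
            (read-char port)))))         ; shows whether "=" was consumed
```

With gobble? false the next character read should be the delimiter itself; with gobble? true it should be the first character after it. In both cases the delimiter is returned as the first value.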

(skip-char-set skip-chars [port])     --->     integer         (procedure) 
Skip characters occurring in the set skip-chars; return the number of characters skipped. The skip-chars argument may be a charset, a string, or a character; it is coerced to a charset.
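A small sketch combining skip-char-set with read-delimited to pull a token out of padded input; the helper next-token is hypothetical.

```scheme
;; Hypothetical helper: skip leading spaces, then read up to the next space.
;; Returns (number-of-spaces-skipped token).
(define (next-token port)
  (let* ((skipped (skip-char-set " " port))      ; the string " " is coerced to a charset
         (token   (read-delimited " " port)))
    (list skipped token)))
```

For a string port over four spaces followed by "token rest", this should return (4 "token").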