PLT MzScheme: Language Manual

Chapter 3

Basic Data Extensions

3.1 Void and Undefined

MzScheme returns the unique void value -- printed as #<void> -- for expressions that have unspecified results in R5RS. The procedure void takes any number of arguments and returns void:

(void v ···) returns void.
(void? v) returns #t if v is void, #f otherwise.

Variables bound by letrec-values that are accessible but not yet initialized are bound to the unique undefined value, printed as #<undefined>.
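
The following sketch illustrates both values under the semantics described above. The variable names a and b are placeholders chosen for this example.

```scheme
(void? (void))          ; => #t
(void? (void 1 2 3))    ; => #t, the arguments are ignored
(void? 'anything-else)  ; => #f

;; b is accessible but not yet initialized when the right-hand side
;; for a is evaluated, so a is bound to the undefined value.
(letrec ([a b]
         [b 1])
  a)                    ; => #<undefined>, per the paragraph above
```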

3.2 Booleans

Unless otherwise specified, two instances of a particular MzScheme data type are equal? only when they are eq?. Two values are eqv? only when they are either eq?, = and have the same exactness, or both +nan.0.

The andmap and ormap procedures apply a test procedure to the elements of a list, returning immediately when the result for testing the entire list is determined. The arguments to andmap and ormap are the same as for map, but a single boolean value is returned as the result, rather than a list:

(andmap proc list ···¹) applies proc to elements of the lists from the first elements to the last, returning #f as soon as any application returns #f. If no application of proc returns #f, then the result of the last application of proc is returned. If the lists are empty, then #t is returned.
(ormap proc list ···¹) applies proc to elements of the lists from the first elements to the last. If any application returns a value other than #f, that value is immediately returned as the result of the ormap application. If all applications of proc return #f, then the result is #f. If the lists are empty, then #f is returned.

Examples:

(andmap positive? '(1 2 3)) ; => #t 
(ormap eq? '(a b c) '(a b c)) ; => #t 
(andmap positive? '(1 2 a)) ; => raises exn:application:type 
(ormap positive? '(1 2 a)) ; => #t 
(andmap positive? '(1 -2 a)) ; => #f 
(andmap + '(1 2 3) '(4 5 6)) ; => 9 
(ormap + '(1 2 3) '(4 5 6)) ; => 5
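
The short-circuit rules can be sketched for the single-list case as follows. The names my-andmap and my-ormap are hypothetical and are not part of MzScheme; the built-in procedures additionally accept several lists.

```scheme
;; Single-list sketch of andmap: #t for the empty list, #f as soon as
;; one application returns #f, otherwise the result of the last application.
(define (my-andmap proc lst)
  (cond
    [(null? lst) #t]
    [(null? (cdr lst)) (proc (car lst))]
    [(proc (car lst)) (my-andmap proc (cdr lst))]
    [else #f]))

;; Single-list sketch of ormap: #f for the empty list, otherwise the
;; first result that is not #f.
(define (my-ormap proc lst)
  (cond
    [(null? lst) #f]
    [else (or (proc (car lst))
              (my-ormap proc (cdr lst)))]))

(my-andmap positive? '(1 2 3))   ; => #t
(my-andmap positive? '(1 -2 a))  ; => #f, the symbol a is never tested
(my-andmap add1 '(1 2 3))        ; => 4
(my-ormap add1 '(1 2 3))         ; => 2
(my-andmap positive? '())        ; => #t
(my-ormap positive? '())         ; => #f
```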

3.3 Numbers

A number in MzScheme is one of the following:

a fixnum exact integer (30 bits² plus a sign bit)
a bignum exact integer (cannot be represented in a fixnum)
a fraction exact rational (represented by two exact integers)
a flonum inexact rational (double-precision floating-point number)
a complex number; either the real and imaginary parts are both exact or inexact, or the number has an exact zero real part and an inexact imaginary part; a complex number with an inexact zero imaginary part is a real number

MzScheme extends the number syntax of R5RS in two ways:

All input radixes (#b, #o, #d, and #x) allow ``decimal'' numbers that contain a period or exponent marker. For example, #b1.1 is equivalent to 1.5. In hexadecimal numbers, e always stands for a hexadecimal digit, not an exponent marker.
The following are inexact numerical constants: +inf.0 (infinity), -inf.0 (negative infinity), +nan.0 (not a number), and -nan.0 (same as +nan.0). These names can also be used within complex constants, as in -inf.0+inf.0i.

The special inexact numbers +inf.0, -inf.0, and +nan.0 have no exact form. Dividing by an inexact zero returns +inf.0 or -inf.0, depending on the sign of the dividend. The infinities are integers, and they answer #t for both even? and odd?. The +nan.0 value is not an integer and is not = to itself, but +nan.0 is eqv? to itself.³ Similarly, (= 0.0 -0.0) is #t, but (eqv? 0.0 -0.0) is #f.
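
The expressions below restate these rules as code. The results are taken from the text above rather than from an independent specification.

```scheme
#b1.1                  ; => 1.5
(/ 1.0 0.0)            ; => +inf.0
(/ -1.0 0.0)           ; => -inf.0
(= +nan.0 +nan.0)      ; => #f
(eqv? +nan.0 +nan.0)   ; => #t
(= 0.0 -0.0)           ; => #t
(eqv? 0.0 -0.0)        ; => #f
(integer? +inf.0)      ; => #t, the infinities count as integers here
(integer? +nan.0)      ; => #f
```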

All multi-argument arithmetic procedures operate pairwise on arguments from left to right.

The string->number procedure works on all number representations and exact integer radix values in the range 2 to 16 (inclusive). The number->string procedure accepts all number types and the radix values 2, 8, 10, and 16; however, if an inexact number is provided with a radix other than 10, the exn:application:mismatch exception is raised.
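
A few conversions in both directions, as a sketch of the radix rules just described:

```scheme
(string->number "ff" 16)    ; => 255
(string->number "-101" 2)   ; => -5
(string->number "12" 3)     ; => 5, string->number accepts any radix from 2 to 16
(string->number "1/2")      ; => 1/2
(number->string 255 16)     ; => "ff"
(number->string 255 2)      ; => "11111111"
(number->string 1/3)        ; => "1/3"

;; Per the text above, an inexact number with a radix other than 10
;; raises exn:application:mismatch:
;; (number->string 0.5 2)
```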

The add1 and sub1 procedures work on any number:

(add1 z) returns z + 1.
(sub1 z) returns z - 1.

The following procedures work on exact integers in their (semi-infinite) two's complement representation:

(bitwise-ior n ···¹) returns the bitwise ``inclusive or'' of the ns.
(bitwise-and n ···¹) returns the bitwise ``and'' of the ns.
(bitwise-xor n ···¹) returns the bitwise ``exclusive or'' of the ns.
(bitwise-not n) returns the bitwise ``not'' of n.
(arithmetic-shift n m) returns the bitwise ``shift'' of n. The integer n is shifted left by m bits; i.e., m new zeros are introduced as rightmost digits. If m is negative, n is shifted right by -m bits; i.e., the rightmost -m digits are dropped.
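
The following expressions sketch how the two's complement view applies to small values, including negative ones:

```scheme
(bitwise-ior 1 2)          ; => 3
(bitwise-and 6 3)          ; => 2
(bitwise-xor 6 3)          ; => 5
(bitwise-not 5)            ; => -6
(bitwise-and -1 255)       ; => 255, -1 is an unbounded run of 1 bits
(arithmetic-shift 1 10)    ; => 1024
(arithmetic-shift 255 -4)  ; => 15
(arithmetic-shift -8 -1)   ; => -4
(arithmetic-shift -1 -4)   ; => -1, the sign bits extend indefinitely
```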

The random procedure generates pseudo-random integers:

(random k) returns a random exact integer in the range 0 to k - 1 where k is an exact integer between 1 and 2³¹ - 1, inclusive. The number is provided by the current pseudo-random number generator, which maintains an internal state for generating numbers.⁴
(random-seed k) seeds the current pseudo-random number generator with k, an exact integer between 0 and 2³¹ - 1, inclusive. Seeding a generator sets its internal state deterministically; seeding a generator with a particular number forces it to produce a sequence of pseudo-random numbers that is the same across runs and across platforms.
(current-pseudo-random-generator) returns the current pseudo-random number generator, and (current-pseudo-random-generator generator) sets the current generator to generator. See also section 7.4.1.10.
(make-pseudo-random-generator) returns a new pseudo-random number generator. The new generator is seeded with a number derived from (current-milliseconds).
(pseudo-random-generator? v) returns #t if v is a pseudo-random number generator, #f otherwise.
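
The sketch below shows how seeding makes a sequence repeatable. The helper roll-dice is hypothetical, and the use of parameterize assumes that current-pseudo-random-generator behaves like the other parameters of section 7.4.1.

```scheme
;; roll-dice (hypothetical helper): a list of n rolls of a six-sided die.
(define (roll-dice n)
  (let loop ([i 0] [acc '()])
    (if (= i n)
        (reverse acc)
        (loop (add1 i) (cons (add1 (random 6)) acc)))))

(random-seed 42)
(define first-run (roll-dice 5))
(random-seed 42)
(define second-run (roll-dice 5))
(equal? first-run second-run)    ; => #t

;; Install a fresh generator for the extent of one expression, leaving
;; the previously current generator's state untouched.
(parameterize ([current-pseudo-random-generator
                (make-pseudo-random-generator)])
  (random 10))                   ; some exact integer from 0 to 9
```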

The following procedures convert between Scheme numbers and common machine byte representations:

(integer-byte-string->integer string signed? [big-endian?]) converts the machine-format number encoded in string to an exact integer. The string must contain either 2, 4, or 8 characters. If signed? is true, then the string is decoded as a two's-complement number, otherwise it is decoded as an unsigned integer. If big-endian? is true, then the first character's ASCII value provides the most significant eight bits of the number, otherwise the first character provides the least-significant eight bits, and so on. The default value of big-endian? is the result of system-big-endian?.
(integer->integer-byte-string n size-n signed? [big-endian? to-string]) converts the exact integer n to a machine-format number encoded in a string of length size-n, which must be 2, 4, or 8. If signed? is true, then the number is encoded with two's complement, otherwise it is encoded as an unsigned bit stream. If big-endian? is true, then the most significant eight bits of the number are encoded in the first character of the resulting string, otherwise the least-significant bits are encoded in the first character, and so on. The default value of big-endian? is the result of system-big-endian?.

If to-string is provided, it must be a mutable string of length size-n; in that case, the encoding of n is written into to-string, and to-string is returned as the result. If to-string is not provided, the result is a newly allocated string.

If n cannot be encoded in a string of the requested size and format, the exn:misc:application exception is raised. If to-string is provided and it is not of length size-n, the exn:misc:application exception is raised.
(floating-point-byte-string->real string [big-endian?]) converts the IEEE floating-point number encoded in string to an inexact real number. The string must contain either 4 or 8 characters. If big-endian? is true, then the first character's ASCII value provides the most significant eight bits of the IEEE representation, otherwise the first character provides the least-significant eight bits, and so on. The default value of big-endian? is the result of system-big-endian?.
(real->floating-point-byte-string x size-n [big-endian? to-string]) converts the real number x to its IEEE representation in a string of length size-n, which must be 4 or 8. If big-endian? is true, then the most significant eight bits of the number are encoded in the first character of the resulting string, otherwise the least-significant bits are encoded in the first character, and so on. The default value of big-endian? is the result of system-big-endian?.

If to-string is provided, it must be a mutable string of length size-n; in that case, the encoding of x is written into to-string, and to-string is returned as the result. If to-string is not provided, the result is a newly allocated string.

If to-string is provided and it is not of length size-n, the exn:misc:application exception is raised.
(system-big-endian?) returns #t if the native encoding of numbers is big-endian for the machine running MzScheme, #f if the native encoding is little-endian.
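
As a sketch of the encodings described above, the following expressions pass big-endian? explicitly so that the results do not depend on the host machine. The helper bytes-of and the name ff-ff are hypothetical.

```scheme
;; bytes-of (hypothetical helper): the character codes of a string, in order.
(define (bytes-of s)
  (map char->integer (string->list s)))

;; 258 is #x0102
(bytes-of (integer->integer-byte-string 258 2 #f #t))  ; => '(1 2)
(bytes-of (integer->integer-byte-string 258 2 #f #f))  ; => '(2 1)

;; The same two bytes decoded as signed and as unsigned.
(define ff-ff (string (integer->char 255) (integer->char 255)))
(integer-byte-string->integer ff-ff #t #t)             ; => -1
(integer-byte-string->integer ff-ff #f #t)             ; => 65535

;; Round trip through the 8-byte IEEE encoding in native byte order.
(floating-point-byte-string->real
 (real->floating-point-byte-string 1.5 8))             ; => 1.5
```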

3.4 Characters

MzScheme character values range over the characters for ``extended ASCII'' values 0 to 255 (where the ASCII extensions are platform-specific). The procedure char->integer returns the extended ASCII value of a character and integer->char takes an extended ASCII value and returns the corresponding character. If integer->char is given an integer that is not in 0 to 255 inclusive, the exn:application:type exception is raised.

The procedures char->latin-1-integer and latin-1-integer->char support conversions between characters in the platform-specific character set and platform-independent Latin-1 (ISO 8859-1) values:

(char->latin-1-integer char) returns the integer in 0 to 255 inclusive corresponding to the Latin-1 value for char, or #f if char (in the platform-specific character set) has no corresponding character in Latin-1.
(latin-1-integer->char k) returns the character corresponding to the Latin-1 mapping of k, or #f if the platform-specific character set does not support the corresponding Latin-1 character. If k is not in 0 to 255 inclusive, the exn:application:type exception is raised.

For Unix and Mac OS, char->latin-1-integer and latin-1-integer->char are the same as char->integer and integer->char. For Windows, the platform-specific set and Latin-1 match except for the range #x80 to #x9F (which are unprintable control characters in Latin-1).

The character comparison procedures -- char=?, char<?, char-ci=?, etc. -- take two or more character arguments and check the arguments pairwise (like the numerical comparison procedures). Two characters are eq? whenever they are char=?. The expression (char<? char1 char2) produces the same result as (< (char->integer char1) (char->integer char2)), etc. The procedures char-whitespace?, char-alphabetic?, char-numeric?, char-upper-case?, char-lower-case?, char-upcase, and char-downcase are fully portable; their results do not depend on the platform or locales.
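
A brief sketch of the pairwise comparison and the eq? guarantee:

```scheme
(char<? #\a #\b #\c)          ; => #t
(char<? #\a #\c #\b)          ; => #f, the pair #\c and #\b fails
(char-ci=? #\a #\A #\a)       ; => #t
(eq? #\a (integer->char 97))  ; => #t, because the two are char=?
(eq? (char<? #\a #\b)
     (< (char->integer #\a) (char->integer #\b)))  ; => #t
(char-upcase #\a)             ; => #\A
```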

In addition to the standard character procedures, MzScheme provides the following locale-sensitive procedures (see section 7.4.1.11):

(char-locale<? char1 char2 ···¹)
(char-locale>? char1 char2 ···¹)
(char-locale-ci=? char1 char2 ···¹)
(char-locale-ci<? char1 char2 ···¹)
(char-locale-ci>? char1 char2 ···¹)
(char-locale-whitespace? char)
(char-locale-alphabetic? char)
(char-locale-numeric? char)
(char-locale-upper-case? char)
(char-locale-lower-case? char)
(char-locale-upcase char)
(char-locale-downcase char)

For example, since ASCII character 112 is a lowercase ``p'' and Latin-1 character 246 is a lowercase ``ö'' (an ``o'' with an umlaut), (char-locale<? (integer->char 112) (integer->char 246)) tends to produce #f, though it always produces #t if the current locale is disabled.

3.5 Strings

A string can be mutable or immutable. When an immutable string is provided to a procedure like string-set!, the exn:application:type exception is raised.

String constants generated by read are immutable. (string->immutable-string string) returns an immutable string with the same content as string, returning string if it is already an immutable string. (See also immutable? in section 3.8.)

When a string is created with make-string without a fill value, it is initialized with the null character (#\nul) in all positions.
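
The following sketch exercises the mutability rules and the default fill character. The variable s is a placeholder.

```scheme
(define s (string->immutable-string (string #\a #\b)))
(immutable? s)                                  ; => #t
(eq? s (string->immutable-string s))            ; => #t, s is returned as-is
(immutable? (make-string 2))                    ; => #f
(char->integer (string-ref (make-string 2) 0))  ; => 0, the null character

;; Per the text above, mutating s raises exn:application:type:
;; (string-set! s 0 #\z)
```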

The string comparison procedures -- string=?, string<?, string-ci=?, etc. -- take two or more string arguments and check the arguments pairwise (like the numerical comparison procedures). String comparisons using the standard functions are fully portable; the results do not depend on the platform or locales.

In addition to the standard string procedures, MzScheme provides the following locale-sensitive procedures (see section 7.4.1.11):

(string-locale<? string1 string2 ···¹)
(string-locale>? string1 string2 ···¹)
(string-locale-ci=? string1 string2 ···¹)
(string-locale-ci<? string1 string2 ···¹)
(string-locale-ci>? string1 string2 ···¹)

3.6 Symbols

For information about symbol parsing and printing, see section 14.3 and section 14.4, respectively.

MzScheme provides two ways of generating an uninterned symbol, i.e., a symbol that is not eq?, eqv?, or equal? to any other symbol, although it may print the same as another symbol:

(string->uninterned-symbol string) is like (string->symbol string), but the resulting symbol is a new uninterned symbol. Calling string->uninterned-symbol twice with the same string returns two distinct symbols.
(gensym [symbol/string]) creates an uninterned symbol with an automatically-generated name. The optional symbol/string argument is a prefix symbol or string.
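
A short sketch of the difference between interned and uninterned symbols:

```scheme
(eq? 'apple (string->symbol "apple"))             ; => #t, both are interned
(eq? 'apple (string->uninterned-symbol "apple"))  ; => #f
(eq? (string->uninterned-symbol "apple")
     (string->uninterned-symbol "apple"))         ; => #f, two distinct symbols
(symbol? (gensym))                                ; => #t
(eq? (gensym 'tmp) (gensym 'tmp))                 ; => #f
```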

Regular (interned) symbols are only weakly held by the internal symbol table. This weakness can never affect the result of an eq?, eqv?, or equal? test, but a symbol placed into a weak box (see section 13.1) or used as the key in a weak hash table (see section 3.12) may disappear.

3.7 Vectors

When a vector is created with make-vector without a fill value, it is initialized with 0 in all positions. A vector can be immutable, such as a vector returned by syntax-e, but vectors generated by read are mutable. (See also immutable? in section 3.8.)

3.8 Lists

A cons cell can be mutable or immutable. When an immutable cons cell is provided to a procedure like set-cdr!, the exn:application:type exception is raised. Cons cells generated by read are always mutable.

The global variable null is bound to the empty list.

(reverse! list) is the same as (reverse list), but list is destructively reversed using set-cdr!.

(append! list ···¹) destructively appends the lists.

(list* v ···¹) is similar to (list v ···¹) but the last argument is used directly as the cdr of the last pair constructed for the list:

(list* 1 2 3 4) ; => '(1 2 3 . 4)

(cons-immutable v1 v2) returns an immutable pair whose car is v1 and cdr is v2.

(list-immutable v ···¹) is like (list v ···¹), but using immutable pairs.

(list*-immutable v ···¹) is like (list* v ···¹), but using immutable pairs.

(immutable? v) returns #t if v is an immutable cons cell, string, vector, or box, #f otherwise.
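
The list procedures of this section can be sketched together as follows. Fresh lists are built with list because reverse! and append! destroy their arguments; the variable p is a placeholder.

```scheme
(list* 1 2 '(3 4))                   ; => '(1 2 3 4)
(reverse! (list 1 2 3))              ; => '(3 2 1)
(append! (list 1 2) (list 3 4))      ; => '(1 2 3 4)

(define p (cons-immutable 1 2))
(car p)                              ; => 1
(immutable? p)                       ; => #t
(immutable? (cons 1 2))              ; => #f
(immutable? (list-immutable 1 2 3))  ; => #t

;; Per the text above, mutating p raises exn:application:type:
;; (set-cdr! p 3)
```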

The list-ref and list-tail procedures accept an improper list as a first argument. If either procedure is applied to an improper list and an index that would require taking the car or cdr of a non-cons-cell, the exn:application:mismatch exception is raised.

The member, memv, and memq procedures accept an improper list as a second argument. If the membership search reaches the improper tail, the exn:application:mismatch exception is raised.

The assoc, assv, and assq procedures accept an improperly formed association list as a second argument. If the association search reaches an improper list tail or a list element that is not a pair, the exn:application:mismatch exception is raised.
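
As a sketch of these rules, the expressions below succeed because the search or index stops before the improper tail is reached; the commented-out forms are the ones the text says raise exn:application:mismatch.

```scheme
(list-ref '(1 2 . 3) 1)               ; => 2
(list-tail '(1 2 . 3) 2)              ; => 3
(memq 'b '(a b . c))                  ; => '(b . c)
(assq 'b '((a . 1) (b . 2) . junk))   ; => '(b . 2)

;; These reach the improper tail or a non-pair element:
;; (list-ref '(1 2 . 3) 2)
;; (memq 'z '(a b . c))
;; (assq 'z '((a . 1) not-a-pair))
```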

3.9 Boxes

MzScheme provides boxes, records with a single mutable field:

(box v) returns a new box that contains v.
(unbox box) returns the content of box. For any v, (unbox (box v)) returns v.
(set-box! box v) sets the content of box to v.
(box? v) returns #t if v is a box, #f otherwise.

Two boxes are equal? if the contents of the boxes are equal?.

A box returned by syntax-e (see section 12.2.2) is immutable; if set-box! is applied to such a box, the exn:application:type exception is raised. A box produced by read (via #&) is mutable. (See also immutable? in section 3.8.)
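
A minimal sketch of the box operations, with b as a placeholder name:

```scheme
(define b (box 1))
(box? b)                      ; => #t
(unbox b)                     ; => 1
(set-box! b 2)
(unbox b)                     ; => 2
(equal? (box "x") (box "x"))  ; => #t, the contents are equal?
(eq? (box "x") (box "x"))     ; => #f, two distinct boxes
```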

3.10 Procedures

See section 4.4 for information on defining new procedure types.

3.10.1 Arity

MzScheme's procedure-arity procedure returns the input arity of a procedure:

(procedure-arity proc) returns information about the number of arguments accepted by the procedure proc. The result a is either:
- an exact non-negative integer ==> the procedure always takes exactly a arguments;
- an arity-at-least⁵ instance ==> the procedure takes (arity-at-least-value a) or more arguments; or
- a list containing integers and arity-at-least instances ==> the procedure takes any number of arguments that can match one of the arities in the list.
(procedure-arity-includes? proc k) returns #t if the procedure can accept k arguments (where k is an exact, non-negative integer), #f otherwise.

Examples:

(procedure-arity cons) ; => 2 
(procedure-arity list) ; => #<struct:arity-at-least> 
(arity-at-least? (procedure-arity list)) ; => #t 
(arity-at-least-value (procedure-arity list)) ; => 0 
(arity-at-least-value (procedure-arity (lambda (x . y) x))) ; => 1 
(procedure-arity (case-lambda [(x) 0] [(x y) 1])) ; => '(1 2) 
(procedure-arity-includes? cons 2) ; => #t 
(procedure-arity-includes? display 3) ; => #f
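
One possible use of procedure-arity-includes? is to check an argument count before applying a procedure. The helper apply-if-accepts is hypothetical and is shown only as a sketch.

```scheme
;; apply-if-accepts (hypothetical helper): applies proc to args only if
;; proc accepts that many arguments; otherwise returns #f.
(define (apply-if-accepts proc args)
  (if (procedure-arity-includes? proc (length args))
      (apply proc args)
      #f))

(apply-if-accepts cons '(1 2))  ; => '(1 . 2)
(apply-if-accepts cons '(1))    ; => #f
(apply-if-accepts list '())     ; => '()
```

Returning #f on a mismatch is a design choice for this sketch; a caller that must distinguish a genuine #f result would need a different failure signal.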

When compiling a lambda or case-lambda expression, MzScheme looks for a 'method-arity-error property attached to the expression (see section 12.6.2). If it is present with a true value, and if no case of the procedure accepts zero arguments, then the procedure is marked so that an exn:application:arity exception involving the procedure will hide the first argument, if one was provided. (Hiding the first argument is useful when the procedure implements a method, where the first argument is implicit in the original source.) The property affects only the format of exn:application:arity exceptions, not the result of procedure-arity.

3.10.2 Primitives

A primitive procedure is a built-in procedure that is implemented in a low-level language. Not all built-in procedures are primitives, but almost all R5RS procedures are primitives, as are most of the procedures described in this manual.

(primitive? v) returns #t if v is a primitive procedure or #f otherwise.
(primitive-result-arity prim-proc) returns the arity of the result of the primitive procedure prim-proc (as opposed to the procedure's input arity as returned by procedure-arity; see section 3.10.1). For most primitives, this procedure returns 1, since most primitives return a single value when applied. For information about arity values, see section 3.10.1.
(primitive-closure? v) returns #t if v is internally implemented as a primitive closure rather than a simple primitive procedure, #f otherwise. This information is intended for use by the mzc compiler.
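
A short sketch, assuming that car is implemented as a primitive (the text above says only that almost all R5RS procedures are):

```scheme
(primitive? car)              ; => #t, under the assumption stated above
(primitive? (lambda (x) x))   ; => #f, a closure created by lambda
(primitive? 'car)             ; => #f, not a procedure at all
(primitive-result-arity car)  ; => 1, car returns a single value
```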

3.10.3 Procedure Names

See section 6.2.4 for information about the names of primitives, and the names inferred for lambda and case-lambda procedures.

3.11 Promises

The force procedure can only be applied to values returned by delay, and promises are never implicitly forced.

(promise? v) returns #t if v is a promise created by delay, #f otherwise.
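
A minimal sketch, with p as a placeholder name:

```scheme
(define p (delay (+ 1 2)))
(promise? p)        ; => #t
(promise? (+ 1 2))  ; => #f, an ordinary number
(force p)           ; => 3
(force p)           ; => 3, the value computed by the first force
```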

3.12 Hash Tables

(make-hash-table [flag-symbol flag-symbol]) creates and returns a new hash table. If provided, each flag-symbol must be one of the following:

'weak -- creates a hash table with weakly-held keys (see section 13.1).
'equal -- creates a hash table that compares keys using equal? instead of eq? (needed, for example, when using strings as keys).

By default, key comparisons use eq?. If the second flag-symbol is redundant, the exn:application:mismatch exception is raised.

(hash-table? v) returns #t if v was created by make-hash-table, #f otherwise.

(hash-table-put! hash-table key-v v) maps key-v to v in hash-table, overwriting any existing mapping for key-v.

(hash-table-get hash-table key-v [failure-thunk]) returns the value for key-v in hash-table. If no value is found for key-v, then the result of invoking failure-thunk (a procedure of no arguments) is returned. If failure-thunk is not provided, the exn:application:mismatch exception is raised when no value is found for key-v.

(hash-table-remove! hash-table key-v) removes the value mapping for key-v if it exists in hash-table.

(hash-table-map hash-table proc) applies the procedure proc to each element in hash-table, accumulating the results into a list. The procedure proc must take two arguments: a key and its value. See the caveat below about concurrent access.

(hash-table-for-each hash-table proc) applies the procedure proc to each element in hash-table (for the side-effects of proc) and returns void. The procedure proc must take two arguments: a key and its value. See the caveat below about concurrent access.

(eq-hash-code v) returns a number; for any two eq? values, the returned number is always the same. The number is an exact integer that is itself guaranteed to be eq? with any value representing the same exact integer (i.e., it is a fixnum).

(equal-hash-code v) returns a number; for any two equal? values, the returned number is always the same. The number is an exact integer that is itself guaranteed to be eq? with any value representing the same exact integer (i.e., it is a fixnum). If v contains a cycle consisting of pairs, vectors, boxes, and fully-inspectable structures, then equal-hash-code applied to v will loop indefinitely.
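
The sketch below combines the procedures above on an equal?-based table with string keys. The variable ht and the keys are placeholders.

```scheme
(define ht (make-hash-table 'equal))
(hash-table-put! ht "apple" 1)
(hash-table-put! ht "pear" 2)
(hash-table-put! ht "apple" 10)                ; overwrites the first mapping

(hash-table-get ht "apple")                    ; => 10
(hash-table-get ht "plum" (lambda () 0))       ; => 0, from the failure thunk
(hash-table-remove! ht "pear")
(hash-table-map ht (lambda (k v) (list k v)))  ; => '(("apple" 10))

;; Two equal? strings that are not eq? still hash alike.
(= (equal-hash-code "apple")
   (equal-hash-code (string #\a #\p #\p #\l #\e)))  ; => #t
```

With the default eq? comparison, the lookups on "apple" would not be guaranteed to find the entry, since each string literal may be a distinct object.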

Caveat concerning concurrent access: A hash table can be manipulated with hash-table-get, hash-table-put!, and hash-table-remove! concurrently by multiple threads, and the operations are protected by a table-specific semaphore as needed. A few caveats apply, however:

If a thread is terminated while applying hash-table-get, hash-table-put!, or hash-table-remove! to a hash table that uses equal? comparisons, all current and future operations on the hash table block indefinitely.
The hash-table-map and hash-table-for-each procedures do not use the table's semaphore. Consequently, if a hash table is modified by another thread while a map or for-each is in progress, arbitrary key-value pairs can be dropped or duplicated in the map or for-each.
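
One way a program might avoid the second problem is to serialize updates and iteration through a lock of its own. The sketch below is an assumption about program structure, not a facility of the hash table itself; ht-lock, with-ht-lock, locked-put!, and locked-for-each are hypothetical names.

```scheme
(define ht (make-hash-table))
(define ht-lock (make-semaphore 1))   ; user-level lock, initially available

;; Run thunk while holding ht-lock, releasing the lock on any exit.
(define (with-ht-lock thunk)
  (dynamic-wind
   (lambda () (semaphore-wait ht-lock))
   thunk
   (lambda () (semaphore-post ht-lock))))

;; Every thread that updates or iterates over ht must use these wrappers.
(define (locked-put! key val)
  (with-ht-lock (lambda () (hash-table-put! ht key val))))

(define (locked-for-each proc)
  (with-ht-lock (lambda () (hash-table-for-each ht proc))))

(locked-put! 'a 1)
(locked-for-each (lambda (k v) (display k)))  ; prints a
```

The procedure passed to locked-for-each must not call locked-put! itself, since the lock is already held and the call would block forever.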

Caveat concerning mutable keys: If a key into an equal?-based hash table is mutated (e.g., a key string is modified with string-set!), then the behavior of the hash table's put and get operations becomes unpredictable.

² 30 bits for a 32-bit architecture, 62 bits for a 64-bit architecture.

³ This definition of eqv? technically contradicts R5RS, but R5RS does not address strange ``numbers'' like +nan.0.

⁴ The random number generator uses a relatively standard Unix random() implementation in its degree-seven polynomial mode.

⁵ All fields of the arity-at-least structure type are accessible by all inspectors (see section 4.6).