\input texinfo @c -*-texinfo-*- @c %**start of header @setfilename r5rs.info @settitle Revised(5) Scheme @c \documentclass[twoside]{algol60} @c \pagestyle{headings} @c \showboxdepth=0 @c \def\headertitle{Revised$^{5}$ Scheme} @c \def\integerversion{5} @c Sizes and dimensions @c \topmargin -.375in % Nominal distance from top of page to top of @c box containing running head. @c \headsep 15pt % Space between running head and text. @c \textheight 663pt % Height of text (including footnotes and figures, @c excluding running head and foot). @c \textwidth 523pt % Width of text line. @c \columnsep 15pt % Space between columns @c \columnseprule 0pt % Width of rule between columns. @c \parskip 5pt plus 2pt minus 2pt % Extra vertical space between paragraphs. @c \parindent 0pt % Width of paragraph indentation. @c \topsep 0pt plus 2pt % Extra vertical space, in addition to @c \parskip, added above and below list and @c paragraphing environments. @c \oddsidemargin -.5in % Left margin on odd-numbered pages. @c \evensidemargin -.5in % Left margin on even-numbered pages. @c % End of sizes and dimensions @paragraphindent 0 @c %**end of header @c syncodeindex fn cp @ifinfo @dircategory The Algorithmic Language Scheme @direntry * R5RS: (r5rs). The Revised(5) Report on Scheme. @end direntry @end ifinfo @c \parindent 0pt %!! 15pt % Width of paragraph indentation. @b{20 February 1998} @c \hfil \today{} @c @include{first} @titlepage @c HTML first page @title Scheme @subtitle Revised(5) Report on the Algorithmic Language Scheme @c First page @c \thispagestyle{empty} @c \todo{"another" report?} @author R@sc{ICHARD} K@sc{ELSEY}, W@sc{ILLIAM} C@sc{LINGER, AND} J@sc{ONATHAN} R@sc{EES} (@i{Editors}) @author H. A@sc{BELSON} @author R. K. D@sc{YBVIG} @author C. T. H@sc{AYNES} @author G. J. R@sc{OZAS} @author N. I. A@sc{DAMS IV} @author D. P. F@sc{RIEDMAN} @author E. K@sc{OHLBECKER} @author G. L. S@sc{TEELE} J@sc{R}. @author D. H. B@sc{ARTLEY} @author R. H@sc{ALSTEAD} @author D. 
O@sc{XLEY} @author G. J. S@sc{USSMAN} @author G. B@sc{ROOKS} @author C. H@sc{ANSON} @author K. M. P@sc{ITMAN} @author M. W@sc{AND} @author @c {\it Dedicated to the Memory of ALGOL 60} @i{Dedicated to the Memory of Robert Hieb} @c [For the macros in R5RS -RK] @unnumbered Summary The report gives a defining description of the programming language Scheme. Scheme is a statically scoped and properly tail-recursive dialect of the Lisp programming language invented by Guy Lewis Steele Jr.@: and Gerald Jay Sussman. It was designed to have an exceptionally clear and simple semantics and few different ways to form expressions. A wide variety of programming paradigms, including imperative, functional, and message passing styles, find convenient expression in Scheme. The introduction offers a brief history of the language and of the report. The first three chapters present the fundamental ideas of the language and describe the notational conventions used for describing the language and for writing programs in the language. Chapters @ref{Expressions} and @ref{Program structure} describe the syntax and semantics of expressions, programs, and definitions. Chapter @ref{Standard procedures} describes Scheme's built-in procedures, which include all of the language's data manipulation and input/output primitives. Chapter @ref{Formal syntax and semantics} provides a formal syntax for Scheme written in extended BNF, along with a formal denotational semantics. An example of the use of the language follows the formal syntax and semantics. The report concludes with a list of references and an alphabetic index. @ignore todo expand the summary so that it fills up the column. 
@end ignore @c \vfill @c \begin{center} @c {\large \bf @c *** DRAFT*** \\ @c %August 31, 1989 @c \today @c }\end{center} @c \addvspace{3.5pt} % don't shrink this gap @c \renewcommand{\tocshrink}{-3.5pt} % value determined experimentally @page @end titlepage @c INFO first page @ifinfo @c First page @c \thispagestyle{empty} @c \todo{"another" report?} @node top, Introduction, (dir), (dir) @top Revised(5) Report on the Algorithmic Language Scheme @sp 1 @center @c begin-tabular @quotation @multitable @columnfractions 0.25 0.25 0.25 0.25 @item @center R@sc{ICHARD} K@sc{ELSEY}, W@sc{ILLIAM} C@sc{LINGER, AND} J@sc{ONATHAN} R@sc{EES} (@i{Editors}) @item H. A@sc{BELSON} @tab R. K. D@sc{YBVIG} @tab C. T. H@sc{AYNES} @tab G. J. R@sc{OZAS} @item N. I. A@sc{DAMS IV} @tab D. P. F@sc{RIEDMAN} @tab E. K@sc{OHLBECKER} @tab G. L. S@sc{TEELE} J@sc{R}. @item D. H. B@sc{ARTLEY} @tab R. H@sc{ALSTEAD} @tab D. O@sc{XLEY} @tab G. J. S@sc{USSMAN} @item G. B@sc{ROOKS} @tab C. H@sc{ANSON} @tab K. M. P@sc{ITMAN} @tab M. W@sc{AND} @item @end multitable @end quotation @sp 2 @c {\it Dedicated to the Memory of ALGOL 60} @i{Dedicated to the Memory of Robert Hieb} @c [For the macros in R5RS -RK] @sp 3 @majorheading Summary The report gives a defining description of the programming language Scheme. Scheme is a statically scoped and properly tail-recursive dialect of the Lisp programming language invented by Guy Lewis Steele Jr.@: and Gerald Jay Sussman. It was designed to have an exceptionally clear and simple semantics and few different ways to form expressions. A wide variety of programming paradigms, including imperative, functional, and message passing styles, find convenient expression in Scheme. The introduction offers a brief history of the language and of the report. The first three chapters present the fundamental ideas of the language and describe the notational conventions used for describing the language and for writing programs in the language. 
Chapters @ref{Expressions} and @ref{Program structure} describe the syntax and semantics of expressions, programs, and definitions. Chapter @ref{Standard procedures} describes Scheme's built-in procedures, which include all of the language's data manipulation and input/output primitives. Chapter @ref{Formal syntax and semantics} provides a formal syntax for Scheme written in extended BNF, along with a formal denotational semantics. An example of the use of the language follows the formal syntax and semantics. The report concludes with a list of references and an alphabetic index. @ignore todo expand the summary so that it fills up the column. @end ignore @c \vfill @c \begin{center} @c {\large \bf @c *** DRAFT*** \\ @c %August 31, 1989 @c \today @c }\end{center} @c \addvspace{3.5pt} % don't shrink this gap @c \renewcommand{\tocshrink}{-3.5pt} % value determined experimentally @unnumbered Contents @menu * Introduction:: * Overview of Scheme:: * Lexical conventions:: * Basic concepts:: * Expressions:: * Program structure:: * Standard procedures:: * Formal syntax and semantics:: * Notes:: * Additional material:: * Example:: * Bibliography:: * Index:: @end menu @page @end ifinfo @c @include{intro} @node Introduction, Overview of Scheme, top, top @unnumbered Introduction @menu * Background:: * Acknowledgements:: @end menu Programming languages should be designed not by piling feature on top of feature, but by removing the weaknesses and restrictions that make additional features appear necessary. Scheme demonstrates that a very small number of rules for forming expressions, with no restrictions on how they are composed, suffice to form a practical and efficient programming language that is flexible enough to support most of the major programming paradigms in use today. @c Scheme has influenced the evolution of Lisp. 
Scheme was one of the first programming languages to incorporate first class procedures as in the lambda calculus, thereby proving the usefulness of static scope rules and block structure in a dynamically typed language. Scheme was the first major dialect of Lisp to distinguish procedures from lambda expressions and symbols, to use a single lexical environment for all variables, and to evaluate the operator position of a procedure call in the same way as an operand position. By relying entirely on procedure calls to express iteration, Scheme emphasized the fact that tail-recursive procedure calls are essentially goto's that pass arguments. Scheme was the first widely used programming language to embrace first class escape procedures, from which all previously known sequential control structures can be synthesized. A subsequent version of Scheme introduced the concept of exact and inexact numbers, an extension of Common Lisp's generic arithmetic. More recently, Scheme became the first programming language to support hygienic macros, which permit the syntax of a block-structured language to be extended in a consistent and reliable manner. @c A few @c of these innovations have recently been incorporated into Common Lisp, while @c others remain to be adopted. @ignore todo Ramsdell: I would like to make a few comments on presentation. The most important comment is about section organization. Newspaper writers spend most of their time writing the first three paragraphs of any article. This part of the article is often the only part read by readers, and is important in enticing readers to continue. In the same way, The first page is most likely to be the only page read by many SIGPLAN readers. If I had my choice of what I would ask them to read, it would be the material in section 1.1, the Semantics section that notes that scheme is lexically scoped, tail recursive, weakly typed, ... etc. 
I would expand on the discussion on continuations, as they represent one important difference between Scheme and other languages. The introduction, with its history of scheme, its history of scheme reports and meetings, and acknowledgements giving names of people that the reader will not likely know, is not that one page I would like all to read. I suggest moving the history to the back of the report, and use the first couple of pages to convince the reader that the language documented in this report is worth studying. @end ignore @node Background, Acknowledgements, Introduction, Introduction @unnumberedsec Background The first description of Scheme was written in 1975 [Scheme75]. A revised report [Scheme78] @ignore todo italicize or not? @end ignore appeared in 1978, which described the evolution of the language as its MIT implementation was upgraded to support an innovative compiler [Rabbit]. Three distinct projects began in 1981 and 1982 to use variants of Scheme for courses at MIT, Yale, and Indiana University [Rees82], [MITScheme], [Scheme311]. An introductory computer science textbook using Scheme was published in 1984 [SICP]. @c \vest As might be expected of a language used primarily for education and @c research, Scheme has always evolved rapidly. This was no problem when @c Scheme was used only within MIT, but As Scheme became more widespread, local dialects began to diverge until students and researchers occasionally found it difficult to understand code written at other sites. Fifteen representatives of the major implementations of Scheme therefore met in October 1984 to work toward a better and more widely accepted standard for Scheme. @c Participating in this workshop were Hal Abelson, Norman Adams, David @c Bartley, Gary Brooks, William Clinger, Daniel Friedman, Robert Halstead, @c Chris Hanson, Christopher Haynes, Eugene Kohlbecker, Don Oxley, Jonathan Rees, @c Guillermo Rozas, Gerald Jay Sussman, and Mitchell Wand. 
Kent Pitman @c made valuable contributions to the agenda for the workshop but was @c unable to attend the sessions. @c Subsequent electronic mail discussions and committee work completed the @c definition of the language. @c Gerry Sussman drafted the section on numbers, Chris Hanson drafted the @c sections on characters and strings, and Gary Brooks and William Clinger @c drafted the sections on input and output. @c William Clinger recorded the decisions of the workshop and @c compiled the pieces into a coherent document. @c The ``Revised revised report on Scheme''~\cite{RRRS} Their report [RRRS] was published at MIT and Indiana University in the summer of 1985. Further revision took place in the spring of 1986 [R3RS], @c , again accomplished @c almost entirely by electronic mail, resulted in the present report. and in the spring of 1988 [R4RS]. The present report reflects further revisions agreed upon in a meeting at Xerox PARC in June 1992. @c \vest The number 3 in the title is part of the title, not a reference to @c a footnote. The word ``revised'' is raised to the third power because @c the report is a revision of a report that was already twice revised. @ignore todo Write an editors' note? @end ignore @sp 3 We intend this report to belong to the entire Scheme community, and so we grant permission to copy it in whole or in part without fee. In particular, we encourage implementors of Scheme to use this report as a starting point for manuals and other documentation, modifying it as necessary. 
@node Acknowledgements, , Background, Introduction
@unnumberedsec Acknowledgements

We would like to thank the following people for their help: Alan Bawden, Michael Blair, George Carrette, Andy Cromarty, Pavel Curtis, Jeff Dalton, Olivier Danvy, Ken Dickey, Bruce Duba, Marc Feeley, Andy Freeman, Richard Gabriel, Yekta G@"ursel, Ken Haase, Robert Hieb, Paul Hudak, Morry Katz, Chris Lindblad, Mark Meyer, Jim Miller, Jim Philbin, John Ramsdell, Mike Shaff, Jonathan Shapiro, Julie Sussman, Perry Wagle, Daniel Weise, Henry Wu, and Ozan Yigit.
We thank Carol Fessenden, Daniel Friedman, and Christopher Haynes for permission to use text from the Scheme 311 version 4 reference manual.
We thank Texas Instruments, Inc. for permission to use text from the @emph{TI Scheme Language Reference Manual} [TImanual85].
We gladly acknowledge the influence of manuals for MIT Scheme [MITScheme], T [Rees84], Scheme 84 [Scheme84], Common Lisp [CLtL], and Algol 60 [Naur63].

We also thank Betty Dexter for the extreme effort she put into setting this report in @TeX{}, and Donald Knuth for designing the program that caused her troubles.

The Artificial Intelligence Laboratory of the Massachusetts Institute of Technology, the Computer Science Department of Indiana University, the Computer and Information Sciences Department of the University of Oregon, and the NEC Research Institute supported the preparation of this report.
Support for the MIT work was provided in part by the Advanced Research Projects Agency of the Department of Defense under Office of Naval Research contract N00014-80-C-0505.
Support for the Indiana University work was provided by NSF grants NCS 83-04567 and NCS 83-03325.

@sp 2

@c \clearchapterstar{Description of the language} %\unskip\vskip -2ex
@c @include{struct}
@c 1.
Structure of the language @node Overview of Scheme, Lexical conventions, Introduction, top @chapter Overview of Scheme @menu * Semantics:: * Syntax:: * Notation and terminology:: @end menu @node Semantics, Syntax, Overview of Scheme, Overview of Scheme @section Semantics This section gives an overview of Scheme's semantics. A detailed informal semantics is the subject of chapters @ref{Basic concepts} through @ref{Standard procedures}. For reference purposes, section @ref{Formal semantics} provides a formal semantics of Scheme. Following Algol, Scheme is a statically scoped programming language. Each use of a variable is associated with a lexically apparent binding of that variable. Scheme has latent as opposed to manifest types. Types are associated with values (also called objects) rather than @cindex @w{object} with variables. (Some authors refer to languages with latent types as weakly typed or dynamically typed languages.) Other languages with latent types are APL, Snobol, and other dialects of Lisp. Languages with manifest types (sometimes referred to as strongly typed or statically typed languages) include Algol 60, Pascal, and C. All objects created in the course of a Scheme computation, including procedures and continuations, have unlimited extent. No Scheme object is ever destroyed. The reason that implementations of Scheme do not (usually!) run out of storage is that they are permitted to reclaim the storage occupied by an object if they can prove that the object cannot possibly matter to any future computation. Other languages in which most objects have unlimited extent include APL and other Lisp dialects. Implementations of Scheme are required to be properly tail-recursive. This allows the execution of an iterative computation in constant space, even if the iterative computation is described by a syntactically recursive procedure. 
Thus with a properly tail-recursive implementation, iteration can be expressed using the ordinary procedure-call mechanics, so that special iteration constructs are useful only as syntactic sugar. See section @ref{Proper tail recursion}. Scheme procedures are objects in their own right. Procedures can be created dynamically, stored in data structures, returned as results of procedures, and so on. Other languages with these properties include Common Lisp and ML. @ignore todo Rozas: Scheme had them first. @end ignore One distinguishing feature of Scheme is that continuations, which in most other languages only operate behind the scenes, also have ``first-class'' status. Continuations are useful for implementing a wide variety of advanced control constructs, including non-local exits, backtracking, and coroutines. See section @ref{Control features}. Arguments to Scheme procedures are always passed by value, which means that the actual argument expressions are evaluated before the procedure gains control, whether the procedure needs the result of the evaluation or not. ML, C, and APL are three other languages that always pass arguments by value. This is distinct from the lazy-evaluation semantics of Haskell, or the call-by-name semantics of Algol 60, where an argument expression is not evaluated unless its value is needed by the procedure. @ignore todo Lisp's call by value should be explained more accurately. What's funny is that all values are references. @end ignore Scheme's model of arithmetic is designed to remain as independent as possible of the particular ways in which numbers are represented within a computer. In Scheme, every integer is a rational number, every rational is a real, and every real is a complex number. Thus the distinction between integer and real arithmetic, so important to many programming languages, does not appear in Scheme. 
In its place is a distinction between exact arithmetic, which corresponds to the mathematical ideal, and inexact arithmetic on approximations. As in Common Lisp, exact arithmetic is not limited to integers.

@node Syntax, Notation and terminology, Semantics, Overview of Scheme
@section Syntax

Scheme, like most dialects of Lisp, employs a fully parenthesized prefix notation for programs and (other) data; the grammar of Scheme generates a sublanguage of the language used for data. An important consequence of this simple, uniform representation is the susceptibility of Scheme programs and data to uniform treatment by other Scheme programs. For example, the @samp{eval} procedure evaluates a Scheme program expressed as data.

The @samp{read} procedure performs syntactic as well as lexical decomposition of the data it reads. The @samp{read} procedure parses its input as data (section @pxref{External representations}), not as program.

The formal syntax of Scheme is described in section @ref{Formal syntax}.

@node Notation and terminology, , Syntax, Overview of Scheme
@section Notation and terminology

@menu
* Primitive; library; and optional features::
* Error situations and unspecified behavior::
* Entry format::
* Evaluation examples::
* Naming conventions::
@end menu

@node Primitive; library; and optional features, Error situations and unspecified behavior, Notation and terminology, Notation and terminology
@subsection Primitive; library; and optional features

It is required that every implementation of Scheme support all features that are not marked as being @dfn{optional}. Implementations are
@cindex @w{optional}
free to omit optional features of Scheme or to add extensions, provided the extensions are not in conflict with the language reported here. In particular, implementations must support portable code by providing a syntactic mode that preempts no lexical conventions of this report.

To aid in understanding and implementing Scheme, some features are marked as @dfn{library}. These can be easily implemented in terms of the other, @cindex @w{library} primitive, features. They are redundant in the strict sense of the word, but they capture common patterns of usage, and are therefore provided as convenient abbreviations. @node Error situations and unspecified behavior, Entry format, Primitive; library; and optional features, Notation and terminology @subsection Error situations and unspecified behavior @cindex @w{error} When speaking of an error situation, this report uses the phrase ``an error is signalled'' to indicate that implementations must detect and report the error. If such wording does not appear in the discussion of an error, then implementations are not required to detect or report the error, though they are encouraged to do so. An error situation that implementations are not required to detect is usually referred to simply as ``an error.'' For example, it is an error for a procedure to be passed an argument that the procedure is not explicitly specified to handle, even though such domain errors are seldom mentioned in this report. Implementations may extend a procedure's domain of definition to include such arguments. This report uses the phrase ``may report a violation of an implementation restriction'' to indicate circumstances under which an implementation is permitted to report that it is unable to continue execution of a correct program because of some restriction imposed by the implementation. Implementation restrictions are of course discouraged, but implementations are encouraged to report violations of implementation restrictions. @cindex @w{implementation restriction} For example, an implementation may report a violation of an implementation restriction if it does not have enough storage to run a program. 
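
As a small illustration of the two phrasings (a sketch only: the file name is hypothetical, and what an implementation does with the first expression is not prescribed), taking the car of the empty list is ``an error,'' whereas failing to open a file is a case in which ``an error is signalled'':

@example

(car '())                         ; an error; detection not required
(open-input-file "no-such-file")  ; an error is signalled if the
                                  ; file cannot be opened

@end example
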
If the value of an expression is said to be ``unspecified,'' then the expression must evaluate to some object without signalling an error, but the value depends on the implementation; this report explicitly does not say what value should be returned.
@cindex @w{unspecified}

@ignore todo
Talk about unspecified behavior vs. unspecified values.
@end ignore

@ignore todo
Look at KMP's situations paper.
@end ignore

@node Entry format, Evaluation examples, Error situations and unspecified behavior, Notation and terminology
@subsection Entry format

Chapters @ref{Expressions} and @ref{Standard procedures} are organized into entries. Each entry describes one language feature or a group of related features, where a feature is either a syntactic construct or a built-in procedure. An entry begins with one or more header lines of the form

@noindent
@deffn {@var{category}} @var{template}

@end deffn

for required, primitive features, or

@noindent
@deffn {@var{qualifier} @var{category}} @var{template}

@end deffn

where @var{qualifier} is either ``library'' or ``optional'' as defined in section @ref{Primitive; library; and optional features}.

If @var{category} is ``syntax'', the entry describes an expression type, and the template gives the syntax of the expression type. Components of expressions are designated by syntactic variables, which are written using angle brackets, for example, @r{<expression>}, @r{<variable>}. Syntactic variables should be understood to denote segments of program text; for example, @r{<expression>} stands for any string of characters which is a syntactically valid expression. The notation

@format
 @r{<thing1>} @dots{}
@end format

indicates zero or more occurrences of a @r{<thing>}, and

@format
 @r{<thing1>} @r{<thing2>} @dots{}
@end format

indicates one or more occurrences of a @r{<thing>}.

If @var{category} is ``procedure'', then the entry describes a procedure, and the header line gives a template for a call to the procedure. Argument names in the template are @var{italicized}.
Thus the header line

@noindent
@deffn {procedure} (vector-ref @var{vector} @var{k})

@end deffn

indicates that the built-in procedure @t{vector-ref} takes two arguments, a vector @var{vector} and an exact non-negative integer @var{k} (see below). The header lines

@noindent
@deffn {procedure} (make-vector @var{k})
@deffnx {procedure} (make-vector @var{k} @var{fill})

@end deffn

indicate that the @t{make-vector} procedure must be defined to take either one or two arguments.

It is an error for an operation to be presented with an argument that it is not specified to handle. For succinctness, we follow the convention that if an argument name is also the name of a type listed in section @ref{Disjointness of types}, then that argument must be of the named type. For example, the header line for @t{vector-ref} given above dictates that the first argument to @t{vector-ref} must be a vector. The following naming conventions also imply type restrictions:
@c \newcommand{\foo}[1]{\vr{#1}, \vri{#1}, $\ldots$ \vrj{#1}, $\ldots$}

@center @c begin-tabular
@quotation
@table @asis
@item @var{obj}
any object
@item @var{list}, @var{list1}, @dots{} @var{listj}, @dots{}
list (see section @pxref{Pairs and lists})
@item @var{z}, @var{z1}, @dots{} @var{zj}, @dots{}
complex number
@item @var{x}, @var{x1}, @dots{} @var{xj}, @dots{}
real number
@item @var{y}, @var{y1}, @dots{} @var{yj}, @dots{}
real number
@item @var{q}, @var{q1}, @dots{} @var{qj}, @dots{}
rational number
@item @var{n}, @var{n1}, @dots{} @var{nj}, @dots{}
integer
@item @var{k}, @var{k1}, @dots{} @var{kj}, @dots{}
exact non-negative integer
@item
@end table
@end quotation

@ignore todo
Provide an example entry??
@end ignore @node Evaluation examples, Naming conventions, Entry format, Notation and terminology @subsection Evaluation examples The symbol ``@result{}'' used in program examples should be read ``evaluates to.'' For example, @example (* 5 8) ==> 40 @end example means that the expression @t{(* 5 8)} evaluates to the object @t{40}. Or, more precisely: the expression given by the sequence of characters ``@t{(* 5 8)}'' evaluates, in the initial environment, to an object that may be represented externally by the sequence of characters ``@t{40}''. See section @ref{External representations} for a discussion of external representations of objects. @node Naming conventions, , Evaluation examples, Notation and terminology @subsection Naming conventions By convention, the names of procedures that always return a boolean value usually end in ``@code{?}''. Such procedures are called predicates. @vindex @w{?} By convention, the names of procedures that store values into previously allocated locations (see section @pxref{Storage model}) usually end in ``@code{!}''. @vindex @w{!} Such procedures are called mutation procedures. By convention, the value returned by a mutation procedure is unspecified. By convention, ``@code{->}'' appears within the names of procedures that @vindex @w{->} take an object of one type and return an analogous object of another type. For example, @samp{list->vector} takes a list and returns a vector whose elements are the same as those of the list. @ignore todo Terms that need defining: thunk, command (what else?). @end ignore @c @include{lex} @c Lexical structure @c %\vfill\eject @node Lexical conventions, Basic concepts, Overview of Scheme, top @chapter Lexical conventions @menu * Identifiers:: * Whitespace and comments:: * Other notations:: @end menu This section gives an informal account of some of the lexical conventions used in writing Scheme programs. For a formal syntax of Scheme, see section @ref{Formal syntax}. 
Upper and lower case forms of a letter are never distinguished except within character and string constants. For example, @samp{Foo} is the same identifier as @samp{FOO}, and @t{#x1AB} is the same number as @t{#X1ab}. @node Identifiers, Whitespace and comments, Lexical conventions, Lexical conventions @section Identifiers Most identifiers allowed by other programming @cindex @w{identifier} languages are also acceptable to Scheme. The precise rules for forming identifiers vary among implementations of Scheme, but in all implementations a sequence of letters, digits, and ``extended alphabetic characters'' that begins with a character that cannot begin a number is an identifier. In addition, @code{+}, @code{-}, and @code{...} are identifiers. @vindex @w{...} @vindex @w{-} @vindex @w{+} Here are some examples of identifiers: @example lambda q list->vector soup + V17a <=? a34kTMNs the-word-recursion-has-many-meanings @end example Extended alphabetic characters may be used within identifiers as if they were letters. The following are extended alphabetic characters: @example ! $ % & * + - . / : < = > ? @@ ^ _ ~ @end example See section @ref{Lexical structure} for a formal syntax of identifiers. Identifiers have two uses within Scheme programs: @itemize @bullet @item Any identifier may be used as a variable or as a syntactic keyword (see sections @pxref{Variables; syntactic keywords; and regions} and @pxref{Macros}). @item When an identifier appears as a literal or within a literal (see section @pxref{Literal expressions}), it is being used to denote a @emph{symbol} (see section @pxref{Symbols}). @end itemize @cindex @w{syntactic keyword} @cindex @w{variable} @c \label{keywordsection} @c The following identifiers are syntactic keywords, and should not be used @c as variables: @c \begin{scheme} @c => do or @c and else quasiquote @c begin if quote @c case lambda set! 
@c cond       let        unquote
@c define     let*       unquote-splicing
@c delay      letrec%
@c \end{scheme}
@c Some implementations allow all identifiers, including syntactic
@c keywords, to be used as variables.  This is a compatible extension to
@c the language, but ambiguities in the language result when the
@c restriction is relaxed, and the ways in which these ambiguities are
@c resolved vary between implementations.

@node Whitespace and comments, Other notations, Identifiers, Lexical conventions
@section Whitespace and comments

@dfn{Whitespace} characters are spaces and newlines.
@cindex @w{Whitespace}
(Implementations typically provide additional whitespace characters such as tab or page break.) Whitespace is used for improved readability and as necessary to separate tokens from each other, a token being an indivisible lexical unit such as an identifier or number, but is otherwise insignificant. Whitespace may occur between any two tokens, but not within a token. Whitespace may also occur inside a string, where it is significant.

A semicolon (@t{;}) indicates the start of a comment. The comment continues to the
@cindex @w{;}
@cindex @w{comment}
end of the line on which the semicolon appears. Comments are invisible to Scheme, but the end of the line is visible as whitespace. This prevents a comment from appearing in the middle of an identifier or number.

@example

;;; The FACT procedure computes the factorial
;;; of a non-negative integer.
(define fact
  (lambda (n)
    (if (= n 0)
        1        ;Base case: return 1
        (* n (fact (- n 1))))))

@end example

@node Other notations, , Whitespace and comments, Lexical conventions
@section Other notations

@ignore todo
Rewrite?
@end ignore

For a description of the notations used for numbers, see section @ref{Numbers}.

@table @t

@item @t{.@: + -}
These are used in numbers, and may also occur anywhere in an identifier except as the first character. A delimited plus or minus sign by itself is also an identifier.
A delimited period (not occurring within a number or identifier) is used in the notation for pairs (section @pxref{Pairs and lists}), and to indicate a rest-parameter in a formal parameter list (section @pxref{Procedures}). A delimited sequence of three successive periods is also an identifier. @item @t{( )} Parentheses are used for grouping and to notate lists (section @pxref{Pairs and lists}). @item @t{'} The single quote character is used to indicate literal data (section @pxref{Literal expressions}). @item @t{`} The backquote character is used to indicate almost-constant data (section @pxref{Quasiquotation}). @item @t{, ,@@} The character comma and the sequence comma at-sign are used in conjunction with backquote (section @pxref{Quasiquotation}). @item @t{"} The double quote character is used to delimit strings (section @pxref{Strings}). @item \ Backslash is used in the syntax for character constants (section @pxref{Characters}) and as an escape character within string constants (section @pxref{Strings}). @c A box used because \verb is not allowed in command arguments. @item @w{@t{[ ] @{ @} |}} Left and right square brackets and curly braces and vertical bar are reserved for possible future extensions to the language. @item # Sharp sign is used for a variety of purposes depending on the character that immediately follows it: @item @t{#t} @t{#f} These are the boolean constants (section @pxref{Booleans}). @item #\ This introduces a character constant (section @pxref{Characters}). @item #@t{(} This introduces a vector constant (section @pxref{Vectors}). Vector constants are terminated by @t{)} . @item @t{#e #i #b #o #d #x} These are used in the notation for numbers (section @pxref{Syntax of numerical constants}). 
@end table @c @include{basic} @c \vfill\eject @node Basic concepts, Expressions, Lexical conventions, top @chapter Basic concepts @menu * Variables; syntactic keywords; and regions:: * Disjointness of types:: * External representations:: * Storage model:: * Proper tail recursion:: @end menu @node Variables; syntactic keywords; and regions, Disjointness of types, Basic concepts, Basic concepts @section Variables; syntactic keywords; and regions An identifier may name a type of syntax, or it may name @cindex @w{identifier} a location where a value can be stored. An identifier that names a type of syntax is called a @emph{syntactic keyword} @cindex @w{syntactic keyword} and is said to be @emph{bound} to that syntax. An identifier that names a location is called a @emph{variable} and is said to be @cindex @w{variable} @emph{bound} to that location. The set of all visible bindings in effect at some point in a program is @cindex @w{binding} known as the @emph{environment} in effect at that point. The value stored in the location to which a variable is bound is called the variable's value. By abuse of terminology, the variable is sometimes said to name the value or to be bound to the value. This is not quite accurate, but confusion rarely results from this practice. @ignore todo Define ``assigned'' and ``unassigned'' perhaps? @end ignore @ignore todo In programs without side effects, one can safely pretend that the variables are bound directly to the arguments. Or: In programs without @code{set!}, one can safely pretend that the @vindex @w{set!} variable is bound directly to the value. @end ignore Certain expression types are used to create new kinds of syntax and bind syntactic keywords to those new syntaxes, while other expression types create new locations and bind variables to those locations. These expression types are called @emph{binding constructs}. @cindex @w{binding construct} Those that bind syntactic keywords are listed in section @ref{Macros}. 
The most fundamental of the variable binding constructs is the @samp{lambda} expression, because all other variable binding constructs can be explained in terms of @samp{lambda} expressions. The other variable binding constructs are @samp{let}, @samp{let*}, @samp{letrec}, and @samp{do} expressions (see sections @pxref{Procedures}, @pxref{Binding constructs}, and @pxref{Iteration}). @c Note: internal definitions not mentioned here. Like Algol and Pascal, and unlike most other dialects of Lisp except for Common Lisp, Scheme is a statically scoped language with block structure. To each place where an identifier is bound in a program there corresponds a @dfn{region} of the program text within which @cindex @w{region} the binding is visible. The region is determined by the particular binding construct that establishes the binding; if the binding is established by a @samp{lambda} expression, for example, then its region is the entire @samp{lambda} expression. Every mention of an identifier refers to the binding of the identifier that established the innermost of the regions containing the use. If there is no binding of the identifier whose region contains the use, then the use refers to the binding for the variable in the top level environment, if any (chapters @pxref{Expressions} and @pxref{Standard procedures}); if there is no binding for the identifier, it is said to be @dfn{unbound}. @cindex @w{top level environment} @cindex @w{bound} @cindex @w{unbound} @ignore todo Mention that some implementations have multiple top level environments? @end ignore @ignore todo Pitman sez: needs elaboration in case of @t{(let ...)} @end ignore @ignore todo Pitman asks: say something about vars created after scheme starts? @t{(define x 3) (define (f) x) (define (g) y) (define y 4)} Clinger replies: The language was explicitly designed to permit a view in which no variables are created after Scheme starts. In files, you can scan out the definitions beforehand. 
I think we're agreed on the principle that interactive use should approximate that behavior as closely as possible, though we don't yet agree on which programming environment provides the best approximation. @end ignore @node Disjointness of types, External representations, Variables; syntactic keywords; and regions, Basic concepts @section Disjointness of types No object satisfies more than one of the following predicates: @example boolean? pair? symbol? number? char? string? vector? port? procedure? @end example These predicates define the types @emph{boolean}, @emph{pair}, @emph{symbol}, @emph{number}, @emph{char} (or @emph{character}), @emph{string}, @emph{vector}, @emph{port}, and @emph{procedure}. The empty list is a special object of its own type; it satisfies none of the above predicates. @vindex symbol? @vindex pair? @vindex boolean? @cindex @w{type} @vindex vector? @vindex string? @vindex char? @vindex number? @cindex @w{empty list} @vindex procedure? @vindex port? Although there is a separate boolean type, any Scheme value can be used as a boolean value for the purpose of a conditional test. As explained in section @ref{Booleans}, all values count as true in such a test except for @t{#f}. @c and possibly the empty list. @c The only value that is guaranteed to count as @c false is \schfalse{}. It is explicitly unspecified whether the empty list @c counts as true or as false. This report uses the word ``true'' to refer to any Scheme value except @t{#f}, and the word ``false'' to refer to @t{#f}. @cindex @w{false} @cindex @w{true} @node External representations, Storage model, Disjointness of types, Basic concepts @section External representations An important concept in Scheme (and Lisp) is that of the @emph{external representation} of an object as a sequence of characters. 
For example, an external representation of the integer 28 is the sequence of characters ``@t{28}'', and an external representation of a list consisting of the integers 8 and 13 is the sequence of characters ``@t{(8 13)}''. The external representation of an object is not necessarily unique. The integer 28 also has representations ``@t{#e28.000}'' and ``@t{#x1c}'', and the list in the previous paragraph also has the representations ``@t{( 08 13 )}'' and ``@t{(8 .@: (13 .@: ()))}'' (see section @pxref{Pairs and lists}). Many objects have standard external representations, but some, such as procedures, do not have standard representations (although particular implementations may define representations for them). An external representation may be written in a program to obtain the corresponding object (see @samp{quote}, section @pxref{Literal expressions}). External representations can also be used for input and output. The procedure @samp{read} (section @pxref{Input}) parses external representations, and the procedure @samp{write} (section @pxref{Output}) generates them. Together, they provide an elegant and powerful input/output facility. Note that the sequence of characters ``@t{(+ 2 6)}'' is @emph{not} an external representation of the integer 8, even though it @emph{is} an expression evaluating to the integer 8; rather, it is an external representation of a three-element list, the elements of which are the symbol @t{+} and the integers 2 and 6. Scheme's syntax has the property that any sequence of characters that is an expression is also the external representation of some object. This can lead to confusion, since it may not be obvious out of context whether a given sequence of characters is intended to denote data or program, but it is also a source of power, since it facilitates writing programs such as interpreters and compilers that treat programs as data (or vice versa). 
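
The distinction between a datum and the expression it may also happen to
spell can be sketched as follows.  This is an informal illustration
only; the variable name @samp{datum} is introduced here for the example
and does not occur elsewhere in this report.

@example
;; DATUM names the three-element list whose external
;; representation is (+ 2 6); it does not name the integer 8.
(define datum '(+ 2 6))

(length datum)                          ; ==>  3
(car datum)                             ; ==>  +

;; Evaluating the list as a program yields the integer.
(eval datum (scheme-report-environment 5))   ; ==>  8

;; Different external representations of the same object.
(equal? '(8 13) '(8 . (13 . ())))       ; ==>  #t
(= 28 #e28.000 #x1c)                    ; ==>  #t
@end example
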
The syntax of external representations of various kinds of objects accompanies the description of the primitives for manipulating the objects in the appropriate sections of chapter @ref{Standard procedures}. @node Storage model, Proper tail recursion, External representations, Basic concepts @section Storage model Variables and objects such as pairs, vectors, and strings implicitly denote locations or sequences of locations. A string, for @cindex @w{location} example, denotes as many locations as there are characters in the string. (These locations need not correspond to a full machine word.) A new value may be stored into one of these locations using the @t{string-set!} procedure, but the string continues to denote the same locations as before. An object fetched from a location, by a variable reference or by a procedure such as @samp{car}, @samp{vector-ref}, or @samp{string-ref}, is equivalent in the sense of @code{eqv?} @c and \ide{eq?} ?? (section @pxref{Equivalence predicates}) @vindex @w{eqv?} to the object last stored in the location before the fetch. Every location is marked to show whether it is in use. No variable or object ever refers to a location that is not in use. Whenever this report speaks of storage being allocated for a variable or object, what is meant is that an appropriate number of locations are chosen from the set of locations that are not in use, and the chosen locations are marked to indicate that they are now in use before the variable or object is made to denote them. In many systems it is desirable for constants (i.e. the values of @cindex @w{constant} literal expressions) to reside in read-only-memory. To express this, it is convenient to imagine that every object that denotes locations is associated with a flag telling whether that object is mutable or @cindex @w{mutable} immutable. 
In such systems literal constants and the strings @cindex @w{immutable} returned by @code{symbol->string} are immutable objects, while all objects @vindex @w{symbol->string} created by the other procedures listed in this report are mutable. It is an error to attempt to store a new value into a location that is denoted by an immutable object. @node Proper tail recursion, , Storage model, Basic concepts @section Proper tail recursion Implementations of Scheme are required to be @emph{properly tail-recursive}. @cindex @w{proper tail recursion} Procedure calls that occur in certain syntactic contexts defined below are `tail calls'. A Scheme implementation is properly tail-recursive if it supports an unbounded number of active tail calls. A call is @emph{active} if the called procedure may still return. Note that this includes calls that may be returned from either by the current continuation or by continuations captured earlier by @samp{call-with-current-continuation} that are later invoked. In the absence of captured continuations, calls could return at most once and the active calls would be those that had not yet returned. A formal definition of proper tail recursion can be found in [propertailrecursion]. @quotation @emph{Rationale:} Intuitively, no space is needed for an active tail call because the continuation that is used in the tail call has the same semantics as the continuation passed to the procedure containing the call. Although an improper implementation might use a new continuation in the call, a return to this new continuation would be followed immediately by a return to the continuation passed to the procedure. A properly tail-recursive implementation returns to that continuation directly. Proper tail recursion was one of the central ideas in Steele and Sussman's original version of Scheme. Their first Scheme interpreter implemented both functions and actors. 
Control flow was expressed using actors, which differed from
functions in that they passed their results on to another actor
instead of returning to a caller.  In the terminology of this section,
each actor finished with a tail call to another actor.

Steele and Sussman later observed that in their interpreter the code
for dealing with actors was identical to that for functions and thus
there was no need to include both in the language.

@end quotation

A @emph{tail call} is a procedure call that occurs
@cindex @w{tail call}
in a @emph{tail context}.  Tail contexts are defined inductively.  Note
that a tail context is always determined with respect to a particular lambda
expression.

@itemize @bullet

@item
The last expression within the body of a lambda expression,
shown as @r{<tail expression>} below, occurs in a tail context.

@format
@t{(lambda <formals>
  <definition>* <expression>* <tail expression>)
}
@end format

@item
If one of the following expressions is in a tail context,
then the subexpressions shown as <tail expression> are in a tail context.
These were derived from rules in the grammar given in
chapter @ref{Formal syntax and semantics} by replacing some occurrences
of <expression> with <tail expression>.  Only those rules that contain
tail contexts are shown here.

@format
@t{(if <expression> <tail expression> <tail expression>)
(if <expression> <tail expression>)

(cond <cond clause>+)
(cond <cond clause>* (else <tail sequence>))

(case <expression>
  <case clause>+)
(case <expression>
  <case clause>*
  (else <tail sequence>))

(and <expression>* <tail expression>)
(or <expression>* <tail expression>)

(let (<binding spec>*) <tail body>)
(let <variable> (<binding spec>*) <tail body>)
(let* (<binding spec>*) <tail body>)
(letrec (<binding spec>*) <tail body>)

(let-syntax (<syntax spec>*) <tail body>)
(letrec-syntax (<syntax spec>*) <tail body>)

(begin <tail sequence>)

(do (<iteration spec>*)
    (<test> <tail sequence>)
  <expression>*)

@r{where}

<cond clause> --> (<test> <tail sequence>)
<case clause> --> ((<datum>*) <tail sequence>)

<tail body> --> <definition>* <tail sequence>
<tail sequence> --> <expression>* <tail expression>
}
@end format

@item
If a @samp{cond} expression is in a tail context, and has a clause of
the form @samp{(@r{<expression1>} => @r{<expression2>})}
then the (implied) call to
the procedure that results from the evaluation of @r{<expression2>} is in a
tail context.  @r{<expression2>} itself is not in a tail context.

@end itemize

Certain built-in procedures are also required to perform tail calls.
The first argument passed to @code{apply} and to
@vindex @w{apply}
@code{call-with-current-continuation}, and the second argument passed to
@vindex @w{call-with-current-continuation}
@code{call-with-values}, must be called via a tail call.
@vindex @w{call-with-values}
Similarly, @code{eval} must evaluate its argument as if it
@vindex @w{eval}
were in tail position within the @code{eval} procedure.
@vindex @w{eval}

In the following example the only tail call is the call to @samp{f}.
None of the calls to @samp{g} or @samp{h} are tail calls.  The reference to
@samp{x} is in a tail context, but it is not a call and thus is not a
tail call.

@example
(lambda ()
  (if (g)
      (let ((x (h)))
        x)
      (and (g) (f))))
@end example

@quotation
@emph{Note:}
Implementations are allowed, but not required, to
recognize that some non-tail calls, such as the call to @samp{h}
above, can be evaluated as though they were tail calls.
In the example above, the @samp{let} expression could be compiled
as a tail call to @samp{h}.  (The possibility of @samp{h} returning
an unexpected number of values can be ignored, because in that
case the effect of the @samp{let} is explicitly unspecified and
implementation-dependent.)
@end quotation

@c @include{expr}
@c \vfill\eject
@node Expressions, Program structure, Basic concepts, top
@chapter Expressions

@menu
* Primitive expression types::
* Derived expression types::
* Macros::
@end menu

@c \newcommand{\syntax}{{\em Syntax: }}
@c \newcommand{\semantics}{{\em Semantics: }}
@c [Deleted for R5RS because of multiple-value returns. -RK]
@c A Scheme expression is a construct that returns a value, such as a
@c variable reference, literal, procedure call, or conditional.

Expression types are categorized as @emph{primitive} or @emph{derived}.
Primitive expression types include variables and procedure calls.
Derived expression types are not semantically primitive, but can instead
be defined as macros.
With the exception of @samp{quasiquote}, whose macro definition is complex,
the derived expressions are classified as library features.
Suitable definitions are given in section @ref{Derived expression type}.
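
To make the notion of a derived expression type concrete, the following
sketch defines a two-operand variant of @samp{or} as a macro over the
primitive @samp{if} and @samp{lambda} expressions.  The keyword
@samp{my-or2} is a hypothetical name chosen for this illustration; it is
not one of the definitions given in this report.

@example
;; (my-or2 e1 e2) evaluates e1 once; if the result is true it
;; is returned, otherwise e2 is evaluated.
(define-syntax my-or2
  (syntax-rules ()
    ((my-or2 e1 e2)
     ((lambda (t) (if t t e2)) e1))))

(my-or2 #f 'b)                          ; ==>  b
(my-or2 (memq 'b '(a b c)) (/ 3 0))     ; ==>  (b c)

;; The macro is hygienic, so the temporary t it introduces
;; does not capture the user's variable t.
(let ((t 5))
  (my-or2 #f t))                        ; ==>  5
@end example
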
@node Primitive expression types, Derived expression types, Expressions, Expressions
@section Primitive expression types

@menu
* Variable references::
* Literal expressions::
* Procedure calls::
* Procedures::
* Conditionals::
* Assignments::
@end menu

@node Variable references, Literal expressions, Primitive expression types, Primitive expression types
@subsection Variable references

@deffn {syntax} @r{<variable>}

An expression consisting of a variable
@cindex @w{variable}
(section @pxref{Variables; syntactic keywords; and regions}) is a
variable reference.  The value of
the variable reference is the value stored in the location to which the
variable is bound.  It is an error to reference an
unbound variable.
@cindex @w{unbound}

@format
@t{(define x 28)
x                                      ==>  28
}
@end format

@end deffn

@node Literal expressions, Procedure calls, Variable references, Primitive expression types
@subsection Literal expressions

@deffn {syntax} quote @r{<datum>}
@deffnx {syntax} @t{'}@r{<datum>}
@deffnx {syntax} @r{<constant>}

@samp{(quote @r{<datum>})} evaluates to @r{<datum>}.
@cindex @w{'}
@r{<Datum>}
may be any external representation of a Scheme object (see
section @pxref{External representations}).  This notation is used to
include literal constants in Scheme code.

@format
@t{(quote a)                              ==>  a
(quote #(a b c))                       ==>  #(a b c)
(quote (+ 1 2))                        ==>  (+ 1 2)
}
@end format

@samp{(quote @r{<datum>})} may be abbreviated as
@t{'}@r{<datum>}.  The two notations are equivalent in all
respects.

@format
@t{'a                                     ==>  a
'#(a b c)                              ==>  #(a b c)
'()                                    ==>  ()
'(+ 1 2)                               ==>  (+ 1 2)
'(quote a)                             ==>  (quote a)
''a                                    ==>  (quote a)
}
@end format

Numerical constants, string constants, character constants, and boolean
constants evaluate ``to themselves''; they need not be quoted.

@format
@t{'"abc"                                 ==>  "abc"
"abc"                                  ==>  "abc"
'145932                                ==>  145932
145932                                 ==>  145932
'#t                                    ==>  #t
#t                                     ==>  #t
}
@end format

As noted in section @ref{Storage model}, it is an error to alter a constant
(i.e. the value of a literal expression) using a mutation procedure like
@samp{set-car!} or @samp{string-set!}.
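
The following sketch contrasts a newly allocated list, which may be
altered, with the value of a literal expression, which may not.  The
names @samp{fresh} and @samp{constant} are hypothetical and serve only
this illustration.  Because the offending call is merely ``an error,''
an implementation is not required to detect it, so it is shown here as a
comment.

@example
(define (fresh) (list 1 2 3))    ; newly allocated on each call
(define constant '(1 2 3))       ; value of a literal expression

(let ((x (fresh)))
  (set-car! x 'one)              ; permitted: x is mutable
  x)                             ; ==>  (one 2 3)

;; (set-car! constant 'one)      ; an error: constant is immutable
@end example
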
@end deffn

@node Procedure calls, Procedures, Literal expressions, Primitive expression types
@subsection Procedure calls

@deffn {syntax} (@r{<operator>} @r{<operand1>} @dots{},)

A procedure call is written by simply enclosing in parentheses
expressions for the procedure to be called and the arguments to be
passed to it.  The operator and operand expressions are evaluated (in an
unspecified order) and the resulting procedure is passed the resulting
arguments.
@cindex @w{procedure call}
@cindex @w{call}

@format
@t{(+ 3 4)                                ==>  7
((if #f + *) 3 4)                      ==>  12
}
@end format

A number of procedures are available as the values of variables in the
initial environment; for example, the addition and multiplication
procedures in the above examples are the values of the variables @samp{+}
and @samp{*}.  New procedures are created by evaluating lambda expressions
(see section @pxref{Procedures}).

@ignore todo
At Friedman's request, flushed mention of other ways.
@end ignore

@c or definitions (see section~\ref{define}).

Procedure calls may return any number of values (see @code{values} in
@vindex @w{values}
section @pxref{Control features}).  With the exception of @samp{values}
the procedures available in the initial environment return one
value or, for procedures such as @samp{apply}, pass on the values returned
by a call to one of their arguments.

Procedure calls are also called @emph{combinations}.
@cindex @w{combination}

@quotation
@emph{Note:} In contrast to other dialects of Lisp, the order of
evaluation is unspecified, and the operator expression and the operand
expressions are always evaluated with the same evaluation rules.
@end quotation

@quotation
@emph{Note:}
Although the order of evaluation is otherwise unspecified, the effect of
any concurrent evaluation of the operator and operand expressions is
constrained to be consistent with some sequential order of evaluation.
The order of evaluation may be chosen differently for each procedure call.
@end quotation

@quotation
@emph{Note:} In many dialects of Lisp, the empty combination, @t{()}, is a
legitimate expression.  In Scheme, combinations must have at least one
subexpression, so @t{()} is not a syntactically valid expression.

@ignore todo
Dybvig: ``it should be obvious from the syntax.''
@end ignore

@end quotation

@ignore todo
Freeman:
I think an explanation as to why evaluation order is not specified
should be included.  It should not include any reference to parallel
evaluation.  Does any existing compiler generate better code because
the evaluation order is unspecified?  Clinger: yes: T3, MacScheme v2,
probably MIT Scheme and Chez Scheme.  But that's not the main reason
for leaving the order unspecified.
@end ignore

@end deffn

@node Procedures, Conditionals, Procedure calls, Primitive expression types
@subsection Procedures

@deffn {syntax} lambda @r{<formals>} @r{<body>}

@emph{Syntax:}
@r{<Formals>} should be a formal arguments list as described below,
and @r{<body>} should be a sequence of one or more expressions.

@emph{Semantics:}
A lambda expression evaluates to a procedure.  The environment in
effect when the lambda expression was evaluated is remembered as part of the
procedure.  When the procedure is later called with some actual
arguments, the environment in which the lambda expression was evaluated will
be extended by binding the variables in the formal argument list to
fresh locations, the corresponding actual argument values will be stored
in those locations, and the expressions in the body of the lambda expression
will be evaluated sequentially in the extended environment.
The result(s) of the last expression in the body will be returned as
the result(s) of the procedure call.
@format
@t{(lambda (x) (+ x x))                   ==>  @emph{a procedure}
((lambda (x) (+ x x)) 4)               ==>  8

(define reverse-subtract
  (lambda (x y) (- y x)))
(reverse-subtract 7 10)                ==>  3

(define add4
  (let ((x 4))
    (lambda (y) (+ x y))))
(add4 6)                               ==>  10
}
@end format

@r{<Formals>} should have one of the following forms:

@itemize @bullet

@item
@t{(@r{<variable1>} @dots{},)}:
The procedure takes a fixed number of arguments; when the procedure is
called, the arguments will be stored in the bindings of the
corresponding variables.

@item
@r{<variable>}:
The procedure takes any number of arguments; when the procedure is
called, the sequence of actual arguments is converted into a newly
allocated list, and the list is stored in the binding of the
@r{<variable>}.

@item
@t{(@r{<variable1>} @dots{}, @r{<variable_n>} @b{.} @r{<variable_n+1>})}:
If a space-delimited period precedes the last variable, then
the procedure takes n or more arguments, where n is the
number of formal arguments before the period (there must
be at least one).
The value stored in the binding of the last variable will be a
newly allocated
list of the actual arguments left over after all the other actual
arguments have been matched up against the other formal arguments.

@end itemize

It is an error for a @r{<variable>} to appear more than once in
@r{<formals>}.

@format
@t{((lambda x x) 3 4 5 6)                 ==>  (3 4 5 6)
((lambda (x y . z) z)
 3 4 5 6)                              ==>  (5 6)
}
@end format

Each procedure created as the result of evaluating a lambda expression is
(conceptually) tagged
with a storage location, in order to make @code{eqv?} and
@vindex @w{eqv?}
@code{eq?} work on procedures (see section @pxref{Equivalence predicates}).
@vindex @w{eq?}

@end deffn

@node Conditionals, Assignments, Procedures, Primitive expression types
@subsection Conditionals

@deffn {syntax} if @r{<test>} @r{<consequent>} @r{<alternate>}
@deffnx {syntax} if @r{<test>} @r{<consequent>}
@c \/ if hyper = italic

@emph{Syntax:}
@r{<Test>}, @r{<consequent>}, and @r{<alternate>} may be arbitrary
expressions.

@emph{Semantics:}
An @samp{if} expression is evaluated as follows: first,
@r{<test>} is evaluated.
If it yields a true value (see
@cindex @w{true}
section @pxref{Booleans}), then @r{<consequent>} is evaluated and
its value(s) is(are) returned.  Otherwise @r{<alternate>} is evaluated and its
value(s) is(are) returned.  If @r{<test>} yields a false value and no
@r{<alternate>} is specified, then the result of the expression is
unspecified.

@format
@t{(if (> 3 2) 'yes 'no)                  ==>  yes
(if (> 2 3) 'yes 'no)                  ==>  no
(if (> 3 2)
    (- 3 2)
    (+ 3 2))                           ==>  1
}
@end format

@end deffn

@node Assignments, , Conditionals, Primitive expression types
@subsection Assignments

@deffn {syntax} set! @r{<variable>} @r{<expression>}

@r{<Expression>} is evaluated, and the resulting value is stored in
the location to which @r{<variable>} is bound.  @r{<Variable>} must
be bound either in some region enclosing the @samp{set!} expression
@cindex @w{region}
or at top level.  The result of the @samp{set!} expression is
unspecified.

@format
@t{(define x 2)
(+ x 1)                                ==>  3
(set! x 4)                             ==>  @emph{unspecified}
(+ x 1)                                ==>  5
}
@end format

@end deffn

@node Derived expression types, Macros, Primitive expression types, Expressions
@section Derived expression types

@menu
* Conditional::
* Binding constructs::
* Sequencing::
* Iteration::
* Delayed evaluation::
* Quasiquotation::
@end menu

The constructs in this section are hygienic, as discussed in
section @ref{Macros}.
For reference purposes, section @ref{Derived expression type} gives macro
definitions that will convert most of the constructs described in this
section into the primitive constructs described in the previous section.

@ignore todo
Mention that no definition of backquote is provided?
@end ignore

@node Conditional, Binding constructs, Derived expression types, Derived expression types
@subsection Conditionals

@deffn {library syntax} cond <clause1> <clause2> @dots{},

@emph{Syntax:}
Each @r{<clause>} should be of the form

@format
@t{(@r{<test>} @r{<expression1>} @dots{},)
}
@end format

where @r{<test>} is any expression.
Alternatively, a @r{<clause>} may be
of the form

@format
@t{(@r{<test>} => @r{<expression>})
}
@end format

The last @r{<clause>} may be
an ``else clause,'' which has the form

@format
@t{(else @r{<expression1>} @r{<expression2>} @dots{},)@r{.}
}
@end format

@cindex @w{else}
@cindex @w{=>}

@emph{Semantics:}
A @samp{cond} expression is evaluated by evaluating the @r{<test>}
expressions of successive @r{<clause>}s in order until one of them
evaluates to a true value (see
@cindex @w{true}
section @pxref{Booleans}).  When a @r{<test>} evaluates to a true
value, then the remaining @r{<expression>}s in its @r{<clause>} are
evaluated in order, and the result(s) of the last @r{<expression>} in the
@r{<clause>} is(are) returned as the result(s) of the entire @samp{cond}
expression.  If the selected @r{<clause>} contains only the
@r{<test>} and no @r{<expression>}s, then the value of the
@r{<test>} is returned as the result.  If the selected @r{<clause>} uses the
@code{=>} alternate form, then the @r{<expression>} is evaluated.
@vindex @w{=>}
Its value must be a procedure that accepts one argument; this procedure is then
called on the value of the @r{<test>} and the value(s) returned by this
procedure is(are) returned by the @samp{cond} expression.
If all @r{<test>}s evaluate
to false values, and there is no else clause, then the result of
the conditional expression is unspecified; if there is an else
clause, then its @r{<expression>}s are evaluated, and the value(s) of
the last one is(are) returned.

@format
@t{(cond ((> 3 2) 'greater)
      ((< 3 2) 'less))                 ==>  greater
(cond ((> 3 3) 'greater)
      ((< 3 3) 'less)
      (else 'equal))                   ==>  equal
(cond ((assv 'b '((a 1) (b 2))) => cadr)
      (else #f))                       ==>  2
}
@end format

@end deffn

@deffn {library syntax} case @r{<key>} <clause1> <clause2> @dots{},

@emph{Syntax:}
@r{<Key>} may be any expression.  Each @r{<clause>} should have
the form

@format
@t{((@r{<datum1>} @dots{},) @r{<expression1>} @r{<expression2>} @dots{},)@r{,}
}
@end format

where each @r{<datum>} is an external representation of some object.
All the @r{<datum>}s must be distinct.
The last @r{<clause>} may be an ``else clause,'' which has the form

@format
@t{(else @r{<expression1>} @r{<expression2>} @dots{},)@r{.}
}
@end format

@vindex else

@emph{Semantics:}
A @samp{case} expression is evaluated as follows.  @r{<Key>} is
evaluated and its result is compared against each @r{<datum>}.  If the
result of evaluating @r{<key>} is equivalent (in the sense of
@samp{eqv?}; see section @pxref{Equivalence predicates}) to a
@r{<datum>}, then the expressions in the corresponding @r{<clause>}
are evaluated from left to right and the result(s) of the last expression in
the @r{<clause>} is(are) returned as the result(s) of the @samp{case}
expression.  If the result of evaluating @r{<key>} is different from every
@r{<datum>}, then if there is an else clause its expressions are
evaluated and the result(s) of the last is(are) the result(s) of the
@samp{case} expression; otherwise the result of the @samp{case}
expression is unspecified.

@format
@t{(case (* 2 3)
  ((2 3 5 7) 'prime)
  ((1 4 6 8 9) 'composite))            ==>  composite
(case (car '(c d))
  ((a) 'a)
  ((b) 'b))                            ==>  @emph{unspecified}
(case (car '(c d))
  ((a e i o u) 'vowel)
  ((w y) 'semivowel)
  (else 'consonant))                   ==>  consonant
}
@end format

@end deffn

@deffn {library syntax} and <test1> @dots{},

The @r{<test>} expressions are evaluated from left to right, and the
value of the first expression that evaluates to a false value (see
section @pxref{Booleans}) is returned.  Any remaining expressions
are not evaluated.  If all the expressions evaluate to true values, the
value of the last expression is returned.  If there are no expressions
then @t{#t} is returned.

@format
@t{(and (= 2 2) (> 2 1))                  ==>  #t
(and (= 2 2) (< 2 1))                  ==>  #f
(and 1 2 'c '(f g))                    ==>  (f g)
(and)                                  ==>  #t
}
@end format

@end deffn

@deffn {library syntax} or <test1> @dots{},

The @r{<test>} expressions are evaluated from left to right, and the value of the
first expression that evaluates to a true value (see
section @pxref{Booleans}) is returned.  Any remaining expressions
are not evaluated.
If all expressions evaluate to false values, the
value of the last expression is returned.  If there are no
expressions then @t{#f} is returned.

@format
@t{(or (= 2 2) (> 2 1))                   ==>  #t
(or (= 2 2) (< 2 1))                   ==>  #t
(or #f #f #f)                          ==>  #f
(or (memq 'b '(a b c))
    (/ 3 0))                           ==>  (b c)
}
@end format

@end deffn

@node Binding constructs, Sequencing, Conditional, Derived expression types
@subsection Binding constructs

The three binding constructs @samp{let}, @samp{let*}, and @samp{letrec}
give Scheme a block structure, like Algol 60.  The syntax of the three
constructs is identical, but they differ in the regions they establish
@cindex @w{region}
for their variable bindings.  In a @samp{let} expression, the initial
values are computed before any of the variables become bound; in a
@samp{let*} expression, the bindings and evaluations are performed
sequentially; while in a @samp{letrec} expression, all the bindings are in
effect while their initial values are being computed, thus allowing
mutually recursive definitions.

@deffn {library syntax} let @r{<bindings>} @r{<body>}

@emph{Syntax:}
@r{<Bindings>} should have the form

@format
@t{((@r{<variable1>} @r{<init1>}) @dots{},)@r{,}
}
@end format

where each @r{<init>} is an expression, and @r{<body>} should be a
sequence of one or more expressions.  It is
an error for a @r{<variable>} to appear more than once in the list of variables
being bound.

@emph{Semantics:}
The @r{<init>}s are evaluated in the current environment (in some
unspecified order), the @r{<variable>}s are bound to fresh locations
holding the results, the @r{<body>} is evaluated in the extended
environment, and the value(s) of the last expression of @r{<body>}
is(are) returned.  Each binding of a @r{<variable>} has @r{<body>} as its
region.
@cindex @w{region}

@format
@t{(let ((x 2) (y 3))
  (* x y))                             ==>  6

(let ((x 2) (y 3))
  (let ((x 7)
        (z (+ x y)))
    (* z x)))                          ==>  35
}
@end format

See also named @samp{let}, section @ref{Iteration}.
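
Since the @samp{let} semantics above evaluate every initial value
before any variable is bound, a @samp{let} expression behaves like the
immediate application of a @samp{lambda} expression to those values.
The following informal sketch rewrites the two examples above in that
style; it illustrates the correspondence and is not the macro definition
given elsewhere in this report.

@example
;; (let ((x 2) (y 3)) (* x y)) written as a procedure call.
((lambda (x y) (* x y)) 2 3)            ; ==>  6

;; The nested example: (+ x y) is an operand of the inner call,
;; so it refers to the outer x and y, giving z = 5.
((lambda (x y)
   ((lambda (x z) (* z x)) 7 (+ x y)))
 2 3)                                   ; ==>  35
@end example
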
@end deffn

@deffn {library syntax} let* @r{<bindings>} @r{<body>}

@emph{Syntax:}
@r{<Bindings>} should have the form

@format
@t{((@r{<variable1>} @r{<init1>}) @dots{},)@r{,}
}
@end format

and @r{<body>} should be a sequence of
one or more expressions.

@emph{Semantics:}
@samp{Let*} is similar to @samp{let}, but the bindings are performed
sequentially from left to right, and the region of a binding indicated
@cindex @w{region}
by @samp{(@r{<variable>} @r{<init>})} is that part of the @samp{let*}
expression to the right of the binding.  Thus the second binding is done
in an environment in which the first binding is visible, and so on.

@format
@t{(let ((x 2) (y 3))
  (let* ((x 7)
         (z (+ x y)))
    (* z x)))                          ==>  70
}
@end format

@end deffn

@deffn {library syntax} letrec @r{<bindings>} @r{<body>}

@emph{Syntax:}
@r{<Bindings>} should have the form

@format
@t{((@r{<variable1>} @r{<init1>}) @dots{},)@r{,}
}
@end format

and @r{<body>} should be a sequence of
one or more expressions.  It is an error for a @r{<variable>} to appear more
than once in the list of variables being bound.

@emph{Semantics:}
The @r{<variable>}s are bound to fresh locations holding undefined
values, the @r{<init>}s are evaluated in the resulting environment (in
some unspecified order), each @r{<variable>} is assigned to the result
of the corresponding @r{<init>}, the @r{<body>} is evaluated in the
resulting environment, and the value(s) of the last expression in
@r{<body>} is(are) returned.  Each binding of a @r{<variable>} has the
entire @samp{letrec} expression as its region, making it possible to
@cindex @w{region}
define mutually recursive procedures.

@c (letrec ((x 2) (y 3))
@c   (letrec ((foo (lambda (z) (+ x y z))) (x 7))
@c     (foo 4)))                   \ev  14
@format
@t{(letrec ((even?
          (lambda (n)
            (if (zero? n)
                #t
                (odd? (- n 1)))))
         (odd?
          (lambda (n)
            (if (zero? n)
                #f
                (even? (- n 1))))))
  (even? 88))                          ==>  #t
}
@end format

One restriction on @samp{letrec} is very important: it must be possible
to evaluate each @r{<init>} without assigning or referring to the value of any
@r{<variable>}.  If this restriction is violated, then it is an error.
The restriction is necessary because Scheme passes arguments by value rather than by
name.  In the most common uses of @samp{letrec}, all the @r{<init>}s are
lambda expressions and the restriction is satisfied automatically.

@c  \todo{use or uses?  --- Jinx.}

@end deffn


@node Sequencing, Iteration, Binding constructs, Derived expression types
@subsection Sequencing

@deffn {library syntax} begin  <expression1> <expression2> @dots{},

The @r{<expression>}s are evaluated sequentially from left to right,
and the value(s) of the last @r{<expression>} is(are) returned.  This
expression type is used to sequence side effects such as input and
output.

@format
@t{(define x 0)

(begin (set! x 5)
       (+ x 1))                        ==>  6

(begin (display "4 plus 1 equals ")
       (display (+ 4 1)))              ==>  @emph{unspecified}
          @emph{and prints}  4 plus 1 equals 5
}
@end format

@end deffn


@node Iteration, Delayed evaluation, Sequencing, Derived expression types
@subsection Iteration

@c \unsection

@noindent

@deffn {library syntax} (do ((@r{<variable1>} @r{<init1>} @r{<step1>}) @t{ @dots{},) (@r{<test>} @r{<expression>} @dots{},) @r{<command>} @dots{},)}
@cindex @w{do}

@samp{Do} is an iteration construct.  It specifies a set of variables to
be bound, how they are to be initialized at the start, and how they are
to be updated on each iteration.  When a termination condition is met,
the loop exits after evaluating the @r{<expression>}s.

@samp{Do} expressions are evaluated as follows:
The @r{<init>} expressions are evaluated (in some unspecified order),
the @r{<variable>}s are bound to fresh locations, the results of the
@r{<init>} expressions are stored in the bindings of the
@r{<variable>}s, and then the iteration phase begins.

Each iteration begins by evaluating @r{<test>}; if the result is
false (see section @ref{Booleans}), then the @r{<command>}
expressions are evaluated in order for effect, the @r{<step>}
expressions are evaluated in some unspecified order, the
@r{<variable>}s are bound to fresh locations, the results of the
@r{<step>}s are stored in the bindings of the
@r{<variable>}s, and the next iteration begins.

If @r{<test>} evaluates to a true value, then the
@r{<expression>}s are evaluated from left to right and the value(s) of
the last @r{<expression>} is(are) returned.  If no @r{<expression>}s
are present, then the value of the @samp{do} expression is unspecified.

The region of the binding of a @r{<variable>}
@cindex @w{region}
consists of the entire @samp{do} expression except for the @r{<init>}s.
It is an error for a @r{<variable>} to appear more than once in the
list of @samp{do} variables.

A @r{<step>} may be omitted, in which case the effect is the
same as if @samp{(@r{<variable>} @r{<init>} @r{<variable>})} had
been written instead of @samp{(@r{<variable>} @r{<init>})}.

@format
@t{(do ((vec (make-vector 5))
     (i 0 (+ i 1)))
    ((= i 5) vec)
  (vector-set! vec i i))               ==>  #(0 1 2 3 4)

(let ((x '(1 3 5 7 9)))
  (do ((x x (cdr x))
       (sum 0 (+ sum (car x))))
      ((null? x) sum)))                ==>  25
}
@end format

@c \end{entry}

@deffnx {library syntax} let  @r{<variable>} @r{<bindings>} @r{<body>}

``Named @samp{let}'' is a variant on the syntax of @code{let} which provides
@vindex @w{let}
a more general looping construct than @samp{do} and may also be used to express
recursions.
It has the same syntax and semantics as ordinary @samp{let}
except that @r{<variable>} is bound within @r{<body>} to a procedure
whose formal arguments are the bound variables and whose body is
@r{<body>}.  Thus the execution of @r{<body>} may be repeated by
invoking the procedure named by @r{<variable>}.

@c                                              |  <-- right margin
@format
@t{(let loop ((numbers '(3 -2 1 6 -5))
           (nonneg '())
           (neg '()))
  (cond ((null? numbers) (list nonneg neg))
        ((>= (car numbers) 0)
         (loop (cdr numbers)
               (cons (car numbers) nonneg)
               neg))
        ((< (car numbers) 0)
         (loop (cdr numbers)
               nonneg
               (cons (car numbers) neg)))))
          ==>  ((6 1 3) (-5 -2))
}
@end format

@end deffn


@node Delayed evaluation, Quasiquotation, Iteration, Derived expression types
@subsection Delayed evaluation

@deffn {library syntax} delay  @r{<expression>}

@ignore todo
Fix.
@end ignore

The @samp{delay} construct is used together with the procedure @code{force} to
@vindex @w{force}
implement @dfn{lazy evaluation} or @dfn{call by need}.
@cindex @w{call by need}
@cindex @w{lazy evaluation}
@t{(delay @r{<expression>})} returns an object called a
@dfn{promise} which at some point in the future may be asked (by
@cindex @w{promise}
the @samp{force} procedure)
@ignore todo
Bartley's white lie; OK?
@end ignore
to evaluate
@r{<expression>}, and deliver the resulting value.
The effect of @r{<expression>} returning multiple values
is unspecified.

See the description of @samp{force} (section @ref{Control features}) for a
more complete description of @samp{delay}.

@end deffn

@node Quasiquotation,  , Delayed evaluation, Derived expression types
@subsection Quasiquotation

@deffn {syntax} quasiquote  @r{<qq template>}
@deffnx {syntax} @t{`}@r{<qq template>}

``Backquote'' or ``quasiquote'' expressions are useful
@cindex @w{backquote}
for constructing a list or vector structure when most but not all of the
desired structure is known in advance.  If no
commas appear within the @r{<qq template>}, the result of
@cindex @w{comma}
evaluating
@t{`}@r{<qq template>} is equivalent to the result of evaluating
@t{'}@r{<qq template>}.  If a comma appears within the
@cindex @w{,}
@r{<qq template>}, however, the expression following the comma is
evaluated (``unquoted'') and its result is inserted into the structure
instead of the comma and the expression.  If a comma appears followed
immediately by an at-sign (@@), then the following
@cindex @w{,@@}
expression must evaluate to a list; the opening and closing parentheses
of the list are then ``stripped away'' and the elements of the list are
inserted in place of the comma at-sign expression sequence.  A comma
at-sign should only appear within a list or vector @r{<qq template>}.

@c  struck: "(in the sense of {\cf equal?})" after "equivalent"

@format
@t{`(list ,(+ 1 2) 4)  ==>  (list 3 4)
(let ((name 'a)) `(list ,name ',name))
          ==>  (list a (quote a))
`(a ,(+ 1 2) ,@@(map abs '(4 -5 6)) b)
          ==>  (a 3 4 5 6 b)
`((foo ,(- 10 3)) ,@@(cdr '(c)) . ,(car '(cons)))
          ==>  ((foo 7) . cons)
`#(10 5 ,(sqrt 4) ,@@(map sqrt '(16 9)) 8)
          ==>  #(10 5 2 4 3 8)
}
@end format

Quasiquote forms may be nested.
Substitutions are made only for unquoted
components appearing at the same nesting level
as the outermost backquote.  The nesting level increases by one inside
each successive quasiquotation, and decreases by one inside each
unquotation.

@format
@t{`(a `(b ,(+ 1 2) ,(foo ,(+ 1 3) d) e) f)
          ==>  (a `(b ,(+ 1 2) ,(foo 4 d) e) f)
(let ((name1 'x)
      (name2 'y))
  `(a `(b ,,name1 ,',name2 d) e))
          ==>  (a `(b ,x ,'y d) e)
}
@end format

The two notations
@t{`}@r{<qq template>} and @t{(quasiquote @r{<qq template>})}
are identical in all respects.
@samp{,@r{<expression>}} is identical to @samp{(unquote @r{<expression>})},
and
@samp{,@@@r{<expression>}} is identical to @samp{(unquote-splicing @r{<expression>})}.
The external syntax generated by @code{write} for two-element lists whose
@vindex @w{write}
car is one of these symbols may vary between implementations.

@cindex @w{`}

@format
@t{(quasiquote (list (unquote (+ 1 2)) 4))
          ==>  (list 3 4)
'(quasiquote (list (unquote (+ 1 2)) 4))
          ==>  `(list ,(+ 1 2) 4)
     i.e., (quasiquote (list (unquote (+ 1 2)) 4))
}
@end format

Unpredictable behavior can result if any of the symbols
@code{quasiquote}, @code{unquote}, or @code{unquote-splicing} appear in
@vindex @w{unquote-splicing}
@vindex @w{unquote}
@vindex @w{quasiquote}
positions within a @r{<qq template>} otherwise than as described above.

@end deffn

@node Macros,  , Derived expression types, Expressions
@section Macros

@menu
* Binding constructs for syntactic keywords::
* Pattern language::
@end menu

Scheme programs can define and use new derived expression types,
called @emph{macros}.
@cindex @w{macro}
Program-defined expression types have the syntax

@example
(@r{<keyword>} @r{<datum>} ...)
@end example

where @r{<keyword>} is an identifier that uniquely determines the
expression type.  This identifier is called the @emph{syntactic
keyword}, or simply @emph{keyword}, of the macro.  The
@cindex @w{macro keyword}
@cindex @w{keyword}
@cindex @w{syntactic keyword}
number of the @r{<datum>}s, and their syntax, depends on the
expression type.

Each instance of a macro is called a @emph{use}
@cindex @w{macro use}
of the macro.
The set of rules that specifies
how a use of a macro is transcribed into a more primitive expression
is called the @emph{transformer}
@cindex @w{macro transformer}
of the macro.

The macro definition facility consists of two parts:

@itemize @bullet

@item
A set of expressions used to establish that certain identifiers
are macro keywords, associate them with macro transformers, and control
the scope within which a macro is defined, and

@item
a pattern language for specifying macro transformers.

@end itemize

The syntactic keyword of a macro may shadow variable bindings, and local
variable bindings may shadow keyword bindings.  All macros
@cindex @w{keyword}
defined using the pattern language are ``hygienic'' and ``referentially
transparent'' and thus preserve Scheme's lexical scoping [Kohlbecker86],
[hygienic], [Bawden88], [macrosthatwork], [syntacticabstraction]:
@cindex @w{hygienic}
@cindex @w{referentially transparent}

@itemize @bullet

@item
If a macro transformer inserts a binding for an identifier
(variable or keyword), the identifier will in effect be renamed
throughout its scope to avoid conflicts with other identifiers.
Note that a @code{define} at top level may or may not introduce a binding;
see section @ref{Definitions}.

@item
If a macro transformer inserts a free reference to an
identifier, the reference refers to the binding that was visible
where the transformer was specified, regardless of any local
bindings that may surround the use of the macro.

@end itemize
@vindex @w{define}

@c The low-level facility permits non-hygienic macros to be written,
@c and may be used to implement the high-level pattern language.

@c  The fourth section describes some features that would make the
@c  low-level macro facility easier to use directly.
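
The first of these two properties can be illustrated with a small
sketch.  The macro name @samp{swap!} and the variable names are
hypothetical, chosen only for this example, and @samp{let-syntax} is
described in the next subsection.  Although the transformer inserts a
binding for @samp{tmp}, that binding is in effect renamed, so it does
not capture the @samp{tmp} supplied at the macro use:

@example
(let-syntax ((swap! (syntax-rules ()
                      ((swap! a b)
                       (let ((tmp a))
                         (set! a b)
                         (set! b tmp))))))
  (let ((tmp 1) (other 2))
    (swap! tmp other)
    (list tmp other)))                 ==>  (2 1)
@end example

A naive textual substitution would instead bind a new @samp{tmp}
around both assignments, leaving the outer @samp{tmp} untouched, and
the result would be @samp{(1 2)}.
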
@node Binding constructs for syntactic keywords, Pattern language, Macros, Macros
@subsection Binding constructs for syntactic keywords

@samp{Let-syntax} and @samp{letrec-syntax} are
analogous to @samp{let} and @samp{letrec}, but they bind
syntactic keywords to macro transformers instead of binding variables
to locations that contain values.  Syntactic keywords may also be
bound at top level; see section @ref{Syntax definitions}.

@deffn {syntax} let-syntax  @r{<bindings>} @r{<body>}

@emph{Syntax:}
@r{<Bindings>} should have the form

@format
@t{((@r{<keyword>} @r{<transformer spec>}) @dots{},)
}
@end format

Each @r{<keyword>} is an identifier,
each @r{<transformer spec>} is an instance of @samp{syntax-rules}, and
@r{<body>} should be a sequence of one or more expressions.  It is an error
for a @r{<keyword>} to appear more than once in the list of keywords
being bound.

@emph{Semantics:}
The @r{<body>} is expanded in the syntactic environment
obtained by extending the syntactic environment of the
@samp{let-syntax} expression with macros whose keywords are
the @r{<keyword>}s, bound to the specified transformers.
Each binding of a @r{<keyword>} has @r{<body>} as its region.

@format
@t{(let-syntax ((when (syntax-rules ()
                     ((when test stmt1 stmt2 ...)
                      (if test
                          (begin stmt1
                                 stmt2 ...))))))
  (let ((if #t))
    (when if (set! if 'now))
    if))                               ==>  now

(let ((x 'outer))
  (let-syntax ((m (syntax-rules () ((m) x))))
    (let ((x 'inner))
      (m))))                           ==>  outer
}
@end format

@end deffn

@deffn {syntax} letrec-syntax  @r{<bindings>} @r{<body>}

@emph{Syntax:}
Same as for @samp{let-syntax}.

@emph{Semantics:}
The @r{<body>} is expanded in the syntactic environment obtained by
extending the syntactic environment of the @samp{letrec-syntax}
expression with macros whose keywords are the
@r{<keyword>}s, bound to the specified transformers.
Each binding of a @r{<keyword>} has the @r{<bindings>}
as well as the @r{<body>} within its region,
so the transformers can
transcribe expressions into uses of the macros
introduced by the @samp{letrec-syntax} expression.

@format
@t{(letrec-syntax
  ((my-or (syntax-rules ()
            ((my-or) #f)
            ((my-or e) e)
            ((my-or e1 e2 ...)
             (let ((temp e1))
               (if temp
                   temp
                   (my-or e2 ...))))))
  (let ((x #f)
        (y 7)
        (temp 8)
        (let odd?)
        (if even?))
    (my-or x
           (let temp)
           (if y)
           y)))                        ==>  7
}
@end format

@end deffn

@node Pattern language,  , Binding constructs for syntactic keywords, Macros
@subsection Pattern language

A @r{<transformer spec>} has the following form:

@deffn {} syntax-rules  @r{<literals>} @r{<syntax rule>} @dots{},

@emph{Syntax:}
@r{<Literals>} is a list of identifiers and each @r{<syntax rule>}
should be of the form

@format
@t{(@r{<pattern>} @r{