Scheme in Short (Finished!)

Programming, for all ages and all languages.
Schol-R-LEA

Scheme in Short (Finished!)

Post by Schol-R-LEA »

People who don't know LISP have no idea how simple it really is, at the heart of it. Here is a quick synopsis of the Scheme dialect, just to give a taste of it. This is meant primarily for experienced programmers who aren't familiar with applicative programming. For a more detailed explanation of the language, see "Revision 5 Report on the Algorithmic Language Scheme", the most widely used standard for the Scheme language today.

Atoms
The building blocks of Scheme programs are atoms and lists. Atoms are single objects, and can be either identifiers or literals:

Code: Select all

; comments begin with semi-colons and go to the end of the line.

An-Atom         ; the identifier 'An-Atom'
1234            ; the integer '1234'
17/23           ; the rational number seventeen twenty-thirds
3.14            ; the exact rational number 3.14
#i3.14          ; an inexact real number approximating 3.14
#\a             ; the character 'a'
"Hello, World!" ; the string "Hello, World!"
#t              ; special atom for the boolean value 'true'
#f              ; special atom for the boolean value'false'
While Scheme's approach to identifiers, characters, strings and non-integer numbers is unusual, it is fairly self-consistent.

Lists
A list is a group of one or more atoms in a pair of parentheses (most versions allow you to substitute brackets or even braces for parentheses, so long as each member of a pair matches the other). For example,

Code: Select all

(a b c)
is a list whose members are the identifiers 'a', 'b', and 'c' (technically, there is also a null list following the 'c', but that isn't too important right now). Lists can be nested, like so:

Code: Select all

((a)  (b c) d)
Functions
Another key concept in Scheme is that code is represented the same way as data, and that data can be evaluated as code. When a Scheme interpreter reads a list, it tries to interpret the first value in the list as a function call. For example,

Code: Select all

(list 1 2)
is interpreted as a function call to '(list)', a function that reads two or more atoms and returns a list. The function would return

Code: Select all

(1 2)
as its evaluation.

With a handful of exceptions, all Scheme operations are performed in the form of a function call. For example, to add 2 to 3, you would write:

Code: Select all

(+ 2 3)
An extended equation would be written as nested calls:

Code: Select all

(/  (- 3  #i3.14)  (* (+ 1000 56) 5))
Since Scheme code is freeform, it cna be written in a more readable manner like this:

Code: Select all

(/  (- 3  #i3.14) 
    (* (+ 1000 56)
        5))
Some Scheme functions are built in primitives which, while resembling function calls, have slightly different behavior from an ordinary function. These functions are called special forms.

Define
A special form, (define), exists to assigning a value of an identifier. It can be used both for initializing variables and for defining functions. Given the following definitions,

Code: Select all

(define one 1)

(define (inc x) (+ one x))
the call [tt](inc 3)[/tt] would, predictably, return '4'. The (define) special form cannot be used to modify the value of an identifier once it has been initialized.

Evaluation and Quoting
When the interpreter reads an identifier, by default it tries to evaluate it, either as a function name if it is at the beginnning of a list, or otherwise as a variable. For example, given the example above, entering 'one' at the interpreter prompt would return '1'.

> one
1

While a list starting with 'inc' would be evaluated in this manner, if one were to trace the evaluation:

> (inc one)
---> (inc 1)
---> (+ one 1)
---> (+ 1 1)
2

This evaluation can be overriden using the (quote) special form:

Code: Select all

(quote a)
This can (and almost always is) abbreviated using the apostrophe, like so:

Code: Select all

'a
Quoting an atom allows you to manipulate the atom itself, rather than its value; the same can be done with an entire list. Given the following definitions:

Code: Select all

(define one 1)

(define two 2)

(define (square x) 
           (* x x))
Then entering the following at the interpreter prompt will have the results:

> (list one two (square two))
(1 2 4)

but if you quote the atoms, the result is

>(list 'one 'two '(square two))
(one two (square two))

(Note that if you are using DR Scheme, the '"Beginner Student" language subset doesn't support quoting lists, but "Advanced Student" and "Standard (R5RS)" do).

I'll add more later, but this should make a good start. I'll cover conditionals, predicates and recursion next, then the list manipulation functions. I'm not sure what else I'll add. Eventually I'll discuss mutators (set!, set-car!, etc.), but they aren't really as important in Scheme as it is in most other languages.
Schol-R-LEA

Scheme in Short, Part 2

Post by Schol-R-LEA »

Binding
The binding special form, (let), allows you to define and initialize a group of local variables, which can be be calculated off of the arguments:

Code: Select all

(let ((var-1  (value-1))
      (var-2 (value-2))
      ; ...
      (var-n (value-n)))
     (expression))
Sequential Expressions
So far, each Scheme clause has only allow for a single expression, whose value is always returned immediately. There are many circumstances when a sequence of expressions need to be performed for their side effects, and ignoring the return values of most or all of them. In Scheme, the body of a function is a sequential expression block, with the value of the last expression being used as the value of the function. In cases where only one expression is expected, one can use the (begin) special form:

Code: Select all

(begin ((expression-1)
        (expression-2)
        (expression-3)
        ; ... 
        (final-expression)))
As with functions, the value of the final expression is used as the value of the whole block.

Comparisons
As with any Turing-complete language, Scheme has a mechanism for comparing different values and branching two or more possibilities based on the comparison. The usual set of equality, and inequality comparators are available: (equal?), (<) (<=), (>), and (>=) . There are also the logical functions, (not), (and), and (or), as well as specialized equivalence forms for specific types (string=?), etc..

The basic selector special forms are (if), (case) and (cond). (If) is used for simple two-way branch or skip:

Code: Select all

(if (condition) (true-expression))

or

(if (condition) (true-expression) (false-expression))
The if form returns the value of the clause that it runs; if the condition is false and there is no else clause, the result is undefined.

the (case) special form performs a multi-path branch based on a single value:

Code: Select all

(case (value) ((comparison-1a comparison-1b comparison-1c) (block-1))
              ((comparison-2a comparison-2b comparison-2c) (block-2))
              ; ...
              (else (default-block)))
If value matches one of the comparison vlaues in a list, then the following expression block is executed.

The expression sections are blocks, and can take more than one expression.

The (cond) special form is the most general of the selector forms; it accepts a set of condition-expression pairs and selects through them in order. As with (case), the expression clause is a block.

Code: Select all

(cond ((cond-1) (block-1))
      ((cond-2) (block-2))
      ; ...
      (else (default-block))
In addition to the selectors and comparators, there are a set of predicates, special forms which accept an argument and determine the arguements membership in a particular type. Predicates are customarily suffixed with a question mark to indicate what they are, but exceptions exist. Important predicates include (null?), (list?), (atom?), (string?), (char?). These dynamic type-checking forms are used in place of the static type checking used in most languages.

Looping
With these selectors, comparators and predicates, it becomes possible to perform repetitive actions by way of recursive function calls, like this factorial implementation:

Code: Select all

(define (! n) 
  (cond ((not (integer? n)) 'Invalid-value)
        ((> 0 n) (! (abs n)))
        ((= 0 n) 1)
        (else (* n (! (- n 1))))))
A standards-compliant Scheme compiler will correctly optimize a tail-recursive function into an iterative loop, so there is no need, in principle, for special iterative forms. To support iterative tail recursion, there is a special variant of (let), called the 'named let'. It consists of an identifier, followed by the loop's arguments ands their initializers, followed by an expression block. For example, the factorial function could be rewritten with a named let as such:

Code: Select all

(define (fact-named x)
  (cond ((not (integer? x)) 'invalid-argument)
        ((> 0 x)(fact-named (abs x)))
        (else 
         (let loop ((i 1)
                   (j 1))
           (if (>= i x)
              (* i j)
              (loop (+ 1 i) (* i j)))))))
The Scheme language does provide an iteterative form, (do). The syntax for (do) is a group of local variable definitions, with optional step clauses to be applied on each iteration, followed by a loop test clause, then followed by the body of the loop (again, a block structure):

Code: Select all

(do ((variable (initializer) (increment)) 
      (test)
      (block))
For example, the factorial above could be written as

Code: Select all

(define (fact-do x)
    (cond ((not (integer? x)) 'invalid-argument)
        ((> 0 x) (fact-named (abs x)))
        (else 
           (do ((i 1 (+ 1 i))
                (j 1 (* j i)))
             ((> i x) j)))))
Schol-R-LEA

Scheme in Short

Post by Schol-R-LEA »

Application
Scheme has a special form for applying a given function to each atom in a list, (map). For example, to get the factorials of a whole list of integers, you could use (map) at the prompt:

> (map ! '(1 6 3 11))
(1 720 6 39916800)

If the mapped function takes more than one argument, map requires a list of equal size for each argument:

> (map + '( 17 23 42) '(57 69 666))
(74 92 708)
Schol-R-LEA

Re:Scheme in Short

Post by Schol-R-LEA »

Lists Revisited
In Scheme, lists actually are made up from cons cells. A cons cell is a pair of pointers. For historical reasons, the object pointed to by the first pointer is called the CAR, while the object pointed to by the second pointer is called the CDR. The usual box-notation for diagramming cons cell is:

Code: Select all

 --- ---    
| * | * |->CDR 
 --- --- 
  |
 \ /
 CAR
Each pointer can point to an atom, another cons cell, or a null value (also called the empty list). A proper list is one in which every CDR is another cons cell except the last, which points to the empty list. for example, the proper list (a (b c) d) would be

Code: Select all

 --- ---     --- ---      --- ---  
| * | * |-> | * | * | -> | * | * | -> null  
 --- ---     --- ---      --- ---
  |           |            |
 \ /          |           \ /
  a           |            d
             \ /   
             --- ---      --- ---  
            | * | * | -> | * | * | -> null
             --- ---      --- ---  
              |            |
             \ /          \ /
              b            c 
The pairs can be represented explicitly in the list by means of what is called dotted-pair notation. For example,

> '(a . ((b . (c . ())) . (d . ())))
(a (b c) d)

A list in which the CDR element is an atom is called an improper list; for example, (a . b) represents the improper list

Code: Select all

 --- ---    
| * | * |->b 
 --- --- 
  |
 \ /
 a
Schol-R-LEA

Re:Scheme in Short

Post by Schol-R-LEA »

List Constructors
Basing both data structures and code on lists as it does, Scheme naturally enough has a set of primitives for creating new lists from existing list and atoms. The (list) form has already been mentioned; it accepts two or more atoms or lists and returns a proper list based on them:

> (list 'a 'b '(c d (e f) g) 'h)
(a b (c d (e f) g) h)

Calls can be nested; the previous call could have been written

> (list 'a 'b (list 'c 'd (list 'e 'f) 'g) 'h)
(a b (c d (e f) g) h)

A more basic function, (cons), takes exactly two arguments, and returns a cons cell with the first arg as it's CAR and the second as it's CDR.

> (cons 'a 'b)
(a . b)

> (cons 'a '(b))
(a b)

> (cons 'a (cons (cons 'b (cons 'c '())) (cons 'd '())))
(a (b c) d)

Ther is also (append), which takes a list as the first argument and appends the second argument to it;

> (append '(a (bc)) 'd)
(a (bc) . d)

> (append '(a (b c)) '(d))
(a (b c) d)

List Decomposition
The CAR or CDR of a cons cell can be extracted using two other functions, predictably called (car) and (cdr).

> (car '(a (bc) d))
a

> (cdr '(a (bc) d))
((bc) d)

While many alternative names have been proposed for these two functions (head/tail, top/bottom, first/rest), these otherwise troublesome names have been retained because of one advantage: they can be composed easily into more complex functions:
> (car (cdr '(a (b c) d)))
(b c)

can be composed into

> (cadr '(a (b c) d))
(b c)

The Scheme standard mandates that these composed functions can up to a depth of at least 4 compositions (e.g., caadr, cdadr), and many interpeters allow an indefinite number of them.

List Manipulation
There are a variety of additional functions that compare lists and extract sublists. These include (list-ref), which returns the n member (zero-indexed) of a list:
> (list-ref '(a (b c) d) 0)
a
> (list-ref '(a (b c) d) 1)
(b c)
> (list-ref '(a (b c) d) 2)
d

(list-tail), which returns the sub-list starting at (zero-index) member n

> (list-tail '(a (b c) d) 1)
((b c) d)

Note that [tt](list-ref a-list n)[/tt] is equivalent to [tt](car (list-tail a-list n))[/tt].

(length), which returns the length of a list;

> (length '(a (b c) d))
3

(reverse), which returns a list in reverse order of the original:

> (reverse '(a (b c) d))
(d (b c) a)

... and (member), which finds the first instance of a given element and returns the sublist that it is the CAR of (or false if it is not found):

>(member 'e '(a b c d e f g))
(e f g)

> (member '(b c) '(a (b c) d))
((b c) d)

> (member 'b '(a (b c) d))
#f
Schol-R-LEA

Re:Scheme in Short

Post by Schol-R-LEA »

Association Lists
A list consisting of one or more list of exactly two elements each is called an associative list, or a-list. A-lists are a common data structure used for indexing information; the first member of a pair is called the key, while the second member is the data being indexed. For example, a list of programmers and the systems they developed may go like this:

Code: Select all

(define OS-Developers '(("Linus Torvalds" "Linux") 
                        ("Richard Stallman" "HURD")
                        ("Bill Jolitz" "FreeBSD")
                        ("Pype-Clicker" "Clicker32")
                        ("Tim Robinson" "Mobius")
                        ("Solar" "Pro-POS")
                        ("Jay Osako" "Thelema") 
                        (" Robert Szeleney" "SkyOS")))
The function (assoc) takes an a-list and a key value, and returns the key-value pair (or #f if it is not found in the list).

> (assoc "Jay Osako" OS-Developers)
("Jay Osako" "Thelema")

> (assoc "HURD" OS-Developers)
#f

> (assoc "Richard Stallman" OS-Developers)
("Richard Stallman" "HURD")
Schol-R-LEA

Re:Scheme in Short

Post by Schol-R-LEA »

Lambda Functions
The LISP family of languages are based, in part, on a mathematical formalism call Lambda Calculus, a notation for recursive functions. While a detailed explanation of lambda calculus is outside of the scope of this tutorial, it is important to understand that in the lambda calculus, there exists classes of functions whose domains and/or ranges are other functions.

In Scheme (and most other LISP derived languages), the form (lambda) is a generator of new functions: given an argument list and a function body, (lambda) returns a function. For example, the following code creates a nameless function and applies it to the list (a (b c) d):
[tt]
> ((lambda (x)
(if (list? x) (cadr x)
#f))
'(a (b c) d))
(b c)[/tt]

The function created by a (lambda) can be assigned to an identifier using (define):

Code: Select all

(define second 
  (lambda (x)
    (if (list? x) (cadr x)
        #f)))
In fact, the usual notation of

Code: Select all

(define (second x)
  (if (list? x) (cadr x)
      #f))
is nothing more than a shorthand for the (lambda) version, in the form of a syntax macro. This is also true of several other structures, including (let) and (begin).

But (lambda) is more than just a peculiar way of defining ordinary functions: it allows you to generate functions programmatically, at runtime. For example, the following function takes two other functions and composes them:

Code: Select all

(define compose
  (lambda (f g)
    (lambda (arg)
      (f (g arg)))))
This can be then used like so:

> ((compose (lambda (y) (* 2 y))
(lambda (z) (+ 3 z)))
5)
16

> ((compose (lambda (x) (* x x)) sqrt) -1)
-1

Lambda expressions may be used anywhere a function could be expected.
Schol-R-LEA

Re:Scheme in Short

Post by Schol-R-LEA »

Mutators
In most languages, the assignment statement is among the most important. Scheme, however, is designed to support a programming paradigm known as 'functional programming', which aims to simplfy program design by minimizing the amount of global state. This is primarily done by reducing the number of side-effects a function has - ideally, a function should have no affect on global state other than returning its value, and should always return the same value for a given set of arguments, as a mathematical function would.

While some FP languages try to enforce this ideal, Scheme does not force the programmer's hand. In Scheme, there exist a set of re-assignment functions, or mutators, which are by convention marked with an exclamation point to indicate that they actually change the value of one or more of their arguments. The most important mutator is (set!) which replaces the value of its first argument with that of its second argument:

> (define a 1)
> a
1
> (set! a 2)
> a
2

Note that (set!) can only change a variable that has already been initialized with (define). There also two list mutators, (set-car!) and (set-cdr!), which, predictably, change only the CAR or the CDR of the modified variable.

Use of these mutators is rare in Scheme, and is mostly used for changing global tables and so forth. In general, it is possible to avoid them in nearly 90% of all Scheme code.
Schol-R-LEA

Re:Scheme in Short

Post by Schol-R-LEA »

Input and Output
As with all programming languages, Scheme has ways of communicating to users and to peripherals. The general abstraction used for I/O is the port, which works similarly to a stream in C. A given port can be either an input port or an output port, but not both. Ports can be opened using (open-input-file) or (open-output-file), and closed with (close-input-port) and (close-output-port).

The most commonly used output function is (display), which sends a message to a given port (or the current port if the second argument is omitted); (display) can print a literal value for any primitive type of atom or list. (read) can be used to read from an input port, and (write) can write to an output port. The (read-char) and (write-char) functions, not surprisingly, read and write single characters to the given port; (peek-char) can be used to read in a character without removing it from the input stream. The (newline) function can be used to end a line of text.

Program Modules
Scheme usually works from the interpreter prompt, and the whole workspace environment is visible to all functions. Saving a program usually means saving the entire workspace. Hoever, there is a mechanism for loading programs from files, the (load) form. It is used as so:

Code: Select all

(load "filename.scm")
One of the major deficiencies of standard Scheme is the lack of a comprehensive mechanism for controlling modules and namespaces. While several libraries exist to support them, none of them have become universally accepted, and the standards committee has persistently denied any requests to add such a functionality to the core language, on the basis that there is no recognized formal model for modules or namespaces.
Schol-R-LEA

Re:Scheme in Short

Post by Schol-R-LEA »

Hygenic Macros

Scheme, like virtually all LISP dialects, has a built-in macro facility. However, the Scheme 'hygenic macro' system is quite different from the traditional (defmacro) approach used in other dialects, and is one of the more controversial aspects of the language. They are, in essence, a mechanism for defining new syntactic structures

A Scheme macro is defined using two forms, (let-syntax) and (syntax-rules). (let-syntax) is the simpler of thes two: it takes an identifier and the list of rules returned by (syntax-rules). (syntax-rules), in turn, takes a list of 'syntax' identifiers to be used by the macro, and a series of list pairs. Each of these pairs consists of a rule, which acts as a template or pattern for the macro, and a result, which is the code actually emitted. For example, this is a simple macro implementing a (while) loop:

Code: Select all

(define-syntax while 
  ; a simple while loop
  (syntax-rules ()
    ((while condition expr1 expr2 ...)
     (let loop ()
       (if condition 
           (begin expr1 expr2 ... (loop)))))))
This can be thus used:

> (define x 1)
> x
1
> (while (< x 5)
(display x)
(newline)
(set! x (+ 1 x)))
1
2
3
4


The (syntax-rules) mechanism is very powerful; for example, here is a detailed implementation of a Pascal-like FOR loop syntax:

Code: Select all

(define-syntax for
  ; a Pascal-style for loop
  (syntax-rules (:= to downto step)
    ; rule for complete upwards for loop
    ((for index := start to finish step modifier expr1 expr2 ...)
     (begin
       (define index start)
       (let loop ()
         (if (<= index finish)
             (begin expr1 expr2 ... (set! index (+ index modifier)) (loop))
             (set! index (- index modifier)))))) ; adjust index to correct final value 
    
    ; rules for complete downwards for loop 
    ((for index := start downto finish step modifier expr1 expr2 ...)
     (begin 
       (define index start)
       (let loop ()
         (if (>= index finish)
             (begin expr1 expr2 ... (set! index (- index modifier)) (loop))
             (set! index (+ index modifier)))))) ; adjust index to correct final value 
    
    ; rule for upward loop with default step
    ((for index := start to finish expr1 expr2 ...)
     (for index := start to finish step 1 expr1 expr2 ...))
    
    ; rule for downward loop with default step
    ((for index := start downto finish expr1 expr2 ...)
     (for index := start downto finish step 1 expr1 expr2 ...))))
Which can then be used as so:

> (for x := 2 to 12 step 2
(display x)
(display " "))

2 4 6 8 10 12
> x
12

Hygenic macros have their own local variable space, like lambda functions do; as a result, they avoid the problem of variable capture. However, critics point out that variable capture, while often a problem, can be a useful tool, and claim that forcibly preventing it makes certain common idioms unfeasible. The complexity of the mechanism has also been heavily criticized: the terms 'fascist' and 'baroque' have both been applied to Scheme macro system. Nontheless, they are a vitally important part of the language; most of the basic facilities of the language are usually implemented as macros, including (if), (cond), (define), (and), (or) , (let), and (do).



OK, folks, I'm into the home stretch here. Of the major parts of the language, all that remain are the eval/apply loop, deferred evaluation, and continuations. While I'd rather avoid them, as they really deserve a lot more attention than I can give them, I can't see how I can; they are Scheme, the real core and conceptual basis of the language. It would be like discussing Unix networking without mentioning sockets.

I've skipped a lot of detail (most of the predicates and math routines, vectors,complex math, polar and rectangular coordinates, string handling, varadic functions, etc.), but I felt that those weren't really key to undertanding the language. I tried, really tried, to keep this brief, but in writing this I've come to realize more than ever just how much there is even to a small language like Scheme.

I'd like to hear what people think of this. If I get a favorable response, I may write tutorials for Smalltalk and FORTH as well (though I don't know either of them as well as I do Scheme). C&CW.
Schol-R-LEA

Re:Scheme in Short

Post by Schol-R-LEA »

Deferred Evaluation

In functional programming, it is expected that a function should always return the same value for a given set of arguments. Furthermore, there are circumstances in which, if the evaluation of a given operation is delayed until the last possible moment at which the value is needed, certain actions can be eliminated oe done in a more optimal manner. The principle of deferred evaluation, also called lazy evaluation, is to avoid actually performing a computation until it's value is needed, and to never repeat the same computation. Many FP language use lazy evaluation by default; Scheme, on the other hand, does not, but does allow you to use it explicitly using two forms, (delay) and (force).

(delay) accepts a expression and returns a marker called a promise, which represents the expression. When (force) is applied to a promise, the first time, it causes the expression to be evaluated, and memoizes - that is to say, caches - the result. Subsequent applications of (force) to the same promise return the memoized value.
Schol-R-LEA

Re:Scheme in Short

Post by Schol-R-LEA »

Continuations

Like hygenic macros, continuations are one of the more difficult parts of Scheme to understand, as well as one of the most hotly debated features. Conceptually, when an expression in Scheme is evaluated, a special object is created, called a continuation which represents the subsequent steps in the computation. This object can be reified and saved by the form (call-with-current-continuation) , which is usually shortened to (call/cc). When (call/cc) is used to call a function, it passes that function the continuation which existed at the time of the call; if the function then invokes this continuation, program execution is halted and restarted at the point after the (call/cc) call was made. For example,

Code: Select all

(define call/cc call-with-current-continuation)

(define (continued x) 
  (call/cc (lambda (return)
             (let loop ((y 0))
               (if (eq? x y) (return)
                   (begin 
                     (display y)
                     (newline)
                     (display "next? ")
                     (loop (read))))))))
In this example, the loop runs until the value of x is entered, at which point it escapes back to the caller. (call/cc) is most often used to escape from nested loops, but it is a completely general mechanism, which can be used for a variety of purposes, including implementing multitasking.
Schol-R-LEA

Re:Scheme in Short

Post by Schol-R-LEA »

Evaluation and Application
We have now covered most of the key elements of the Scheme language, as well as some unique facets; however, all that is prelude. For now we will penetrate the Heart of the Mystery, the core pair of functions that make up the Scheme interpreter itself: (eval) and (apply).

The (eval) special form is a function which accepts a list and an optional environment variable (the default is the global environment), and - surprise! - evaluates the elements of the list in terms of that environment. It takes the first element of the list and if it is one of the primitive operations of the system, or if the identifier is bound to a function in the current environment, it calls (apply) with that function as the first argument and the rest of the list as the second.

The (apply) function in turn takes the passed function, and then (eval)s each of the elements in the list in turn, before using the returned values as the arguments for the function. To evaluate a non-primitive function, it would create a new environment for the function, take the list representling the program code and pass them both to (eval). The cycle continues until the expressions cannot be reduced any further.

This is the Scheme interpreter at it's most basic. The (eval)/(apply) cycle is the heart of every LISP type language. Everything else is just detail: if you can understand this recursive evaluation, then you can understand Scheme.
Schol-R-LEA

Finished - any comments?

Post by Schol-R-LEA »

That should take care of everything important that I can think of. Any comments?
Eero Ränik

Re:Scheme in Short (Finished!)

Post by Eero Ränik »

Very nice...
Post Reply