views:

102

answers:

1

I apologize in advance for my primitive english; i will try my best to avoid grammatical errors and such.

Two weeks ago i decided to freshen my knowledge of Scheme (and its enlightnings) whilst implementing some math material i got between hands, specifically, Regular Languages from a course on Automata theory and Computation in which i am enrolled.

So far, i've been representing alphabets as lists of symbols instead of

  1. lists of chars because i want to have letters of variable size.
  2. lists of strings because i somewhat feel thats unelegant.

I'm inexperienced and would like to know your opinion on this particular choice. Are symbols reserved for some particular kind of task and by this am i misusing them? Any comment on this is very appreciated for i'm seeking guidance.

On a further reach, there will be also the time to implement the set of all possible words over an alphabet, which is infinite. I was thinking to limit the set by the maximum size of words permitted. Again, is this a good practice or should i instead go for the streams? I feel that streams would be a better approach but i haven't learned them yet so i don't really know the implications are of using them.

Anyhow, any suggestion or commentary is welcomed. I really appreciate the time you took to read my doubts. Have a nice weekend!

+4  A: 

The simple difference is that symbols are extremely cheap to compare. One can use eq? to compare two symbols as identical. This is because during compile time, the compiler will essentially enumerate all symbols with a number, so you're just comparing integers, a very cheap machine operation.

This means that two symbols are identical if and only if they are composed from the same characters. There is no way to discern two symbols that have the same characters in code, they are constants, they are the same objects, just as 3 and 3.

Two strings however can very well be different objects residing in different locations of memory, and to compare them requires comparing every character separately for a match.

So symbols should, and are often used for this, for instance, considering:

(assq 'symb  '((foo 1 2 3) (bar symb) (symb "match")))

This returns the list (symb "match") this is as cheap as comparing:

(assq 4  '((0 1 2 3) (1 symb) (4 "match")))

Which returns the list (4 "match"). However, when using strings as keys assq no longer suffices, which uses the eq? procedure for comparison, rather assoc must be used, which uses the equal? procedure, which is a lot more costly since it recursively compares structure. The above implementation is actually cheap enough to often be used as a way to model associative arrays and environments in interpreters, even though it's certainly not random access.

Edit: As you asked, on streams:

The Scheme standard supports a nice pair, one is a function called force, the other though, is a special form called delay. Effectively what

(delay (+ 1 2 3))

Or any other code in lieu of (+ 1 2 3) returns is a so-called 'promise', it delays the computation of the answer in that promise, but promises that the result will be identical when evaluated to that 6 we get there. This might seem quite useless, but say the result depends on some variable, which can be changed:

(define x 4) ; x is bound to 4.
(let ((promise (delay (+ 2 x)))) ; evaluating the expression now would result into 6, but we delay it.
  (set! x 5) ; we change the value of x.
  (display (force promise)) ; displays 7
  (set! x 6) ; change it again
  (force promise) ; ALSO displays 7, not 8
)

It becomes apparent that the promise is indeed evaluated only once, and when forced again, it yields the same value as when evaluated the first time.

This is what is typically used in streams, or conceptual 'infinite lists'. The cdr of the list in this case is a promise for the rest of the list which is forced when it's retrieved, as else it would turn into a nonterminating computation, for instance, commonly we define:

(define (stream-cdr x) (force (cdr x))) ; it forces that promise to evaluate it.
; and optionally
(define stream-car car) ; same

As this one can't evaluate its argument, it needs to be a special form:

(define-syntax stream-cons
  (syntax-rules ()
    ((stream-cons x y)
     (cons x (delay y))))

Your Scheme compiler or interpreter will now translate every occurence of (stream-cons x y) to (cons x (delay y)) for any arbitrary x and y.

So, as a simple example, now that our evaluation is delayed until we need it, we can create an infinite list of zeros:

(define zero-stream (stream-cons 0 zero-streams))

A list consed up to itself, which would surely never terminate if we hadn't used a stream, useless, but it shows the idea. We can just use stream-cdr over and over again on this without ever either reaching an empty list, we just get the same 'infinite list of 0's' back again.

A more practical example would be the list of all fibonacci numbers, which is some-what more complex:

(define fibs 
  (stream-cons 0 (stream-cons 1
    (stream-map + fibs (stream-cdr fibs))))

Stream-map is the analogue of the normal map, it's definition is quite complex but I'm sure you can look it up, it generates a stream itself. So `(stream-map (lambda (x y) (+ 1 x y)) zeroes zeroes) would generate a stream completely filled with one's. The fibs stream is recursively defined in itself. The first two elements are given and the rest is computed from two streams which just happen to be fibs and the stream-cdr of fibs itself.

Lajla
Cheers! Yours a very easy and concise explanation of the implication of using symbols. Completely understood everything you said without a sweat :) I'll stick to them then.By the way, since i feel this thread won't attract any more answers, do you have a spare word on streams Or any other commentary or suggestion to a newbie like me?Either way, thanks again for taking the time to answer. Much appreciated.
Landau
Done, editted it.
Lajla
Wonderful exposition! Thank you very much :)
Landau