ansaurus

Question

Help me understand how the conflict between immutability and running time is handled in Clojure.

Answer 1

+3 A:

Clojure's data structures are persistent, which means that they are immutable but use structural sharing to support efficient "modifications". See the section on immutable data structures in the Clojure docs for a more thorough explanation. In particular, it states:

Specifically, this means that the new version can't be created using a full copy, since that would require linear time. Inevitably, persistent collections are implemented using linked data structures, so that the new versions can share structure with the prior version.
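To make the structural sharing concrete, here is a minimal sketch of a persistent singly linked list in Python. It is not Clojure's implementation; the names `Cell`, `cons` and `to_list` are made up for this illustration. It only shows that a "modified" version can be built in constant time while reusing the older version untouched.

```python
from typing import NamedTuple, Optional


class Cell(NamedTuple):
    """One immutable cell of a singly linked list (hypothetical helper)."""
    head: object
    tail: Optional["Cell"]


def cons(value, lst):
    """Return a new list with value in front; lst is reused, not copied."""
    return Cell(value, lst)


def to_list(lst):
    """Walk the cells and collect the values into a plain Python list."""
    out = []
    while lst is not None:
        out.append(lst.head)
        lst = lst.tail
    return out


old = cons(2, cons(3, None))   # the list (2 3)
new = cons(1, old)             # the list (1 2 3), built in O(1)

print(to_list(old))      # [2, 3]  (the old version is unchanged)
print(to_list(new))      # [1, 2, 3]
print(new.tail is old)   # True: the new version shares the old cells
```

The same idea carries over to trees: a new version copies only the nodes on the path to the change and points at everything else in the old version.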

These posts, as well as some of Rich Hickey's talks, give a good overview of the implementation of persistent data structures.

Aaron Novstrup 2010-09-09 22:30:01

How complicated are deletes? Suppose I have 10 different sets or dictionaries, where they are created incrementally. And then I delete a node that is common to all 10 from the very first set/dict (the one containing just one element). What will clojure do and how long will it take?

Hamish Grubijan 2010-09-17 19:52:36

It doesn't matter whether the set/dictionary is built incrementally or all at once. A delete is just as efficient as an add -- it's effectively a constant time operation.

Aaron Novstrup 2010-09-18 00:39:38

Answer 2

+8 A:

The core immutable data structures are one of the most fascinating parts of the language for me too. There is a lot to answering this question, and Rich does a really great job of it in this video:

http://blip.tv/file/707974

The core data structures:

are actually fully immutable
the old copies are also immutable
performance does not degrade for the old copies
access time is constant (strictly speaking, bounded above by a constant)
all support efficient appending, concatenating (except lists and seqs) and chopping

How do they do this???

The secret: it's pretty much all trees under the hood (actually tries).

But what if I really want to edit something in place?

You can use Clojure's transients to edit a structure in place and then produce an immutable version (in constant time) when you are ready to share it.
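In Clojure itself this is done with `transient`, `conj!` and `persistent!`. The Python sketch below only illustrates one way the idea can work, an ownership token on the backing storage; it is an assumption for illustration, not Clojure's actual code. The classes `Node`, `PersistentVec` and `TransientVec` are hypothetical, and the flat list stands in for the real trie, so the immutable `conj` here is O(n).

```python
class Node:
    """Backing storage; `owner` names the transient allowed to mutate it."""
    __slots__ = ("items", "owner")

    def __init__(self, items, owner=None):
        self.items = items
        self.owner = owner


class PersistentVec:
    def __init__(self, node=None):
        self._node = node if node is not None else Node([])

    def conj(self, x):
        # Immutable update: build a new node, never touch the existing one.
        return PersistentVec(Node(self._node.items + [x]))

    def to_list(self):
        return list(self._node.items)

    def transient(self):
        return TransientVec(self._node)   # O(1): nothing is copied yet


class TransientVec:
    def __init__(self, node):
        self._token = object()            # unique ownership token
        self._node = node

    def conj(self, x):
        if self._token is None:
            raise RuntimeError("transient used after persistent()")
        if self._node.owner is not self._token:
            # First write: copy once so the source vector stays intact.
            self._node = Node(list(self._node.items), self._token)
        self._node.items.append(x)        # later writes mutate in place
        return self

    def persistent(self):
        if self._token is None:
            raise RuntimeError("transient used after persistent()")
        self._node.owner = None           # give up ownership: O(1)
        self._token = None                # this transient is now unusable
        return PersistentVec(self._node)


v1 = PersistentVec().conj(1).conj(2)
t = v1.transient()
for i in range(3, 6):
    t.conj(i)                             # in-place appends after one copy
v2 = t.persistent()

print(v1.to_list())   # [1, 2]
print(v2.to_list())   # [1, 2, 3, 4, 5]
```

The point of the token is that freezing does not need to walk or copy the data: once ownership is dropped, nothing can mutate the storage any more, so it is safe to share.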

As a little background: a trie is a tree in which keys that share a common prefix share the same path from the root. The sets and maps in Clojure use a hash trie, where the key used for indexing is a hash of the key you are looking up. The hash is broken into small chunks, and each chunk selects the branch to follow at one level of the trie. This lets the new and old maps share their common parts, and access time is bounded because the hash has a fixed size, so the trie can only be a fixed number of levels deep.
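A rough Python sketch of such a hash trie follows. It assumes a 32-bit hash split into 5-bit chunks, and it is simplified: every key sits at a fixed depth of 7 levels, whereas a real implementation stores keys higher up when it can. The function names `assoc`, `dissoc` and `get` are borrowed from Clojure, but the code is only an illustration.

```python
BITS = 5                    # bits of the hash consumed per level (assumed)
MASK = (1 << BITS) - 1      # 0b11111, so at most 32 branches per node
DEPTH = 7                   # ceil(32 / 5) levels cover a 32-bit hash


def _hash32(key):
    return hash(key) & 0xFFFFFFFF          # truncate to a fixed 32 bits


def _chunk(h, level):
    return (h >> (BITS * level)) & MASK    # the 5-bit chunk for this level


def get(node, key, default=None):
    h = _hash32(key)
    for level in range(DEPTH):             # never more than DEPTH steps
        node = node.get(_chunk(h, level))
        if node is None:
            return default
    for k, v in node:                      # bucket of keys with equal hashes
        if k == key:
            return v
    return default


def assoc(node, key, value, level=0, h=None):
    """Return a new trie with key -> value, copying only one path."""
    if h is None:
        h = _hash32(key)
    if level == DEPTH:
        kept = tuple((k, v) for k, v in (node or ()) if k != key)
        return kept + ((key, value),)
    node = node or {}
    c = _chunk(h, level)
    copy = dict(node)                      # shallow copy, at most 32 entries
    copy[c] = assoc(node.get(c), key, value, level + 1, h)
    return copy


def dissoc(node, key, level=0, h=None):
    """Return a new trie without key; the input trie is left untouched."""
    if h is None:
        h = _hash32(key)
    if node is None:
        return None
    if level == DEPTH:
        kept = tuple((k, v) for k, v in node if k != key)
        if len(kept) == len(node):
            return node                    # key was not present
        return kept or None
    c = _chunk(h, level)
    child = node.get(c)
    new_child = dissoc(child, key, level + 1, h)
    if new_child is child:
        return node                        # nothing changed below this node
    copy = {k: v for k, v in node.items() if k != c}
    if new_child is not None:
        copy[c] = new_child
    return copy if (copy or level == 0) else None


v1 = assoc({}, 0, "zero")
v2 = assoc(v1, 1, "one")
v3 = dissoc(v2, 0)

print(get(v1, 0), get(v1, 1))   # zero None
print(get(v2, 0), get(v2, 1))   # zero one
print(get(v3, 0), get(v3, 1))   # None one
print(v2[0] is v1[0])           # True: the untouched subtree is shared
print(v3[1] is v2[1])           # True: a delete shares structure as well
```

Both an add and a delete copy at most `DEPTH` small nodes no matter how many entries the map holds, and every earlier version stays valid.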

Using these hash tries also helps avoid the big slowdowns caused by the rebalancing that a lot of other persistent data structures need, so you actually get fairly constant wall-clock access time.

I really recommend the (relatively short) book Purely Functional Data Structures. In it the author covers a lot of really interesting structures and concepts, like "removing amortization" to allow true constant-time access for queues, and things like lazy persistent queues. The author even offers a free copy in PDF here.

Arthur Ulfeldt 2010-09-09 22:31:08

Is a "trei" different to a trie or is it just a typo?

poolie 2010-09-09 22:56:13

It's trie (pronounced "try"); I just had a bit of keyboard trouble :)

Arthur Ulfeldt 2010-09-09 23:07:38

From the wikipedia article: "The term trie comes from 'retrieval.' Following the etymology, the inventor, Edward Fredkin, pronounces it 'tree'. However, it is pronounced 'try' by other authors."

Aaron Novstrup 2010-09-10 00:01:51
