I've found myself using the following idiom lately in Clojure code:

(def *some-global-var* (ref {}))

(defn get-global-var []
  @*some-global-var*)

(defn update-global-var [val]
  (dosync (ref-set *some-global-var* val)))

Most of the time this isn't even multi-threaded code that needs the transactional semantics refs give you. It just feels like refs are useful beyond threaded code, for basically any global that holds an immutable value. Is there a better practice for this? I could try to refactor the code to use binding or let instead, but that can get particularly tricky for some applications.

+1  A: 

I always use an atom rather than a ref when I see this kind of pattern - if you don't need transactions, just a shared mutable storage location, then atoms seem to be the way to go.

e.g. for a mutable map of key/value pairs I would use:

(def state (atom {}))

(defn get-state [key]
  (@state key))

(defn update-state [key val]
  (swap! state assoc key val))
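
As a rough sketch of why swap! fits this pattern: it applies a function to the atom's current value atomically, so a read-modify-write update needs no dosync. The names hits and record-hit! are made up for illustration, and update assumes Clojure 1.7 or later.

```clojure
;; Hypothetical hit counter; not part of the code above.
(def hits (atom {}))

(defn record-hit!
  "Atomically increments the count stored under page."
  [page]
  ;; fnil makes inc treat a missing key as 0.
  (swap! hits update page (fnil inc 0)))

(record-hit! :home)
(record-hit! :home)
(record-hit! :about)

@hits ;; => {:home 2, :about 1}
```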
mikera
+4  A: 

Your functions have side effects. Calling them twice with the same inputs may give different return values depending on the current value of *some-global-var*. This makes things difficult to test and reason about, especially once you have more than one of these global vars floating around.

People calling your functions may not even know that they depend on the value of the global var without inspecting the source. What if they forget to initialize the global var? It's easy to forget. What if you have two sets of code both trying to use a library that relies on these global vars? They are probably going to step all over each other, unless you use binding. You also add overhead every time you access data from a ref.
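
For reference, here is a minimal sketch of the binding escape hatch mentioned above. The var *greeting* and the function greet are hypothetical, and the ^:dynamic metadata is required for binding in Clojure 1.3 and later.

```clojure
;; Hypothetical dynamic var; binding only works on vars marked ^:dynamic.
(def ^:dynamic *greeting* "Hello")

(defn greet [name]
  (str *greeting* ", " name))

(greet "Brian")                  ;; => "Hello, Brian"

;; The rebinding is visible only on this thread, inside this form.
(binding [*greeting* "Howdy"]
  (greet "Brian"))               ;; => "Howdy, Brian"

(greet "Brian")                  ;; => "Hello, Brian"
```

Each caller gets its own value this way, but greet still depends on an input that doesn't appear in its argument list.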

If you write your code side-effect free, these problems go away. A function stands on its own. It's easy to test: pass it some inputs, inspect the outputs, they'll always be the same. It's easy to see what inputs a function depends on: they're all in the argument list. And now your code is thread-safe. And probably runs faster.

It's tricky to think about code this way if you're used to the "mutate a bunch of objects/memory" style of programming, but once you get the hang of it, it becomes relatively straightforward to organize your programs this way. Your code generally ends up as simple as or simpler than the global-mutation version of the same code.

Here's a highly contrived example:

(def *address-book* (ref {}))

(defn add [name addr]
  (dosync (alter *address-book* assoc name addr)))

(defn report []
  (doseq [[name addr] @*address-book*]
    (println name ":" addr)))

(defn do-some-stuff []
  (add "Brian" "123 Bovine University Blvd.")
  (add "Roger" "456 Main St.")
  (report))

Looking at do-some-stuff in isolation, what the heck is it doing? There are a lot of things happening implicitly. Down this path lies spaghetti. An arguably better version:

(defn make-address-book [] {})

(defn add [addr-book name addr]
  (assoc addr-book name addr))

(defn report [addr-book]
  (doseq [[name addr] addr-book]
    (println name ":" addr)))

(defn do-some-stuff []
  (let [addr-book (make-address-book)]
    (-> addr-book
        (add "Brian" "123 Bovine University Blvd.")
        (add "Roger" "456 Main St.")
        (report))))

Now it's clear what do-some-stuff is doing, even in isolation. You can have as many address books floating around as you want. Multiple threads could have their own. You can use this code from multiple namespaces safely. You can't forget to initialize the address book, because you pass it as an argument. You can test report easily: just pass the desired "mock" address book in and see what it prints. You don't have to care about any global state or anything but the function you're testing at the moment.
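
One way to do the "see what it prints" part is with-out-str, which captures everything printed inside it as a string. This is only a sketch: report is repeated from above so the snippet stands alone, and mock-book is a made-up fixture.

```clojure
(require '[clojure.string :as str])

;; Same report as above, repeated so this snippet runs on its own.
(defn report [addr-book]
  (doseq [[name addr] addr-book]
    (println name ":" addr)))

;; Hypothetical "mock" address book, just a plain map.
(def mock-book {"Brian" "123 Bovine University Blvd."})

;; Capture the printed output and split it into lines.
(str/split-lines (with-out-str (report mock-book)))
;; => ["Brian : 123 Bovine University Blvd."]
```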

If you don't need to coordinate updates to a data structure from multiple threads, there's usually no need to use refs or global vars.
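
For contrast, here is a sketch of the kind of case where refs do earn their keep: two pieces of state that must change together. The account refs and transfer! are hypothetical.

```clojure
;; Hypothetical accounts, purely for illustration.
(def checking (ref 100))
(def savings  (ref 0))

(defn transfer!
  "Moves amount from one ref to the other in a single transaction."
  [from to amount]
  (dosync
    (alter from - amount)
    (alter to + amount)))

(transfer! checking savings 30)

[@checking @savings] ;; => [70 30]
```

No other thread can observe the money missing from both accounts, which is something a pair of independent atoms can't guarantee.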

Brian Carper
I'm no stranger to the functional approach you describe, but sometimes the convenience of a global location that holds state is useful. All functional approaches break down at the edges, the most common case being IO. You could consider this a special case of IO, since it is effectively global to all threads. Don't get me wrong: I prefer the functional approach, and my example usage of a ref above is overly simplistic, so for the most part I agree with you.
Jeremy Wall