I just keep the visited set as a set and pass it as a parameter.  There are efficient log-time implementations of sets of any ordered type and extra-efficient sets of integers.
To represent a graph I use adjacency lists, or I'll use a finite map that maps each node to a  list of its successors.  It depends what I want to do.
Rather than Abelson and Sussman, I recommend Chris Okasaki's Purely Functional Data Structures.  I've linked to Chris's dissertation, but if you have the money, he expanded it into an excellent book.
Just for grins, here's a slightly scary reverse postorder depth-first search done in continuation-passing style in Haskell.  This is straight out of the Hoopl optimizer library:
postorder_dfs_from_except :: forall block e . (NonLocal block, LabelsPtr e)
                          => LabelMap (block C C) -> e -> LabelSet -> [block C C]
postorder_dfs_from_except blocks b visited =
 vchildren (get_children b) (\acc _visited -> acc) [] visited
 where
   vnode :: block C C -> ([block C C] -> LabelSet -> a) 
                      -> ([block C C] -> LabelSet -> a)
   vnode block cont acc visited =
        if setMember id visited then
            cont acc visited
        else
            let cont' acc visited = cont (block:acc) visited in
            vchildren (get_children block) cont' acc (setInsert id     visited)
      where id = entryLabel block
   vchildren bs cont acc visited = next bs acc visited
      where next children acc visited =
                case children of []     -> cont acc visited
                                 (b:bs) -> vnode b (next bs) acc     visited
   get_children block = foldr add_id [] $ targetLabels bloc
   add_id id rst = case lookupFact id blocks of
                      Just b -> b : rst
                      Nothing -> rst