I have a string containing a valid Clojure form. I want to replace a part of it, just like with assoc-in, but processing the whole string as tokens.

=> (assoc-in [:a [:b :c]] [1 0] :new)
[:a [:new :c]]
=> (assoc-in [:a 
                [:b,, :c]] [1 0] :new)
[:a [:new :c]]
=> (string-assoc-in "[:a 
                       [:b,, :c]]" [1 0] ":new")
"[:a 
   [:new,, :c]]"

I want to write string-assoc-in. Note that its first and last arguments are strings, and that it keeps the line break and the commas. Is it doable in Clojure? The closest thing I found is read, which calls clojure.lang.LispReader, but I don't know how it works.

I want to use it to read a Clojure source file and display it with some modifications, keeping the structure of the file.

+1  A: 

I'm assuming you don't want to actually read in a form and evaluate it? fnparse has a Clojure parser (written in Clojure using fnparse). You might be able to use that to get you from string to form, then manipulate, then put it back out to a string?

Alex Miller
+2  A: 

You could do this with a combination of (read-string) and some string manipulation:

(defn string-assoc-in
  [a b c]
  (.replaceAll
    (str
     (assoc-in (read-string (.replaceAll a ",," ",_,")) b (read-string c)))
    " _ " ",, "))

user> (string-assoc-in "[:a [:b,, :c]]" [1 0] ":new")
"[:a [:new,, :c]]"

Note that we require a reserved placeholder character (in this case, _) which you wouldn't want in your keywords. The trick is to get those ,, out of the way when the reader is crunching on the vector string, then put them back.
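
To make the trick concrete, here is a rough REPL-style walk through the intermediate values for the example above. The names masked, form and updated are only illustrative and are not part of the answer's code:

```clojure
;; Step 1: hide the double commas behind the placeholder symbol _
(def masked (.replaceAll "[:a [:b,, :c]]" ",," ",_,"))
;; masked is "[:a [:b,_, :c]]"

;; Step 2: the reader treats the remaining commas as whitespace,
;; so _ survives as an ordinary symbol inside the inner vector
(def form (read-string masked))
;; form is [:a [:b _ :c]]

;; Step 3: update the data structure and print it back to a string
(def updated (str (assoc-in form [1 0] :new)))
;; updated is "[:a [:new _ :c]]"

;; Step 4: turn the placeholder (with its surrounding spaces) back into ",, "
(.replaceAll updated " _ " ",, ")
;; => "[:a [:new,, :c]]"
```

Step 4 only matches a placeholder that has a space on both sides, so the restore depends on how the printer happens to lay out the form.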

This sample doesn't address the newlines, but I think you could handle those in the same way.

Greg Harman
I don't follow - `(let [s "[:a [:b,, :c]]"] (string-assoc-in s [1 0] ":new"))` works fine? However, I do agree that the macro is unnecessary and that a function will work just as well (the macro was an artifact from my screwing around with solutions), so I'll edit the answer to use defn.
Greg Harman
@all: Greg is replying to a comment in which I mistakenly claimed that the above wouldn't work. I was going to replace it with an amended version -- by posting a slightly longer comment and deleting the original -- but, in a beautiful blunder, I clicked delete *first*. Sorry, not the way to go after the comment's been up for a couple of minutes. *sigh* @Greg: You're right, at any rate, sorry for the confusion.
Michał Marczyk
Upvoted this one for giving me the idea for my solution, however now I see that it exhibits the same / very similar bug which I have since discovered in my code (try e.g. `(string-assoc-in "[:a [:b,, :c,,]]" [1 0] ":new")` or `[:b,,]` or `[:b,,:c]`...). No avoiding a parser / special-purpose reader for this one, it seems.
Michał Marczyk
I like the simplicity of your trick; it gives me an idea. The current implementation fails in some cases, though; see Michał's counter-example.
Adam Schmideg
+4  A: 

Or another option would be to use ANTLR to parse the Clojure code into an AST, then transform the AST, and export back out to a string.

Alex Miller
Ah, this might be the best approach... CCW's grammar is likely to be comprehensive and well-maintained (and to stay that way over time!). However, my ANTLR-fu is still too weak for me to know how to extract stuff that's been placed on the "hidden channel". I thought that the lexer sees that, but the parser doesn't...?
Michał Marczyk
I didn't know there was a Clojure grammar file for ANTLR, thanks for the pointer. I'd prefer a pure Clojure solution, though.
Adam Schmideg
+2  A: 

I think this should work, be entirely general and not require its own reader / parser:

(defn is-clojure-whitespace? [c]
  (or (Character/isWhitespace c) ; Character/isSpace is deprecated
      (= \, c)))

(defn whitespace-split
  "Returns a map of true -> (maximal contiguous substrings of s
  consisting of Clojure whitespace), false -> (as above, non-whitespace),
  :starts-on-whitespace? -> (whether s starts on whitespace)."
  [s]
  (if (empty? s)
    {}
    (assoc (group-by (comp is-clojure-whitespace? first)
                     (map (partial apply str)
                          (partition-by is-clojure-whitespace? s)))
      :starts-on-whitespace?
      (is-clojure-whitespace? (first s)))))

(defn string-assoc-in [s coords subst]
  (let [{space-blocks true
         starts-on-whitespace? :starts-on-whitespace?}
        (whitespace-split s)
        s-obj (assoc-in (binding [*read-eval* false] (read-string s))
                        coords
                        (binding [*read-eval* false] (read-string subst)))
        {non-space-blocks false}
        (whitespace-split (pr-str s-obj))]
    (apply str
           (if starts-on-whitespace?
             (interleave space-blocks (concat non-space-blocks [nil]))
             (interleave non-space-blocks (concat space-blocks [nil]))))))

Example:

user> (string-assoc-in "[:a [:b,, :c]]" [1 0] ":new")
"[:a [:new,, :c]]"

Update: Ouch, caught a bug:

user> (string-assoc-in "[:a [:b,, :c\n]]" [1 0] ":new")
"[:a [:new,, :c]]\n"

I'd love it if it didn't matter, but I guess I'll have to try and do something about it... sigh
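
One possible way around the misplaced whitespace is to stop re-printing the form at all and instead walk the original string token by token, swapping only the token (or sub-form) found at the given coordinates. The following is a minimal sketch under loud assumptions, not a complete reader: tokenize and string-assoc-in-tokens are hypothetical names, the coordinates must be integer indices, and string literals containing whitespace or delimiters, comments, and reader macros such as ' or #{ are not handled.

```clojure
(defn tokenize
  "Splits s into whitespace/comma runs, single delimiters and atoms.
  Concatenating the tokens gives back s unchanged."
  [s]
  (re-seq #"[\s,]+|[\[\](){}]|[^\s,\[\](){}]+" s))

(def openers #{"[" "(" "{"})
(def closers #{"]" ")" "}"})

(defn- bump
  "Increments the child index of the innermost open collection, if any."
  [stack]
  (if (empty? stack)
    stack
    (conj (pop stack) (inc (peek stack)))))

(defn string-assoc-in-tokens
  "Sketch only: replaces the element at coords (integer indices) with the
  string subst, leaving every other character of s untouched."
  [s coords subst]
  (loop [[t & more] (tokenize s)
         stack []   ; path of the element that starts at the next token
         skip  0    ; > 0 while inside a collection that is being replaced
         out   []]
    (cond
      (nil? t) (apply str out)

      ;; inside the replaced collection: drop tokens until it closes
      (pos? skip)
      (cond
        (openers t) (recur more stack (inc skip) out)
        (closers t) (recur more (if (= skip 1) (bump stack) stack) (dec skip) out)
        :else       (recur more stack skip out))

      ;; whitespace and commas are copied through verbatim
      (re-matches #"[\s,]+" t) (recur more stack skip (conj out t))

      (openers t)
      (if (= stack coords)
        (recur more stack 1 (conj out subst))          ; replace whole sub-form
        (recur more (conj stack 0) skip (conj out t))) ; descend into it

      (closers t) (recur more (bump (pop stack)) skip (conj out t))

      ;; an atom: replace it if its path matches, then move to the next index
      :else
      (recur more (bump stack) skip
             (conj out (if (= stack coords) subst t))))))
```

With this sketch the problematic input keeps its newline in place:

```clojure
(string-assoc-in-tokens "[:a [:b,, :c\n]]" [1 0] ":new")
;; => "[:a [:new,, :c\n]]"
```

Because nothing is re-printed, the substitution is inserted as raw text and is never checked for being a valid form.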

Michał Marczyk
I like this trick of splitting on whitespace and then interleaving again. It shows me a way to do that without having to write a reader.
Adam Schmideg
I didn't want to write a reader. Ironically, thinking about your answer has led me to writing one.
Adam Schmideg