Stuart Halloway gives the example

(re-seq #"\w+" "The quick brown fox")

as the natural way to find regex matches in Clojure. In his book, this construction is contrasted with iteration over a matcher. If all one cared about were a list of matches, this would be great. However, what if I want the matches and their positions within the string? Is there a better way of doing this that lets me leverage the existing functionality in java.util.regex without resorting to something like a sequence comprehension over each index in the original string? In other words, one would like to type something like

(re-seq-map #"[0-9]+" "3a1b2c1d")

which would return a map with keys as the position and values as the matches, e.g.

{0 "3", 2 "1", 4 "2", 6 "1"}

Is there an implementation of this in an existing library already, or shall I write it (it shouldn't be too many lines of code)?

+2  A: 

You can fetch the data you want out of a java.util.regex.Matcher object.

user> (defn re-pos [re s]
        (loop [m (re-matcher re s)
               res {}]
          (if (.find m)
            (recur m (assoc res (.start m) (.group m)))
            res)))
#'user/re-pos
user> (re-pos #"\w+" "The quick brown fox")
{16 "fox", 10 "brown", 4 "quick", 0 "The"}
user> (re-pos #"[0-9]+" "3a1b2c1d")
{6 "1", 4 "2", 2 "1", 0 "3"}
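Plain maps make no ordering guarantee, which is why the keys above do not print in ascending order. If the positions should come back in order, as in the question's expected output, one possible variant is to accumulate into a sorted map instead. This is only a sketch: the name `re-pos-sorted` is hypothetical and not part of any library.

```clojure
;; Sketch only: `re-pos-sorted` is a hypothetical name, not a core function.
(defn re-pos-sorted
  "Returns a sorted map of start index -> matched text for each match of re in s."
  [re s]
  (let [m (re-matcher re s)]          ; a stateful java.util.regex.Matcher
    (loop [res (sorted-map)]          ; sorted-map keeps keys in ascending order
      (if (.find m)                   ; advance the matcher to the next match, if any
        (recur (assoc res (.start m) (.group m)))
        res))))

(re-pos-sorted #"[0-9]+" "3a1b2c1d")
;; => {0 "3", 2 "1", 4 "2", 6 "1"}
```

The only change from `re-pos` is the initial accumulator; the matcher is still consumed with `.find`, `.start`, and `.group`.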
Brian Carper
Thanks Brian. Maybe re-pos should find its way into the regex library.
Gabriel Mitchell