Here's the brief problem:
Input: a list of strings, each containing numbers
(" 3.4 5.4 1.2 6.4" "7.8 5.6 4.3" "1.2 3.2 5.4")

Output: a list of numbers
(3.4 5.4 1.2 6.4 7.8 5.6 4.3 1.2 3.2 5.4)

Here's my attempt at coding this:

(defun parse-string-to-float (line &optional (start 0))
  "Parses a list of floats out of a given string"
  (if (equalp "" line)
    nil
    (let ((num (multiple-value-list (read-from-string (subseq line start)))))
      (if (null (first num))
        nil
        (cons (first num) (parse-string-to-float (subseq line (+ start (second num)))))))))

(defvar *data* (list "  3.4 5.4 1.2 6.4" "7.8 5.6 4.3" "1.2 3.2 5.4"))

(setf *data* (format nil "~{~a ~}" *data*))

(print (parse-string-to-float *data*))

===> (3.4 5.4 1.2 6.4 7.8 5.6 4.3 1.2 3.2 5.4)

However, for rather large data sets, it's a slow process. I'm guessing the recursion isn't as tight as possible and I'm doing something unnecessary. Any ideas?

Furthermore, the grand project involves taking an input file that has various data sections separated by keywords. Example -

%FLAG START_COORDS
1   2   5   8   10   12  
%FLAG END_COORDS  
3   7   3   23   9   26
%FLAG NAMES
ct  re  ct  cg  kl   ct

etc... I'm trying to parse this into a hash table, with the keywords that follow %FLAG as the keys and the values stored as lists of numbers or strings, depending on the keyword being parsed. Are there libraries that already do this kind of job, or simple ways to do it in Lisp?

+6  A: 

This is not a task you want to be doing recursively to begin with. Instead, use LOOP and a COLLECT clause. For example:

(defun parse-string-to-floats (line)
  (loop
    :with pos := 0
    :for float := (multiple-value-bind (value next)
                      (read-from-string line nil nil :start pos)
                    (setf pos next)   ; resume right after the object just read
                    value)
    :while float                      ; READ-FROM-STRING returns NIL at end of string
    :collect float))

Also, you might want to consider using WITH-INPUT-FROM-STRING instead of READ-FROM-STRING, which makes things even simpler.

(defun parse-string-to-float (line)
  (with-input-from-string (s line)
    (loop
      :for num := (read s nil nil)
      :while num
      :collect num)))

As for performance, you might want to do some profiling, and ensure that you are actually compiling your function.
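To check the compilation point, here is a minimal sketch using only standard functions: COMPILED-FUNCTION-P reports whether a definition is compiled, and COMPILE compiles it in place if it is not. The function is the stream-based one from above, repeated so the snippet stands alone.

```lisp
(defun parse-string-to-float (line)
  (with-input-from-string (s line)
    (loop
      :for num := (read s nil nil)
      :while num
      :collect num)))

;; Some implementations (SBCL, for example) compile every DEFUN;
;; others may interpret definitions typed at the REPL.
(unless (compiled-function-p #'parse-string-to-float)
  (compile 'parse-string-to-float))

(print (parse-string-to-float " 3.4 5.4 1.2 6.4"))
```

Profilers themselves are implementation-specific, so which one to use depends on your Lisp.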

Pillsy
with-input-from-string? Wow! Learn something new every day.
khedron
+2  A: 

Also for performance, try

(declare (optimize (speed 3)))

inside your defun. Some Lisps (for example SBCL) will then print helpful notes about where they could not optimize, and the estimated cost of each missed optimization.
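For placement, a sketch of where the declaration goes (the function body is borrowed from the stream-based answer in this thread; any body works the same way). It must come before the body forms, next to the docstring if there is one:

```lisp
(defun parse-string-to-floats (line)
  "Read every number in LINE and return them as a list."
  (declare (optimize (speed 3)))   ; must precede the body forms
  (with-input-from-string (s line)
    (loop :for num := (read s nil nil)
          :while num
          :collect num)))

(print (parse-string-to-floats " 3.4 5.4 1.2 6.4"))
```

What the compiler prints in response, if anything, depends on the implementation.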

johanbev
+2  A: 

As for performance, try at least measuring the memory allocation. I guess that all the performance is eaten by memory allocation and GC: you allocate a lot of big strings with subseq. E.g., (time (parse-string-to-float ..)) will show you how much time is spent in your code, how much in GC and how much memory was allocated.

If this is the case, then use a string stream (as with-input-from-string does) to decrease GC pressure.
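A rough way to try this, as a sketch: build a large test string and wrap the call in TIME. The names parse-floats-from-stream and *big-line* are made up for this example, and the test data is synthetic. The report TIME prints (run time, GC time, bytes allocated) differs between implementations.

```lisp
;; Hypothetical helper: reads all numbers from one string stream,
;; so no intermediate substrings are allocated.
(defun parse-floats-from-stream (line)
  (with-input-from-string (s line)
    (loop :for num := (read s nil nil)
          :while num
          :collect num)))

;; Synthetic input: 10000 copies of "3.4", each followed by a space.
(defparameter *big-line*
  (format nil "~{~a ~}" (loop :repeat 10000 :collect "3.4")))

;; LENGTH keeps the REPL from printing the whole result list.
(time (length (parse-floats-from-stream *big-line*)))
```

Running the same TIME form on the original SUBSEQ-based function should show whether allocation is really where the time goes.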

dmitry_vk
+5  A: 

Parse a single string:

(defun parse-string-to-floats (string)
  (let ((*read-eval* nil))
    (with-input-from-string (stream string)
      (loop for number = (read stream nil nil)
            while number collect number))))

Process a list of strings and return a single list:

(defun parse-list-of-strings (list)
  (mapcan #'parse-string-to-floats list))

Example:

CL-USER 114 > (parse-list-of-strings (list "1.1 2.3 4.5" "1.17 2.6 7.3"))
(1.1 2.3 4.5 1.17 2.6 7.3)

Note:

Using READ to read float values from streams is a costly operation. There are libraries like PARSE-NUMBER that might be more efficient, and some Common Lisp implementations might also have the equivalent of a READ-FLOAT / PARSE-FLOAT function.

Rainer Joswig