I know this is asked commonly but googling doesn't turn up a definitive answer for Mathematica so I thought it would be valuable to have this on StackOverflow.

I've been doing this with Import but it occurred to me that that might be horribly inefficient, Import being such a heavyweight function.

So the question is, can you improve on the following:

slurp[filename_] := Import[filename, "Text"]
+2  A: 

For importing the entire file at once, the only other option that I am aware of is ReadList. It can be coaxed into returning the entire file as a single string (wrapped in a one-element list) as follows (see note 1):

In[1]:= ReadList["ExampleData/source", Record, RecordSeparators -> {}]
Out[1]= {"f[x] (: function f :)\r\ng[x] (: function g :)\r\n"}

(Note: \r and \n are actually interpreted in the output, but I left them in for readability.) The key is to remove any RecordSeparators. But, I honestly don't think this saves you anything, and Import[ <file>, "Text"] is easier to write. Truthfully, I use Read[ <file>, String] when I have data in a format that isn't covered by the type specifiers used in Read and ReadList, and build a custom function around this operation to load in all of the data.
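
As a rough sketch of the two approaches mentioned above: the first definition wraps the ReadList call so that it returns a bare string instead of a one-element list, and the second reads a file one line at a time with Read[stream, String], which is one way to parse as you go. The names slurpReadList and mapLines are hypothetical, introduced only for illustration, and the two-argument form of First assumes Mathematica 10 or later. Neither function does any error handling for missing files.

```mathematica
(* slurpReadList is a hypothetical name. ReadList returns a list of records;
   with no record separators there is at most one record, so take the first.
   The second argument to First supplies "" in case nothing was read. *)
slurpReadList[filename_String] :=
  First[ReadList[filename, Record, RecordSeparators -> {}], ""]

(* mapLines is also a hypothetical name: apply f to each line without first
   loading the whole file as one string. Read[stream, String] returns one
   line per call and EndOfFile once the stream is exhausted. *)
mapLines[filename_String, f_] :=
  Module[{stream = OpenRead[filename], line},
    First[Last[Reap[
      While[(line = Read[stream, String]) =!= EndOfFile, Sow[f[line]]];
      Close[stream]
    ]], {}]
  ]
```

For example, mapLines["ExampleData/source", StringLength] should give the length of each line, while slurpReadList["ExampleData/source"] should give the same string as the ReadList call above without the enclosing list.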


  1. You can find this in the Reading Textual Data tutorial.
rcollyer
Import[file, "String"] uses this ReadList[] syntax to suck in the file. Speed-wise, the two should be very similar. The "Text" format also does line-ending normalization and maybe handles character encodings.
Joshua Martell
@Joshua, having not seen the underlying code myself, I suspect you are correct about their relative speeds. That being said, the `Import` syntax for loading an entire file into a string is much simpler, so it is less likely that bugs will be introduced through its use. On the other hand, my data is often structured, but not in a way that is handled by `Import`, `Read`, or `ReadList`. So, I tend to parse the files as I go, as opposed to loading them all at once and then parsing.
rcollyer
@rcollyer, your use of `Read` and `ReadList` sounds very sensible. They both take lists of data types (like a struct) that you might also find useful. The same goes for their Binary counterparts.
Joshua Martell
@Joshua, true, and I use that when I can. But, if some of the text is markup, and not, strictly speaking, data, then it is less useful. Personally, I would like to be able to define file formats that `Import`/`Export` could use, so that file formats I use can be treated on par with the built-in types.
rcollyer