This is something that should be very simple. I just want to read numbers and words from a text file that consists of tokens separated by white space. How do you do this in C#? For example, in C++, the following code would work to read an integer, float, and word. I don't want to have to use a regex or write any special parsing code.

ifstream in("file.txt");
int int_val;
float float_val;
string string_val;
in >> int_val >> float_val >> string_val;
in.close();

Also, whenever a token is read, no more than one character beyond the token should be read in. This allows further file reading to depend on the value of the token that was read. As a concrete example, consider

string decider;
int size;
string name;

in >> decider;
if (decider == "name")
    in >> name;
else if (decider == "size")
    in >> size;
else if (!decider.empty() && decider[0] == '#')
    read_remainder_of_line(in);

Parsing a binary PNM file is also a good example of why you would like to stop reading a file as soon as a full token is read in.

+6  A: 
using (FileStream fs = File.OpenRead("file.txt"))
{
    BinaryReader reader = new BinaryReader(fs);

    int intVal = reader.ReadInt32();
    float floatVal = reader.ReadSingle();
    string stringVal = reader.ReadString();
}
Brannon
That's reading from a *binary* file rather than a text file really. In particular, a file with contents "10 10.5 hello" won't be read to what you might expect. This may well match the C++ behaviour though, I'm not sure... It could well be that the OP is just misusing the phrase "text file".
Jon Skeet
I had already found this way of reading binary files. This is not what I want.
Joe
Hmm, yes... I misunderstood. I didn't realize C++ I/O streams would handle parsing from text like that.
Brannon
A: 

Try something like this:

http://stevedonovan.blogspot.com/2005/04/reading-numbers-from-file-in-c.html

IMHO, reading a C# tutorial first would be really useful, so that you have the whole picture in mind before asking.

javier
A: 

Not exactly the answer to your question, but just an idea to consider if you are new to C#: If you are using a custom text file to read some configuration parameters, you might want to check XML serialization topics in .NET.

XML serialization provides a simple way to write and read XML formatted files. For example, if you have a configuration class like this:

public class Configuration
{
   public int intVal { get; set; }
   public float floatVal { get; set; }
   public string stringVal { get; set; }
}

you can simply save it and load it using the XmlSerializer class:

public void Save(Configuration config, string fileName)
{
   XmlSerializer xml = new XmlSerializer(typeof(Configuration));
   using (StreamWriter sw = new StreamWriter(fileName))
   {
       xml.Serialize(sw, config);
   }
}

public Configuration Load(string fileName)
{
   XmlSerializer xml = new XmlSerializer(typeof(Configuration));
   using (StreamReader sr = new StreamReader(fileName))
   {
       return (Configuration)xml.Deserialize(sr);
   }
}

The Save method as defined above will create a file with roughly the following contents (the XML declaration and namespace attributes that XmlSerializer adds are not shown):

<Configuration>
    <intVal>0</intVal>
    <floatVal>0.0</floatVal>
    <stringVal></stringVal>
</Configuration>

The good thing about this approach is that you don't need to change the Save and Load methods if your Configuration class changes.

Groo
+7  A: 

Brannon's answer explains how to read binary data. If you want to read text data, you should be reading strings and then parsing them - for which there are built-in methods, of course.

For example, to read a file with data:

10
10.5
hello

You might use:

using (TextReader reader = File.OpenText("test.txt"))
{
    int x = int.Parse(reader.ReadLine());
    double y = double.Parse(reader.ReadLine());
    string z = reader.ReadLine();
}

Note that this has no error handling. In particular, it will throw an exception if the file doesn't exist, if either of the first two lines contains data that can't be parsed, or if there are fewer than two lines. It will leave a value of null in z if the file only has two lines.

For a more robust solution which can fail more gracefully, you would want to check whether reader.ReadLine() returned null (indicating the end of the file) and use int.TryParse and double.TryParse instead of the Parse methods.
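As a rough sketch of that more defensive approach (the class and method names here are made up for illustration, and parsing with the invariant culture is an assumption so that "10.5" is read the same way regardless of the machine's regional settings):

```csharp
using System;
using System.Globalization;
using System.IO;

static class LineValueReader
{
    // Hypothetical helper: reads an int, a double and a string from three
    // consecutive lines. Returns false instead of throwing when a line is
    // missing or cannot be parsed.
    public static bool TryReadValues(TextReader reader, out int x, out double y, out string z)
    {
        x = 0;
        y = 0.0;
        z = null;

        string first = reader.ReadLine();   // null means end of input
        if (first == null ||
            !int.TryParse(first, NumberStyles.Integer, CultureInfo.InvariantCulture, out x))
        {
            return false;
        }

        string second = reader.ReadLine();
        if (second == null ||
            !double.TryParse(second, NumberStyles.Float, CultureInfo.InvariantCulture, out y))
        {
            return false;
        }

        z = reader.ReadLine();
        return z != null;
    }
}
```

You would call it with the reader from File.OpenText, inside the same kind of using block as above, and decide what to do when it returns false. A missing file still throws from File.OpenText, so that case needs its own handling.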

That's assuming there's a line separator between values. If you actually want to read a string like this:

10 10.5 hello

then the code would be very similar:

using (TextReader reader = File.OpenText("test.txt"))
{
    string text = reader.ReadLine();
    string[] bits = text.Split(' ');
    int x = int.Parse(bits[0]);
    double y = double.Parse(bits[1]);
    string z = bits[2];
}

Again, you'd want to perform appropriate error detection and handling. Note that if the file really just consisted of a single line, you may want to use File.ReadAllText instead, to make it slightly simpler. There's also File.ReadAllLines which reads the whole file into a string array of lines.

EDIT: If you need to split by any whitespace, then you'd probably be best off reading the whole file with File.ReadAllText and then using a regular expression to split it. At that point I do wonder how you represent a string containing a space.

In my experience you generally know more about the format than this - whether there will be a line separator, or multiple values in the same line separated by spaces, etc.
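If you really do need "any whitespace" tokens read one at a time, one possible sketch is a small helper over TextReader that reads character by character. The TokenReader class and NextToken method are hypothetical names, not part of the framework. It consumes the token plus at most the single whitespace character that ends it:

```csharp
using System;
using System.IO;
using System.Text;

static class TokenReader
{
    // Hypothetical helper: returns the next whitespace-delimited token,
    // or null when only whitespace (or nothing) remains in the reader.
    public static string NextToken(TextReader reader)
    {
        int c;

        // Skip leading whitespace (spaces, tabs, line breaks).
        while ((c = reader.Read()) >= 0 && char.IsWhiteSpace((char)c))
        {
        }

        if (c < 0)
        {
            return null; // end of input before any token started
        }

        // Accumulate characters until whitespace or end of input.
        // The one whitespace character that ends the token is consumed.
        var token = new StringBuilder();
        do
        {
            token.Append((char)c);
        }
        while ((c = reader.Read()) >= 0 && !char.IsWhiteSpace((char)c));

        return token.ToString();
    }
}
```

Each token can then be passed to int.Parse, double.Parse, or used as a string, depending on what the previous token said to expect. Note that this only limits what is consumed from the TextReader itself: a StreamReader reads ahead from the underlying stream into its own buffer, so the stream position will generally be further along than the last token.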

I'd also add that mixed binary/text formats are generally unpleasant to deal with. Simple and efficient text handling tends to read into a buffer, which becomes problematic if there's binary data as well. If you need a text section in a binary file, it's generally best to include a length prefix so that just that piece of data can be decoded.
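To illustrate the length-prefix idea, here is a minimal sketch. The LengthPrefixedText class is a hypothetical name, and the 4-byte length followed by UTF-8 bytes is just one layout chosen for the example, not a standard format:

```csharp
using System.IO;
using System.Text;

static class LengthPrefixedText
{
    // Writes the byte count as a 4-byte integer, then the UTF-8 bytes.
    public static void Write(BinaryWriter writer, string text)
    {
        byte[] bytes = Encoding.UTF8.GetBytes(text);
        writer.Write(bytes.Length);
        writer.Write(bytes);
    }

    // Reads exactly the prefixed number of bytes and decodes only those,
    // leaving the reader positioned at the binary data that follows.
    public static string Read(BinaryReader reader)
    {
        int length = reader.ReadInt32();
        byte[] bytes = reader.ReadBytes(length);
        if (bytes.Length != length)
        {
            throw new EndOfStreamException("Text section is shorter than its length prefix.");
        }
        return Encoding.UTF8.GetString(bytes);
    }
}
```

Because the reader knows exactly how many bytes belong to the text, no buffering text reader is involved and the binary data after it is untouched. This only helps with formats you control, of course; it doesn't apply to an existing format with a whitespace-delimited header.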

Jon Skeet
This is a long and thought out response, but does not adequately match the behavior of the C++ code because it makes too many assumptions about the formatting of the file. All I want is a separation by white space of the tokens. The proposed solution would not handle, for example, 10 10.5 hello. ReadAllText and ReadAllLines do not really behave as desired either. Take for example the parsing of a binary PNM file. There is a header with white space separated tokens that is later followed by binary data. The reader should only eat one token at a time, and leave the remaining file alone.
Joe
Formatting of the file didn't work properly. Imagine 10 and 10.5 are on one line and the token 'hello' is on a line by itself.
Joe
A: 

I like using the StreamReader for quick and easy file access. Something like....

  String file = "data_file.txt";
  StreamReader dataStream = new StreamReader(file);
  string datasample;
  while ((datasample = dataStream.ReadLine()) != null)
  {
     // datasample has the current line of text - write it to the console.
     Console.WriteLine(datasample);
  }

-Paul

Paul
Note that you haven't closed the file there - you should have a `using` statement (or try/finally if you must). I also find that using `File.OpenText` is slightly simpler than calling the `StreamReader` constructor.
Jon Skeet
Thanks for the comment, I'll have a look at File.OpenText. I'm in a StreamReader rut.
Paul