Hello everybody. I am looking for a fast class for working with text files that makes it convenient to read different kinds of values (methods like NextInt32, NextDouble, NextLine, etc.). Can you recommend something?

Edit: BinaryReader is a bad fit in my case. My data is not in a binary format. I have a file like:

1 2 3
FirstToken NextToken
1.23 2,34

And I want to read this file with code like:

int a = FileReader.NextInt32();
int b = FileReader.NextInt32();
int c = FileReader.NextInt32();
string d = FileReader.NextString();
string e = FileReader.NextString();
double f = FileReader.NextDouble();
double g = FileReader.NextDouble();

Edit2: I am looking for an analog of Java's Scanner.

+2  A: 

Have you checked out the BinaryReader class? Yes it's a text file but there is nothing stopping you from treating it as binary data and hence using BinaryReader. It has all of the methods that you are looking for with the exception of ReadLine. However it wouldn't be too difficult to implement that method on top of BinaryReader.

JaredPar
Agreed, except that implementing ReadLine on top of `BinaryReader` *would* be some task, if you design it to handle arbitrary encodings and not interrupt the reading of binary data.
Noldorin
BinaryReader will read doubles from Text?
Henk Holterman
@Henk, yes. It just sees a stream of bytes. As long as the encoding is correct it will read them.
JaredPar
@Noldorin, yeah making it Encoding agnostic would be a bit tricky.
JaredPar
@JaredPar: Using BinaryReader.ReadDouble won't read *text* and parse it though. It will treat that text as bytes, which isn't the idea.
Jon Skeet
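To illustrate the point in the comment above, here is a minimal sketch contrasting the two interpretations of the same characters. The class and method names (`BinaryVsText`, `ReadAsBinary`, `ReadAsText`) are hypothetical and exist only for this illustration; ASCII input and invariant-culture number formatting are assumed.

```csharp
using System;
using System.Globalization;
using System.IO;
using System.Text;

static class BinaryVsText
{
    // Reinterprets the first 8 bytes of the text as an IEEE 754 double,
    // which is what BinaryReader.ReadDouble does. The text needs at least
    // 8 bytes, otherwise EndOfStreamException is thrown.
    public static double ReadAsBinary(string text)
    {
        byte[] bytes = Encoding.ASCII.GetBytes(text);
        using (var reader = new BinaryReader(new MemoryStream(bytes)))
        {
            return reader.ReadDouble();
        }
    }

    // Parses the characters as a decimal number, which is what the
    // question is actually asking for.
    public static double ReadAsText(string text)
    {
        return double.Parse(text.Trim(), CultureInfo.InvariantCulture);
    }
}
```

With the question's third line, `ReadAsBinary("1.23 2,34")` returns whatever double the raw character codes happen to encode, not 1.23, while `ReadAsText("1.23")` returns the number the text spells out.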
BinaryReader is a bad class for me. Look at my edit.
DreamWalker
JaredPar, that is not what I would call a __Text__ file. The BinaryReader is more useful, but only if the OP owns the format.
Henk Holterman
Which apparently he doesn't.
Henk Holterman
A: 

The System.IO.BinaryReader class is what you need.

Example of implementation of a ReadLine method:

public static class Extensions
{
    public static String ReadLine(this BinaryReader binaryReader)
    {
        var bytes = new List<Byte>();
        byte temp;

        while ((temp = (byte)binaryReader.Read()) < 10)
            bytes.Add(temp);

        return Encoding.Default.GetString(bytes.ToArray());
    }
}

Example for using this class:

using System;
using System.IO;
using System.Security.Permissions;

class Test
{
    static void Main()
    {
        // Load application settings.
        AppSettings appSettings = new AppSettings();
        Console.WriteLine("App settings:\nAspect Ratio: {0}, " +
            "Lookup directory: {1},\nAuto save time: {2} minutes, " +
            "Show status bar: {3}\n",
            new Object[4]{appSettings.AspectRatio.ToString(),
            appSettings.LookupDir, appSettings.AutoSaveTime.ToString(),
            appSettings.ShowStatusBar.ToString()});

        // Change the settings.
        appSettings.AspectRatio   = 1.250F;
        appSettings.LookupDir     = @"C:\Temp";
        appSettings.AutoSaveTime  = 10;
        appSettings.ShowStatusBar = true;

        // Save the new settings.
        appSettings.Close();
    }
}

// Store and retrieve application settings.
class AppSettings
{
    const string fileName = "AppSettings#@@#.dat";
    float  aspectRatio;
    string lookupDir;
    int    autoSaveTime;
    bool   showStatusBar;

    public float AspectRatio
    {
        get{ return aspectRatio; }
        set{ aspectRatio = value; }
    }

    public string LookupDir
    {
        get{ return lookupDir; }
        set{ lookupDir = value; }
    }

    public int AutoSaveTime
    {
        get{ return autoSaveTime; }
        set{ autoSaveTime = value; }
    }

    public bool ShowStatusBar
    {
        get{ return showStatusBar; }
        set{ showStatusBar = value; }
    }

    public AppSettings()
    {
        // Create default application settings.
        aspectRatio   = 1.3333F;
        lookupDir     = @"C:\AppDirectory";
        autoSaveTime  = 30;
        showStatusBar = false;

        if(File.Exists(fileName))
        {
            BinaryReader binReader =
                new BinaryReader(File.Open(fileName, FileMode.Open));
            try
            {
                // If the file is not empty,
                // read the application settings.
                // First read 3 bytes into a buffer to
                // determine if the file is empty.
                byte[] testArray = new byte[3];
                int count = binReader.Read(testArray, 0, 3);

                if (count != 0)
                {
                    // Reset the position in the stream to zero.
                    binReader.BaseStream.Seek(0, SeekOrigin.Begin);

                    aspectRatio   = binReader.ReadSingle();
                    lookupDir     = binReader.ReadString();
                    autoSaveTime  = binReader.ReadInt32();
                    showStatusBar = binReader.ReadBoolean();
                }
            }

            // If the end of the stream is reached before reading
            // the four data values, ignore the error and use the
            // default settings for the remaining values.
            catch(EndOfStreamException e)
            {
                Console.WriteLine("{0} caught and ignored. " +
                    "Using default values.", e.GetType().Name);
            }
            finally
            {
                binReader.Close();
            }
        }

    }

    // Create a file and store the application settings.
    public void Close()
    {
        using(BinaryWriter binWriter =
            new BinaryWriter(File.Open(fileName, FileMode.Create)))
        {
            binWriter.Write(aspectRatio);
            binWriter.Write(lookupDir);
            binWriter.Write(autoSaveTime);
            binWriter.Write(showStatusBar);
        }
    }
}
TTT
That's a *very* strange implementation of ReadLine, as well as using the default encoding rather than the encoding of the binary reader. How would you expect it to ever return any printable text?
Jon Skeet
Additionally - *Binary*Reader isn't appropriate for *text* files.
Jon Skeet
BinaryReader is a bad class for me. Look at my edit.
DreamWalker
How can I get the default encoding of the binary reader?
TTT
You'd call ReadChar instead of reading individual bytes.
Jon Skeet
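As a rough sketch of what the comments above suggest, a line reader on top of BinaryReader could read characters rather than raw bytes, so that the encoding passed to the BinaryReader constructor (UTF-8 when none is given) is honored. The names `BinaryReaderTextExtensions` and `ReadTextLine` are hypothetical, and dropping every '\r' is a simplification chosen for this illustration.

```csharp
using System.IO;
using System.Text;

public static class BinaryReaderTextExtensions
{
    // Reads characters up to the next '\n' using the reader's own encoding.
    // Returns null when the stream is already at its end.
    public static string ReadTextLine(this BinaryReader reader)
    {
        var sb = new StringBuilder();
        bool readAny = false;
        int c;
        // BinaryReader.Read() decodes one character and returns -1 at end of stream.
        while ((c = reader.Read()) >= 0)
        {
            readAny = true;
            if (c == '\n')
                break;
            if (c != '\r') // simplification: carriage returns are dropped
                sb.Append((char)c);
        }
        return readAny ? sb.ToString() : null;
    }
}
```

This still does not make BinaryReader a good fit for text files; it only shows how the character-based approach differs from collecting bytes and decoding them with Encoding.Default.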
A: 

You can probably use the System.IO.File Class to read the file and System.Convert to parse the strings you read from the file.

// file is a StreamReader, e.g. obtained from File.OpenText(path)
string line;
while( !String.IsNullOrEmpty( line = file.ReadLine() ) )
{
   TYPE value = Convert.ToTYPE( line );
}

Where TYPE is whatever type you're dealing with at that particular line / file.

If there are multiple values on one line you could do a split and read the individual values e.g.

string[] parts = line.Split(' ');
if( parts.Length > 1 )
{
   foreach( string item in parts )
   {
      TYPE value = Convert.ToTYPE( item );
   }
}
else
{
   // Use the code from before
}
TJB
This will treat each line as a single value. He wants a _single_ line `foo 123 bar` to be treated as 3 distinct values.
Pavel Minaev
@Pavel Minaev Added support for that, thanks for the suggestion.
TJB
+4  A: 

You should define exactly what your file format is meant to look like. How would you represent a string with a space in it? What determines where the line terminators go?

In general you can use TextReader and its ReadLine method, followed by double.TryParse, int.TryParse etc - but you'll need to pin the format down more first.
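A minimal sketch of the ReadLine-plus-TryParse approach, under assumptions the question does not confirm: tokens are separated by spaces or tabs, strings never contain whitespace, and numbers use invariant-culture formatting. The names `LineTokenReader`, `ReadTokens` and `ParseToken` are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.IO;

static class LineTokenReader
{
    // Reads the input line by line and yields each whitespace-separated token.
    public static IEnumerable<string> ReadTokens(TextReader reader)
    {
        string line;
        while ((line = reader.ReadLine()) != null)
        {
            foreach (string token in line.Split(new[] { ' ', '\t' },
                                                StringSplitOptions.RemoveEmptyEntries))
            {
                yield return token;
            }
        }
    }

    // Tries int first, then double; anything else is kept as a string.
    public static object ParseToken(string token)
    {
        int i;
        if (int.TryParse(token, NumberStyles.Integer, CultureInfo.InvariantCulture, out i))
            return i;
        double d;
        if (double.TryParse(token, NumberStyles.Float, CultureInfo.InvariantCulture, out d))
            return d;
        return token;
    }
}
```

Under these assumptions a token such as `2,34` is not recognized as a number and comes back as a string, which is one of the format details that would have to be pinned down first.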

Jon Skeet
To add to this, one could add a decorator class for `TextReader` to provide convenience `ReadInt32` etc methods; or extension methods on `TextReader` itself. I'm not aware of any such stock classes, however.
Pavel Minaev
Pavel is right. I'm looking for just such a class
DreamWalker
@DreamWalker: the reason why there isn't any standard class for this is that there's a multitude of slightly different text-based formats with no clear standard: whitespace or commas for separators; quoted strings optional/required/not supported; etc. The most common one is CSV (http://en.wikipedia.org/wiki/Comma-separated_values), and even then there are variations; but at least if you had CSV, there are quite a few premade parsers out there. If you want something different, you will most likely have to write it yourself.
Pavel Minaev
@Pavel Minaev: I was hoping that such a class had already been written. It does not have to be a standard class. It could be a class from some open-source framework or something like that.
DreamWalker
+3  A: 

I'm going to add this as a separate answer because it's quite distinct from the answer I already gave. Here's how you could start creating your own Scanner class:

class Scanner : System.IO.StringReader
{
  string currentWord;

  public Scanner(string source) : base(source)
  {
     ReadNextWord();
  }

  // Reads the next whitespace-delimited word into currentWord
  // (null when the end of the input has been reached).
  private void ReadNextWord()
  {
     System.Text.StringBuilder sb = new System.Text.StringBuilder();
     char nextChar;
     int next;
     // Skip whitespace in front of the word, including leading whitespace.
     while((this.Peek() >= 0) && (char.IsWhiteSpace((char)this.Peek())))
        this.Read();
     do
     {
        next = this.Read();
        if (next < 0)
           break;
        nextChar = (char)next;
        if (char.IsWhiteSpace(nextChar))
           break;
        sb.Append(nextChar);
     } while (true);
     if (sb.Length > 0)
        currentWord = sb.ToString();
     else
        currentWord = null;
  }

  public bool HasNextInt()
  {
     if (currentWord == null)
        return false;
     int dummy;
     return int.TryParse(currentWord, out dummy);
  }

  public int NextInt()
  {
     try
     {
        return int.Parse(currentWord);
     }
     finally
     {
        // Advance to the next word even if parsing failed.
        ReadNextWord();
     }
  }

  public bool HasNextDouble()
  {
     if (currentWord == null)
        return false;
     double dummy;
     return double.TryParse(currentWord, out dummy);
  }

  public double NextDouble()
  {
     try
     {
        return double.Parse(currentWord);
     }
     finally
     {
        // Advance to the next word even if parsing failed.
        ReadNextWord();
     }
  }

  public bool HasNext()
  {
     return currentWord != null;
  }
}
TTT
+4  A: 

I believe this extension method for TextReader would do the trick:

public static class TextReaderTokenizer
{
    // Adjust as needed. -1 is EOF.
    private static int[] whitespace = { -1, ' ', '\r' , '\n', '\t' };

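    public static T ReadToken<T>(this TextReader reader)
    {
        // Skip delimiters left over from the previous token (but stop at EOF).
        while (reader.Peek() >= 0 && Array.IndexOf(whitespace, reader.Peek()) >= 0)
        {
            reader.Read();
        }
        StringBuilder sb = new StringBuilder();
        while (Array.IndexOf(whitespace, reader.Peek()) < 0)
        {
            sb.Append((char)reader.Read());
        }
        return (T)Convert.ChangeType(sb.ToString(), typeof(T));
    }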
}

It can be used thus:

TextReader reader = File.OpenText("foo.txt");
int n = reader.ReadToken<int>();
string s = reader.ReadToken<string>();

[EDIT] As requested in question comments, here's an instance wrapper version of the above that is parametrized with delimiters and CultureInfo:

public class TextTokenizer
{
    private TextReader reader;
    private Predicate<char> isDelim;
    private CultureInfo cultureInfo;

    public TextTokenizer(TextReader reader, Predicate<char> isDelim, CultureInfo cultureInfo)
    {
        this.reader = reader;
        this.isDelim = isDelim;
        this.cultureInfo = cultureInfo;
    }

    public TextTokenizer(TextReader reader, char[] delims, CultureInfo cultureInfo)
    {
        this.reader = reader;
        this.isDelim = c => Array.IndexOf(delims, c) >= 0;
        this.cultureInfo = cultureInfo;
    }

    public TextReader BaseReader
    {
        get { return reader; }
    }

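    public T ReadToken<T>()
    {
        // Skip delimiters left over from the previous token.
        while (reader.Peek() >= 0 && isDelim((char)reader.Peek()))
        {
            reader.Read();
        }
        StringBuilder sb = new StringBuilder();
        while (true)
        {
            int c = reader.Peek();
            if (c < 0 || isDelim((char)c))
            {
                break;
            }
            sb.Append((char)reader.Read());
        }
        // Pass the culture along so that number parsing honors it.
        return (T)Convert.ChangeType(sb.ToString(), typeof(T), cultureInfo);
    }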
}

Sample usage:

TextReader reader = File.OpenText("foo.txt");
TextTokenizer tokenizer = new TextTokenizer(
    reader,
    new[] { ' ', '\r', '\n', '\t' },
    CultureInfo.InvariantCulture);
int n = tokenizer.ReadToken<int>();
string s = tokenizer.ReadToken<string>();
Pavel Minaev
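The question's sample data mixes `1.23` and `2,34`, which is where the format provider starts to matter. The sketch below is only an illustration: `CultureDemo`, `CommaDecimal` and `Parse` are hypothetical names, and the custom NumberFormatInfo stands in for any culture that uses a comma as the decimal separator.

```csharp
using System;
using System.Globalization;

static class CultureDemo
{
    // Hypothetical number format for files that write decimals with a comma.
    public static readonly NumberFormatInfo CommaDecimal = new NumberFormatInfo
    {
        NumberDecimalSeparator = ",",
        NumberGroupSeparator = "."
    };

    // Converts a token to double the same way a ReadToken<double>() call
    // would, using the given format provider.
    public static double Parse(string token, IFormatProvider provider)
    {
        return (double)Convert.ChangeType(token, typeof(double), provider);
    }
}
```

With `CultureDemo.CommaDecimal`, the token `2,34` converts to 2.34. With `CultureInfo.InvariantCulture` the comma is treated as a group separator, so the same token converts to 234 without any error, which is an easy mistake to miss.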
A: 

You don't really say what you're ultimately trying to accomplish here. But if you have any control at all over the file format, you might consider XML Serialization rather than trying to roll your own scanning/parsing/converting schemes.

Dan