C#: What's an efficient way to parse a string with one delimiter for each ReadLine() of TextReader?

My objective is to load a list of proxies from a .txt file into a ListView with two columns (Proxy|Port). How would I go about splitting each ReadLine() result into the proxy and port variables using the delimiter ":"?

This is what I've got so far:

    public void loadProxies(string FilePath)
    {
        string Proxy; // example/temporary place holders
        int Port; // updated at each readline() loop.

        using (TextReader textReader = new StreamReader(FilePath))
        {
            string Line;
            while ((Line = textReader.ReadLine()) != null)
            {
                // How would I go about directing which string to return whether
                // what's to the left of the delimiter : or to the right?
                //Proxy = Line.Split(':');
                //Port = Line.Split(':');

                // listview stuff done here (this part I'm familiar with already)
            }
        }
    }

If not, is there a more efficient way to do this?

+2  A: 

You could split them this way:

        string line;
        string[] tokens;
        while ((line = textReader.ReadLine()) != null)
        {
            tokens = line.Split(':');
            proxy = tokens[0];            // text left of the ':'
            port = int.Parse(tokens[1]);  // text right of the ':', converted to int

            // listview stuff done here (this part I'm familiar with already)
        }

It's best practice to use camelCase (lowercase-first) names for local variables in C#; names starting with a capital letter are conventionally used for classes, namespaces, methods and so on.
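
Indexing `tokens[1]` throws if a line has no colon, and the question's `Port` variable is an `int`, so the text still needs converting. A more defensive variant might look like the sketch below. The `ProxyParser` class and `TryParseProxy` name are hypothetical, and it assumes malformed lines should be rejected rather than raise an exception.

```csharp
using System;

public static class ProxyParser
{
    // Hypothetical helper: splits "host:port" and reports whether the line was well formed.
    public static bool TryParseProxy(string line, out string proxy, out int port)
    {
        proxy = string.Empty;
        port = 0;
        if (string.IsNullOrEmpty(line)) return false;

        // Limit the split to two parts so only the first ':' acts as the delimiter.
        string[] tokens = line.Split(new[] { ':' }, 2);
        if (tokens.Length != 2) return false;

        // int.TryParse returns false instead of throwing on non-numeric input.
        if (!int.TryParse(tokens[1], out port)) return false;

        proxy = tokens[0].Trim();
        return proxy.Length > 0;
    }
}
```

Inside the read loop this could be called as `if (ProxyParser.TryParseProxy(line, out proxy, out port)) { /* add to the ListView */ }`.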

henchman
+2  A: 
string[] parts = line.Split(':');
string proxy = parts[0];
string port = parts[1];
RandomNoob
A: 

You might want to try something like this.

var items = File.ReadAllText(FilePath)
    .Split(new[] { "\r\n" }, StringSplitOptions.RemoveEmptyEntries)
    .Select(line => line.Split(':'))
    .Select(pieces => new { 
        Proxy = pieces[0], 
        Port = int.Parse(pieces[1]) 
    });

If you know that the file has no blank lines (for example, a stray one at the end), you can do this.

var items = File.ReadAllLines(FilePath)
    .Select(line => line.Split(':'))
    .Select(pieces => new { 
        Proxy = pieces[0], 
        Port = Convert.ToInt32(pieces[1]) 
    });
ChaosPandion
Keep in mind that `ReadAllText` and `ReadAllLines` read the whole file into memory at once. It's unlikely to matter here, but in general it might be better to use an iterator like in spender's answer.
Gabe
@gabe - You are correct, I figured they would want to see all of their options.
ChaosPandion
No reason not to show the options, just make sure to note the downside. The user was looking for the most efficient way, and ReadAll is only efficient for small files.
Gabe
+2  A: 

How about running a Regex on the whole file?

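// input is assumed to hold the entire file contents, e.g. File.ReadAllText(filename)
var parts =
    Regex.Matches(input, @"^(?<left>[^:\r\n]*):(?<right>[^\r\n]*)", RegexOptions.Multiline)
    .Cast<Match>()
    .Where(m => m.Success)
    .Select(m => new
        {
            left = m.Groups["left"].Value,    // text before the first ':' on the line
            right = m.Groups["right"].Value   // text after it, excluding the line break
        });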

foreach(var part in parts)
{
    //part.left
    //part.right
}

Or, if the file is too big for that, why not LINQ-ify the ReadLine operation with a method that yields each line?

static IEnumerable<string> Lines(string filename)
{
    using (var sr = new StreamReader(filename))
    {
        while (!sr.EndOfStream)
        {
            yield return sr.ReadLine();
        }
    }
}

And run it like so:

var parts = Lines(filename)
    .Select(line => Regex.Match(line, @"(?<left>[^:]*):(?<right>.*)"))  // match each line, not the whole input
    .Where(m => m.Success)
    .Select(m => new
        {
            left = m.Groups["left"].Value,
            right = m.Groups["right"].Value
        });
foreach(var part in parts)
{
    //part.left
    //part.right
}
spender
Just take note that if you are using .NET 4.0 you won't need to create that Lines method; `File.ReadLines` does the same thing.
ChaosPandion
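
For reference, the built-in method that makes a hand-written `Lines` iterator unnecessary is `File.ReadLines` (added in .NET 4.0), which also streams the file one line at a time. A minimal sketch of how it might be used here follows. The `ProxyFile` class and `ReadProxies` name are hypothetical, lines without a colon are assumed to be skippable, and a non-numeric port will still throw from `int.Parse`.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

public static class ProxyFile
{
    // Hypothetical helper: lazily yields (proxy, port) pairs from a "host:port" file.
    public static IEnumerable<KeyValuePair<string, int>> ReadProxies(string path)
    {
        return File.ReadLines(path)                          // streams lines lazily (.NET 4.0+)
            .Select(line => line.Split(new[] { ':' }, 2))    // split on the first ':' only
            .Where(parts => parts.Length == 2)               // assumption: skip lines with no ':'
            .Select(parts => new KeyValuePair<string, int>(parts[0], int.Parse(parts[1])));
    }
}
```

Because the sequence is lazy, the file is only opened when the result is enumerated, for example in a `foreach` that fills the ListView.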
Good to know. Haven't had the time to get grubby with .net4.
spender
For a fairly simple operation, personally I think this is overly complex. I'm still +1'ing it because it's a good answer.
Alastair Pitts
Isn't LINQ cool? I'm always looking for novel ways to turn everything into a sequence!
spender
I notice you have no F# tags under your belt. If you love sequences you should probably remedy that.
ChaosPandion
Halfway through Thomas P's book right now.
spender
+1  A: 

In terms of efficiency I expect you'd be hard-pressed to beat:

    int index = line.IndexOf(':');
    if (index < 0) throw new InvalidOperationException();
    Proxy = line.Substring(0, index);
    Port = int.Parse(line.Substring(index + 1));

This avoids the array construction / allocation associated with Split, and only looks as far as the first delimiter. But I should stress that this is unlikely to be a genuine performance bottleneck unless the data volume is huge, so pretty much any approach should be fine. In fact, perhaps the most important thing (I've been reminded by the comment below) is to suspend the UI while adding:

myListView.BeginUpdate();
try {
    // TODO: add all the items here
} finally {
    myListView.EndUpdate();
}
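
Putting the two snippets above together, the loading routine might look roughly like the sketch below. The `ProxyListLoader` class name is hypothetical, the ListView is assumed to already be in Details view with Proxy and Port columns, and malformed lines are skipped here rather than throwing.

```csharp
using System;
using System.IO;
using System.Windows.Forms;

public static class ProxyListLoader
{
    // Hypothetical helper: fills a two-column ListView (Proxy | Port) from a "host:port" file.
    public static void LoadProxies(ListView listView, string filePath)
    {
        listView.BeginUpdate();   // suspend repainting while items are added
        try
        {
            using (TextReader reader = new StreamReader(filePath))
            {
                string line;
                while ((line = reader.ReadLine()) != null)
                {
                    int index = line.IndexOf(':');
                    if (index < 0) continue;   // assumption: skip lines with no delimiter

                    string proxy = line.Substring(0, index);
                    int port;
                    if (!int.TryParse(line.Substring(index + 1), out port)) continue;

                    // The item text fills the first column; the sub-item fills the second.
                    var item = new ListViewItem(proxy);
                    item.SubItems.Add(port.ToString());
                    listView.Items.Add(item);
                }
            }
        }
        finally
        {
            listView.EndUpdate();   // resume repainting even if reading fails
        }
    }
}
```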
Marc Gravell
Yes, no doubt the ListView would become a bottleneck well before you'd be able to tell the difference between `Split` and `Substring`.
Gabe
@gabe - good point; added update re making the `ListView` faster.
Marc Gravell