ansaurus

Question

How would you split by \r\n if String.Split(String[]) did not exist?

Answer 1

A:

What about:

string path = "yourfile.txt";
string[] lines = File.ReadAllLines(path);

Or

string content = File.ReadAllText(path);
string[] lines = content.Split(
    Environment.NewLine.ToCharArray(),
    StringSplitOptions.RemoveEmptyEntries);

Having read that this is for the .NET Micro Framework 3.0, this code may work:

string line = String.Empty;
StreamReader reader = new StreamReader(path);
while ((line = reader.ReadLine()) != null)
{
    // do stuff
}

Rubens Farias 2010-01-10 22:36:53

It splits by both items in the CharArray separately. Thus, you'd get a bunch of empty results in addition to legit results.

AngryHacker 2010-01-10 22:43:06

@AngryHacker: RemoveEmptyEntries should deal with them.
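
To make the point in these two comments concrete, here is a small sketch for the full .NET Framework (the overload taking StringSplitOptions may not exist on the Micro Framework). The class and method names are made up for illustration, and the separators are spelled out as '\r' and '\n' rather than taken from Environment.NewLine so the behaviour does not depend on the platform.

```csharp
using System;

// Hypothetical demo class, not part of any framework.
public static class CharSplitDemo
{
    // Splits on '\r' and '\n' as two independent separators.
    // Every "\r\n" pair therefore yields an empty entry between the two chars.
    public static string[] SplitKeepingEmpties(string text)
    {
        return text.Split(new[] { '\r', '\n' });
    }

    // Same split, but empty entries are discarded.
    // Note that this also discards genuinely blank lines in the input.
    public static string[] SplitDroppingEmpties(string text)
    {
        return text.Split(new[] { '\r', '\n' }, StringSplitOptions.RemoveEmptyEntries);
    }
}
```

For the input "a\r\nb\r\n\r\nc" the first method returns seven entries (four of them empty), while the second returns only "a", "b" and "c". So RemoveEmptyEntries does clean up the spurious entries, at the cost of losing the blank line that was really in the text.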

Rubens Farias 2010-01-10 22:54:00

File.ReadAllLines is not supported either.

AngryHacker 2010-01-10 22:58:53

See the link provided in the question for documentation on what's supported.

AngryHacker 2010-01-10 23:00:01

Answer 2

+5 A:

I would loop across each char in the document, because that's clearly required. How do you think String.Split works? I would try to do so only hitting each character once, however.

Keep a list of strings found so far. Use IndexOf repeatedly, passing in the current offset into the string (i.e. the previous match + 2).
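
A minimal sketch of that idea, under a few assumptions: SplitBy and LineSplitter are hypothetical names, ArrayList is used on the assumption that generic collections are not available on the target, and the StringComparison.Ordinal overload of IndexOf is used to avoid culture-sensitive matching on the full framework. If that overload is missing on your platform, IndexOf(string, int) would be the fallback.

```csharp
using System;
using System.Collections;

// Hypothetical helper class, not part of any framework.
public static class LineSplitter
{
    // Splits text on every occurrence of delimiter (e.g. "\r\n"),
    // keeping empty entries, using repeated IndexOf calls.
    public static string[] SplitBy(string text, string delimiter)
    {
        if (delimiter == null || delimiter.Length == 0)
        {
            // An empty delimiter would match at every offset and never advance.
            throw new ArgumentException("delimiter must not be empty");
        }

        ArrayList parts = new ArrayList();
        int start = 0; // offset just past the previous match
        int hit;

        while ((hit = text.IndexOf(delimiter, start, StringComparison.Ordinal)) >= 0)
        {
            parts.Add(text.Substring(start, hit - start));
            start = hit + delimiter.Length; // for "\r\n": previous match + 2
        }

        // Whatever follows the last delimiter (possibly an empty string).
        parts.Add(text.Substring(start));

        return (string[])parts.ToArray(typeof(string));
    }
}
```

Usage would be along the lines of string[] lines = LineSplitter.SplitBy(doc, "\r\n");. Each character is scanned by IndexOf and then copied by Substring, so the work stays linear in the length of the document.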

Jon Skeet 2010-01-10 22:38:08

True, but the full .NET implementation of Split(string[]) uses pointers and unsafe code to achieve the performance that it does. Otherwise, I would simply copy the code. I am operating on a really low-end chip and was hoping for something exceedingly clever.

AngryHacker 2010-01-10 22:57:33

If you're only splitting by a single delimiter string, and the delimiter is longer than 3 characters, you can get better performance using one of the Boyer-Moore search algorithm variants. Not that it helps with the poster's problem in this case. http://en.wikipedia.org/wiki/Boyer%E2%80%93Moore_string_search_algorithm

LBushkin 2010-01-10 23:09:42

Answer 3

+2 A:

How can I split a document into lines in an efficient manner (e.g. not looping madly across each char in the doc)?

How do you think the built-in Split works?

Just reimplement it yourself as an extension method.

Anon. 2010-01-10 22:38:21

Answer 4

+4 A:

You can do

string[] lines = doc.Split('\n');
for (int i = 0; i < lines.Length; i++)
    lines[i] = lines[i].Trim();   // also strips the trailing '\r'

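This assumes that the µF supports Trim() at all. Trim() removes all leading and trailing whitespace, which might be useful. Also consider TrimEnd('\r'), which strips only the trailing carriage return and leaves other whitespace alone.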
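
For reference, a sketch of the TrimEnd('\r') variant wrapped in a helper. NewlineSplitter and SplitLines are hypothetical names, and this assumes Split(char) and TrimEnd(char) are both available on the target. Unlike Trim(), it leaves indentation and trailing spaces untouched, and it treats a bare '\n' as a line break too.

```csharp
using System;

// Hypothetical helper class, not part of any framework.
public static class NewlineSplitter
{
    // Splits on '\n', then removes any carriage returns left
    // at the end of each piece by a "\r\n" line ending.
    public static string[] SplitLines(string doc)
    {
        string[] lines = doc.Split('\n');
        for (int i = 0; i < lines.Length; i++)
        {
            lines[i] = lines[i].TrimEnd('\r');
        }
        return lines;
    }
}
```

If a lone '\n' inside a line must not count as a break, this is not a drop-in replacement for splitting on "\r\n".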

Henk Holterman 2010-01-10 22:44:06

Yep, this does the trick and keeps most of the performance characteristics. Thanks. Pretty clever.

AngryHacker 2010-01-10 23:05:43

I was just about to suggest something like this - beat me to the punch. You may need to recombine strings (theoretically) if '\n' is not always preceded by '\r' in the input.

LBushkin 2010-01-10 23:13:29
