ansaurus

Question

Answer 1

+5 A:

You could do something like:

using (TextReader rdr = OpenYourFile()) {
    string line;
    while ((line = rdr.ReadLine()) != null) {
        string[] fields = line.Split('\t'); // THIS LINE DOES THE MAGIC
        int theInt = Convert.ToInt32(fields[1]);
    }
}

The reason you didn't find relevant results when searching for 'formatting' is that the operation you are performing is called 'parsing'.

erikkallen 2009-05-13 15:58:02

There is no , in his input data

Binary Worrier 2009-05-13 15:59:24

Down-vote removed :)

Binary Worrier 2009-05-13 16:01:20

This doesn't get "the string in the middle of each line indicating the path" (taken directly from the question).

Samir Talwar 2009-05-13 16:11:21

Alright, very useful, but how do I find the string?

John 2009-05-13 16:15:31

You may need to use line.Split("\t".ToCharArray()) depending on your version (IIRC). Take care, though. If you want to access the 15th item on the line, but the line you are working on only contains 12 items (for example), you will get an exception (an IndexOutOfRangeException). Guard against this kind of thing as much as possible. Also, an empty line will throw you into disarray (no pun intended), as line.Split('\t') will return an array with only a single, empty element.

ZombieSheep 2009-05-13 16:16:49
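
A minimal sketch of the guard described in the comment above, assuming the integer sits in the second tab-separated field as in this answer. The names LineGuard and TryGetSecondInt are made up for illustration and are not part of any library.

```csharp
using System;

public static class LineGuard
{
    // Hypothetical helper: returns true and sets `value` only when the line
    // has a second tab-separated field that parses as an integer.
    public static bool TryGetSecondInt(string line, out int value)
    {
        value = 0;

        // An empty line would split into a single empty field, so skip it.
        if (string.IsNullOrEmpty(line))
            return false;

        string[] fields = line.Split('\t');

        // With fewer than two fields, fields[1] would throw.
        if (fields.Length < 2)
            return false;

        // TryParse reports failure instead of throwing on non-numeric text.
        return int.TryParse(fields[1], out value);
    }
}
```

Inside the read loop you could call LineGuard.TryGetSecondInt(line, out theInt) and simply skip any line for which it returns false, rather than letting Convert.ToInt32 or the array index throw.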

Answer 2

A:

Try regular expressions. You can find a certain pattern in your text and replace it with something that you want. I can't give you the exact code right now but you can test out your expressions using this.

http://www.radsoftware.com.au/regexdesigner/

Marc Vitalis 2009-05-13 15:58:09

Answer 3

+5 A:

OK, here's what we do: open the file, read it line by line, and split it by tabs. Then we grab the second integer and loop through the rest to find the path.

StreamReader reader = File.OpenText("filename.txt");
string line;
while ((line = reader.ReadLine()) != null) {
    string[] items = line.Split('\t');
    int myInteger = int.Parse(items[1]); // Here's your integer.
    // Now let's find the path.
    string path = null;
    foreach (string item in items) {
        if (item.StartsWith("item\\") && item.EndsWith(".ddj")) {
            path = item;
        }
    }

    // At this point, `myInteger` and `path` contain the values we want
    // for the current line. We can then store those values or print them,
    // or anything else we like.
}

Samir Talwar 2009-05-13 15:59:21

Thanks, I'll test it and then give feedback!

John 2009-05-13 16:16:31

Works great, thanks!

John 2009-05-13 16:26:49

Great. I don't have a C# compiler on this machine so I had to wing it. Glad to hear it works out of the box.

Samir Talwar 2009-05-13 17:02:00

Answer 4

A:

You could open the file up and use StreamReader.ReadLine to read the file in line-by-line. Then you can use String.Split to break each line into pieces (use a \t delimiter) to extract the second number.

As the number of items is different you would need to search the string for the pattern 'item\*.ddj'.

To delete an item you could (for example) keep all of the file's contents in memory and write out a new file when the user clicks 'Save'.

Justin Ethier 2009-05-13 16:00:51
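
The steps in the answer above could be sketched roughly as follows. This is only an illustration: the class and method names (ItemFile, FindPath, Load, DeleteAndSave) are invented here, and it assumes the path is a whole tab-separated field that starts with item\ and ends with .ddj.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class ItemFile
{
    // Returns the first tab-separated field shaped like item\<something>.ddj,
    // or null when the line contains no such field.
    public static string FindPath(string line)
    {
        foreach (string field in line.Split('\t'))
        {
            if (field.StartsWith("item\\", StringComparison.Ordinal) &&
                field.EndsWith(".ddj", StringComparison.Ordinal))
            {
                return field;
            }
        }
        return null;
    }

    // Reads the whole file into memory so entries can be removed before saving.
    public static List<string> Load(string fileName)
    {
        return new List<string>(File.ReadAllLines(fileName));
    }

    // Removes every line whose path equals `path`, then writes the rest out.
    public static void DeleteAndSave(List<string> lines, string path, string outputFileName)
    {
        lines.RemoveAll(l => FindPath(l) == path);
        File.WriteAllLines(outputFileName, lines.ToArray());
    }
}
```

Keeping the lines in a List<string> means nothing touches the disk until the user actually saves, at which point DeleteAndSave (or a plain File.WriteAllLines) writes the new file.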

Answer 5

A:

What you want to do is write a program that will parse the file, taking out the parts you want and formatting the output in such a manner that you can easily paste it into your code (or alternatively, have your code load the file each time it runs).

If you take one of the "read a file" samples, read in each line and split it on tabs:

string line = reader.ReadLine();

string[] fields = line.Split('\t'); // will give an array of strings

fields[1] and fields[0] have the data you want.

Binary Worrier 2009-05-13 16:00:54

Answer 6

+4 A:

Another solution, this time making use of regular expressions:

using System.IO;
using System.Text.RegularExpressions;

...

StreamReader reader = File.OpenText("filename.txt");
string line;
while ((line = reader.ReadLine()) != null) {
    Match m = Regex.Match(line, @"^\d+\t(\d+)\t.+?\t(item\\[^\t]+\.ddj)");
    if (m.Success) {
        int myInt = int.Parse(m.Groups[1].Value);
        string path = m.Groups[2].Value;

        // At this point, `myInt` and `path` contain the values we want
        // for the current line. We can then store those values or print them,
        // or anything else we like.
    }
}

That expression's a little complex, so here it is broken down:

^        Start of string
\d+      "\d" means "digit" - 0-9. The "+" means "one or more."
         So this means "one or more digits."
\t       This matches a tab.
(\d+)    This also matches one or more digits. This time, though, we capture it
         using parentheses. This means we can access it through the Groups property.
\t       Another tab.
.+?      "." means "anything." So "one or more of anything". In addition, it's lazy.
         This is to stop it grabbing everything in sight - it'll only grab as much
         as it needs to for the regex to work.
\t       Another tab.

(item\\[^\t]+\.ddj)
    Here's the meat. This matches: "item\<one or more of anything but a tab>.ddj"

Samir Talwar 2009-05-13 16:09:24
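
To see the breakdown above in action, here is a small sketch that wraps the same pattern in a helper. The names RegexDemo and TryParse are invented for illustration, and the sample line in the note below is made up, since the original input file is not shown in this thread.

```csharp
using System;
using System.Text.RegularExpressions;

public static class RegexDemo
{
    // The pattern from the answer above, unchanged.
    private static readonly Regex LinePattern =
        new Regex(@"^\d+\t(\d+)\t.+?\t(item\\[^\t]+\.ddj)");

    // Hypothetical helper: extracts the second integer and the path,
    // returning false when the line does not match the pattern.
    public static bool TryParse(string line, out int number, out string path)
    {
        number = 0;
        path = null;

        Match m = LinePattern.Match(line);
        if (!m.Success)
            return false;

        path = m.Groups[2].Value;                            // second capture: the path
        return int.TryParse(m.Groups[1].Value, out number);  // first capture: the integer
    }
}
```

For a made-up line such as "1\t3100\tsome text\titem\\etc\\sword.ddj" (written as a C# string literal), TryParse yields 3100 and item\etc\sword.ddj. Note that the pattern expects at least one field between the second integer and the path, so a line where the path is the third field would not match.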

I don't know which of your answers to accept; both are working great. I like this one more, because you explained why, and I had never seen that before!

John 2009-05-13 16:30:46

If you like regular expressions, I'd recommend using something like Perl next time you want to process files like this. It's designed around them, and you can use it to easily format your file in a way you like.

Samir Talwar 2009-05-13 17:03:29

Samir Talwar: I reckon you should teach regular expressions. The way you explained everything was just brilliant. I've never had a teacher that's been so detailed! +1

lucifer 2010-03-18 18:08:44

j-t-s: Cheers for the compliment, mate. I do try. :-)

Samir Talwar 2010-03-20 12:59:17

Thanks for such a detailed breakdown!

Angelina 2010-10-27 08:14:52

Answer 7

A:

As already mentioned, I would highly recommend using regular expressions (in System.Text.RegularExpressions) to get this kind of job done.

In combo with a solid tool like RegexBuddy, you are looking at handling any complex text record parsing situations, as well as getting results quickly. The tool makes it real easy.

Hope that helps.

Vin 2009-05-13 16:15:05

Answer 8

A:

One way that I've found really useful in situations like this is to go old-school and use the Jet OLEDB provider, together with a schema.ini file to read large tab-delimited files in using ADO.Net. Obviously, this method is really only useful if you know the format of the file to be imported.

// Requires: using System.Data; using System.Data.OleDb; using System.IO;
public void ImportCsvFile(string filename)
{
    FileInfo file = new FileInfo(filename);

    // The directory acts as the database; the file name acts as the table.
    using (OleDbConnection con =
            new OleDbConnection("Provider=Microsoft.Jet.OLEDB.4.0;Data Source=\"" +
            file.DirectoryName +
            "\";Extended Properties='text;HDR=Yes;FMT=TabDelimited';"))
    {
        using (OleDbCommand cmd = new OleDbCommand(string.Format
                                  ("SELECT * FROM [{0}]", file.Name), con))
        {
            con.Open();

            // Option 1: using a DataReader to process the data
            using (OleDbDataReader reader = cmd.ExecuteReader())
            {
                while (reader.Read())
                {
                    // Process the current reader entry...
                }
            }

            // Option 2: using a DataTable to process the data
            using (OleDbDataAdapter adp = new OleDbDataAdapter(cmd))
            {
                DataTable tbl = new DataTable("MyTable");
                adp.Fill(tbl);

                foreach (DataRow row in tbl.Rows)
                {
                    // Process the current row...
                }
            }
        }
    }
}

Once you have the data in a nice format like a datatable, filtering out the data you need becomes pretty trivial.

Mark Green 2009-05-13 16:28:31

How to parse a text file with C#
