Hello,

By text formatting I meant something more complicated.

At first I began manually adding the 5000 lines from the text file I'm asking about into my project.

The text file has 5000 lines of varying length. For example:

1   1 ITEM_ETC_GOLD_01 골드(소) xxx xxx xxx_TT_DESC 0 0 3 3 5 0 180000 3 0 1 0 0 255 1 1 0 0 0 0 0 0 0 0 0 0 -1 0 -1 0 -1 0 -1 0 -1 0 0 0 0 0 0 0 100 0 0 0 xxx item\etc\drop_ch_money_small.bsr xxx xxx xxx 0 2 0 0 1 0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0 0 0 0 0 0 0 0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1 표현할 골드의 양(param1이상) -1 xxx -1 xxx -1 xxx -1 xxx -1 xxx -1 xxx -1 xxx -1 xxx -1 xxx -1 xxx -1 xxx -1 xxx -1 xxx -1 xxx -1 xxx -1 xxx -1 xxx -1 xxx -1 xxx 0 0

1   4 ITEM_ETC_HP_POTION_01 HP 회복 약초 xxx SN_ITEM_ETC_HP_POTION_01 SN_ITEM_ETC_HP_POTION_01_TT_DESC 0 0 3 3 1 1 180000 3 0 1 1 1 255 3 1 0 0 1 0 60 0 0 0 1 21 -1 0 -1 0 -1 0 -1 0 -1 0 0 0 0 0 0 0 100 0 0 0 xxx item\etc\drop_ch_bag.bsr item\etc\hp_potion_01.ddj xxx xxx 50 2 0 0 1 0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0 0 0 0 0 0 0 0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 120 HP회복양 0 HP회복양(%) 0 MP회복양 0 MP회복양(%) -1 xxx -1 xxx -1 xxx -1 xxx -1 xxx -1 xxx -1 xxx -1 xxx -1 xxx -1 xxx -1 xxx -1 xxx -1 xxx -1 xxx -1 xxx -1 xxx 0 0

1   5 ITEM_ETC_HP_POTION_02 HP 회복약 (소) xxx SN_ITEM_ETC_HP_POTION_02 SN_ITEM_ETC_HP_POTION_02_TT_DESC 0 0 3 3 1 1 180000 3 0 1 1 1 255 3 1 0 0 1 0 110 0 0 0 2 39 -1 0 -1 0 -1 0 -1 0 -1 0 0 0 0 0 0 0 100 0 0 0 xxx item\etc\drop_ch_bag.bsr item\etc\hp_potion_02.ddj xxx xxx 50 2 0 0 2 0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0 0 0 0 0 0 0 0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 220 HP회복양 0 HP회복양(%) 0 MP회복양 0 MP회복양(%) -1 xxx -1 xxx -1 xxx -1 xxx -1 xxx -1 xxx -1 xxx -1 xxx -1 xxx -1 xxx -1 xxx -1 xxx -1 xxx -1 xxx -1 xxx -1 xxx 0 0

The gap between the first number (1) and the second number (1/4/5) is not a space, it's a tab. There are no spaces in that text file, only tabs.

What I want:

I want to get the second integer (in the three lines I posted above, the second integers are 1, 4 and 5) and the string in the middle of each line that indicates the path (it starts with "item\" and ends with the file extension ".ddj").

My problem:

When I google "Text formatting C#", all I get is how to open a text file and how to write a text file in C#. I don't know how to search for text inside a text file. Also, I can't just search for the integer itself: if it's a small integer, like in the three lines I posted above, I won't be able to find the correct location, because "1", for example, might appear in many other places on the line.

My question:

It would be best if I wrote a program that deletes everything except what I need.

The other way I can think of is to search directly inside that file, but as I mentioned above, I might get the wrong location for the second integer if it's too small.

Please suggest something. I can't format all this by hand.

+5  A: 

You could do something like:

using (TextReader rdr = OpenYourFile()) {
    string line;
    while ((line = rdr.ReadLine()) != null) {
        string[] fields = line.Split('\t'); // THIS LINE DOES THE MAGIC
        int theInt = Convert.ToInt32(fields[1]);
    }
}

The reason you didn't find relevant results when searching for 'formatting' is that the operation you are performing is called 'parsing'.

erikkallen
There is no , in his input data
Binary Worrier
Down-vote removed :)
Binary Worrier
This doesn't get "the string in the middle of each line indicating the path" (taken directly from the question).
Samir Talwar
Alright,very useful,but How do I find the string?
John
You may need to use line.Split("\t".ToCharArray()) depending on your version (IIRC). Take care, though: if you want to access the 15th item on the line, but the line you are working on only contains 12 items (for example), you will get an exception. Guard against this kind of thing as much as possible. Also, an empty line will throw you into disarray (no pun intended), as line.Split('\t') will return an array with only a single, empty element.
ZombieSheep
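
The guards mentioned in the comment above could look roughly like the sketch below. The class and method names (ItemLineParser, TryGetSecondInteger) are made up for illustration and are not part of any answer in this thread. It checks for an empty line, checks the field count before indexing, and uses int.TryParse so a non-numeric field does not throw.

```csharp
using System;

static class ItemLineParser
{
    // Hypothetical helper: returns true and sets id only when the line
    // has a second tab-separated field that parses as an integer.
    public static bool TryGetSecondInteger(string line, out int id)
    {
        id = 0;

        // Splitting an empty line yields a single empty element, so reject it early.
        if (string.IsNullOrEmpty(line)) return false;

        string[] fields = line.Split('\t');

        // Indexing fields[1] on a one-field line would throw IndexOutOfRangeException.
        if (fields.Length < 2) return false;

        // TryParse reports failure instead of throwing on non-numeric text.
        return int.TryParse(fields[1], out id);
    }

    static void Main()
    {
        int id;
        Console.WriteLine(TryGetSecondInteger("1\t4\tITEM_ETC_HP_POTION_01", out id)); // True
        Console.WriteLine(TryGetSecondInteger("", out id));                              // False
    }
}
```

Lines that fail the check can simply be skipped or logged, which keeps one malformed line out of 5000 from aborting the whole run.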
A: 

Try regular expressions. You can find a certain pattern in your text and replace it with something that you want. I can't give you the exact code right now but you can test out your expressions using this.

http://www.radsoftware.com.au/regexdesigner/

Marc Vitalis
+5  A: 

OK, here's what we do: open the file, read it line by line, and split it by tabs. Then we grab the second integer and loop through the rest to find the path.

// Requires: using System.IO;
using (StreamReader reader = File.OpenText("filename.txt")) {
    string line;
    while ((line = reader.ReadLine()) != null) {
        string[] items = line.Split('\t');
        int myInteger = int.Parse(items[1]); // Here's your integer.
        // Now let's find the path.
        string path = null;
        foreach (string item in items) {
            if (item.StartsWith("item\\") && item.EndsWith(".ddj")) {
                path = item;
            }
        }

        // At this point, `myInteger` and `path` contain the values we want
        // for the current line (`path` stays null if the line has no .ddj path).
        // We can then store those values or print them, or anything else we like.
    }
}
Samir Talwar
Thanks,I'll test it and then give feedback!
John
Works Great,thanks!
John
Great. I don't have a C# compiler on this machine so I had to wing it. Glad to hear it works out of the box.
Samir Talwar
A: 

You could open the file up and use StreamReader.ReadLine to read the file in line-by-line. Then you can use String.Split to break each line into pieces (use a \t delimiter) to extract the second number.

Because the number of fields varies from line to line, you would need to search each line for the pattern 'item\*.ddj'.

To delete an item you could (for example) keep all of the file's contents in memory and write out a new file when the user clicks 'Save'.
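
As a rough sketch of that idea, the snippet below reads every line, keeps only the second integer and the first field matching item\*.ddj, and writes the result to a new file. The names ItemFileFilter and Filter, and the file names items.txt and items_filtered.txt, are placeholders chosen for illustration. It also assumes the file's encoding can be detected by File.ReadLines (UTF-8, or UTF-16 with a byte order mark).

```csharp
using System;
using System.Collections.Generic;
using System.IO;

static class ItemFileFilter
{
    // Hypothetical helper: returns "<second integer><tab><path>" for every
    // line that has both a numeric second field and an item\....ddj field.
    public static List<string> Filter(IEnumerable<string> lines)
    {
        var kept = new List<string>();
        foreach (string line in lines)
        {
            string[] fields = line.Split('\t');
            int id;
            if (fields.Length < 2 || !int.TryParse(fields[1], out id)) continue;

            // Array.Find returns null when no field matches the pattern.
            string path = Array.Find(fields, f =>
                f.StartsWith(@"item\", StringComparison.Ordinal) &&
                f.EndsWith(".ddj", StringComparison.Ordinal));

            if (path != null) kept.Add(id + "\t" + path);
        }
        return kept;
    }

    static void Main()
    {
        // Placeholder file names; replace them with the real paths.
        File.WriteAllLines("items_filtered.txt", Filter(File.ReadLines("items.txt")));
    }
}
```

Writing to a separate output file rather than overwriting the input keeps the original data available if the filter turns out to drop too much.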

Justin Ethier
A: 

What you want to do is write a program that will parse the file, taking out the parts you want and formatting the output in such a manner that you can easily paste it into your code (or alternatively, have your code load the file each time it runs).

If you take one of the "read a file" samples, read in each line and split it on tabs:

string line = reader.ReadLine();

string[] fields = line.Split('\t'); // will give an array of strings

fields[0] and fields[1] hold the first two values on the line; fields[1] is the second integer you asked for.

Binary Worrier
+4  A: 

Another solution, this time making use of regular expressions:

using System.Text.RegularExpressions;

...

// Also requires: using System.IO;
using (StreamReader reader = File.OpenText("filename.txt")) {
    string line;
    while ((line = reader.ReadLine()) != null) {
        Match m = Regex.Match(line, @"^\d+\t(\d+)\t.+?\t(item\\[^\t]+\.ddj)");
        if (m.Success) {
            int myInt = int.Parse(m.Groups[1].Value);
            string path = m.Groups[2].Value;

            // At this point, `myInt` and `path` contain the values we want
            // for the current line. We can then store those values or print them,
            // or anything else we like.
        }
    }
}

That expression's a little complex, so here it is broken down:

^        Start of string
\d+      "\d" means "digit" - 0-9. The "+" means "one or more."
         So this means "one or more digits."
\t       This matches a tab.
(\d+)    This also matches one or more digits. This time, though, we capture it
         using parentheses. This means we can access it through the Groups property.
\t       Another tab.
.+?      "." means "anything." So "one or more of anything". In addition, it's lazy.
         This is to stop it grabbing everything in sight - it'll only grab as much
         as it needs to for the regex to work.
\t       Another tab.

(item\\[^\t]+\.ddj)
    Here's the meat. This matches: "item\<one or more of anything but a tab>.ddj"
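
To see the expression at work on a single line, here is a small self-contained sketch. The sample line is a shortened, made-up version of the lines in the question, and the names RegexDemo and TryExtract are illustrative only.

```csharp
using System;
using System.Text.RegularExpressions;

static class RegexDemo
{
    // Same pattern as in the answer, compiled once and reused for every line.
    static readonly Regex ItemPattern =
        new Regex(@"^\d+\t(\d+)\t.+?\t(item\\[^\t]+\.ddj)");

    // Hypothetical helper: extracts the second integer and the .ddj path.
    public static bool TryExtract(string line, out int id, out string path)
    {
        id = 0;
        path = null;

        Match m = ItemPattern.Match(line);
        if (!m.Success) return false;

        id = int.Parse(m.Groups[1].Value); // first capture: the second integer
        path = m.Groups[2].Value;          // second capture: the item\....ddj path
        return true;
    }

    static void Main()
    {
        // Shortened sample line: a .bsr path comes first, then the .ddj path.
        string line = "1\t4\tITEM_ETC_HP_POTION_01\txxx\t" +
                      "item\\etc\\drop_ch_bag.bsr\titem\\etc\\hp_potion_01.ddj\txxx";

        int id;
        string path;
        if (TryExtract(line, out id, out path))
            Console.WriteLine(id + " " + path); // 4 item\etc\hp_potion_01.ddj
    }
}
```

Because [^\t]+ cannot cross a tab, the .bsr path is skipped rather than merged with the following field, and a line with no .ddj path at all (such as the gold item in the question) simply does not match.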
Samir Talwar
I don't know which of your answers to accept, both are working great. I like this one more, because you explained why and I had never seen that before!
John
If you like regular expressions, I'd recommend using something like Perl next time you want to process files like this. It's designed around them, and you can use it to easily format your file in a way you like.
Samir Talwar
Samir Talwar: I reckon you should teach regular expressions. The way you explained everything was just brilliant. I've never had a teacher that's been so detailed! +1
lucifer
j-t-s: Cheers for the compliment, mate. I do try. :-)
Samir Talwar
Thanks for such a detailed breakdown!
Angelina
A: 

As already mentioned, I would highly recommend using regular expressions (in System.Text.RegularExpressions) to get this kind of job done.

In combo with a solid tool like RegexBuddy, you are looking at handling any complex text record parsing situations, as well as getting results quickly. The tool makes it real easy.

Hope that helps.

Vin
A: 

One way that I've found really useful in situations like this is to go old-school and use the Jet OLEDB provider, together with a schema.ini file to read large tab-delimited files in using ADO.Net. Obviously, this method is really only useful if you know the format of the file to be imported.

// Requires: using System.Data; using System.Data.OleDb; using System.IO;
public void ImportCsvFile(string filename)
{
    FileInfo file = new FileInfo(filename);

    // The Data Source is the folder; the file name is used as the table name.
    using (OleDbConnection con =
            new OleDbConnection("Provider=Microsoft.Jet.OLEDB.4.0;Data Source=\"" +
            file.DirectoryName + "\";" +
            "Extended Properties='text;HDR=Yes;FMT=TabDelimited';"))
    {
        using (OleDbCommand cmd = new OleDbCommand(string.Format
                                  ("SELECT * FROM [{0}]", file.Name), con))
        {
            con.Open();

            // Using a DataReader to process the data
            using (OleDbDataReader reader = cmd.ExecuteReader())
            {
                while (reader.Read())
                {
                    // Process the current reader entry...
                }
            }

            // Using a DataTable to process the data
            using (OleDbDataAdapter adp = new OleDbDataAdapter(cmd))
            {
                DataTable tbl = new DataTable("MyTable");
                adp.Fill(tbl);

                foreach (DataRow row in tbl.Rows)
                {
                    // Process the current row...
                }
            }
        }
    }
}

Once you have the data in a nice format like a datatable, filtering out the data you need becomes pretty trivial.
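
As a hedged illustration of that filtering step, the sketch below builds a small DataTable in memory instead of loading one through OLEDB. The column names "Id" and "Icon" are invented for the example (the real names depend on the header row or schema.ini), and DataTableFilter and SelectIcons are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Data;

static class DataTableFilter
{
    // Hypothetical helper: returns (id, path) pairs for rows whose path
    // column starts with item\ and ends with .ddj.
    public static List<KeyValuePair<int, string>> SelectIcons(
        DataTable table, string idColumn, string pathColumn)
    {
        var result = new List<KeyValuePair<int, string>>();
        foreach (DataRow row in table.Rows)
        {
            // "as string" yields null for DBNull, so empty cells are skipped.
            string path = row[pathColumn] as string;
            if (path == null) continue;

            if (path.StartsWith(@"item\", StringComparison.Ordinal) &&
                path.EndsWith(".ddj", StringComparison.Ordinal))
            {
                result.Add(new KeyValuePair<int, string>(
                    Convert.ToInt32(row[idColumn]), path));
            }
        }
        return result;
    }

    static void Main()
    {
        // Stand-in for the table filled by the data adapter.
        var table = new DataTable("MyTable");
        table.Columns.Add("Id", typeof(int));
        table.Columns.Add("Icon", typeof(string));
        table.Rows.Add(1, "xxx");
        table.Rows.Add(4, @"item\etc\hp_potion_01.ddj");

        foreach (var pair in SelectIcons(table, "Id", "Icon"))
            Console.WriteLine(pair.Key + "\t" + pair.Value);
    }
}
```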

Mark Green