abc  = tamaz feeo maa roo key gaera porla
Xyz = gippaza eka jaguar ammaz te sanna.

I want to make a struct:

public struct word
{
 public string Word;
 public string Definition;
}

How can I parse these lines and build a list of <word> in C#?

Thanks for the help, but the input is free text, and a definition is not guaranteed to fit on one line; it may span several. How should I handle newlines?

+4  A: 

Read the input line by line and split by the equal sign.

class Entry
{
    private string term;
    private string definition;

    // Must be public: a constructor without an access modifier is private in C#
    public Entry(string term, string definition)
    {
        this.term = term;
        this.definition = definition;
    }
}

// ...

string[] data = line.Split('=');
string word = data[0].Trim();
string definition = data[1].Trim();

Entry entry = new Entry(word, definition);
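
The follow-up in the question asks what to do when a definition runs over several lines. A minimal sketch of the line-by-line approach, assuming (this is not stated in the question) that any line without an equal sign continues the previous definition. The helper name WordParser is hypothetical; the word struct is the one from the question.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public struct word
{
    public string Word;
    public string Definition;
}

// Hypothetical helper, not part of any library.
public static class WordParser
{
    // Assumption: a line without '=' continues the previous definition.
    public static List<word> Parse(string text)
    {
        var result = new List<word>();
        using (var reader = new StringReader(text))
        {
            string line;
            // ReadLine handles "\n", "\r" and "\r\n" line endings.
            while ((line = reader.ReadLine()) != null)
            {
                if (line.Trim().Length == 0)
                    continue; // skip blank lines

                int eq = line.IndexOf('=');
                if (eq >= 0)
                {
                    // New entry: text before the first '=' is the term.
                    word w;
                    w.Word = line.Substring(0, eq).Trim();
                    w.Definition = line.Substring(eq + 1).Trim();
                    result.Add(w);
                }
                else if (result.Count > 0)
                {
                    // Continuation line: append to the last definition.
                    word last = result[result.Count - 1];
                    last.Definition = (last.Definition + " " + line.Trim()).Trim();
                    result[result.Count - 1] = last; // structs are copies, so write back
                }
                // A continuation line before any entry is ignored.
            }
        }
        return result;
    }
}
```

Calling WordParser.Parse(File.ReadAllText(path)) would then return one word per "term = definition" pair, with wrapped lines folded into the definition.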
thelost
Just add .Trim() to get rid of any extra whitespace.
Sruly
@Sruly good point, thanks!
thelost
A: 

Use Regular Expressions

tom3k
I'm not downvoting your answer, but I think this would be a bit over the top.
John
+1  A: 
// Split at an = sign. Take at most two parts (word and definition); 
//    ignore any = signs in the definition
string[] parts = line.Split(new[] { '=' }, 2);

word w = new word();
w.Word = parts[0].Trim();

// If the definition is missing then parts.Length == 1
if (parts.Length == 1)
    w.Definition = string.Empty;
else
    w.Definition = parts[1].Trim();

words.Add(w);
Tim Robinson
No need for array. `String.Split` also accepts single `char` as param
abatishchev
Only the `params char[]` overload accepts a single `char`. The other overloads (such as the `char[], int` I'm using above) need an explicit array.
Tim Robinson
Hi, not sure why this was down voted?
Tim Robinson
+2  A: 

This can also be done using a very simple LINQ query:

var definitions =
    from line in File.ReadAllLines(file)
    let parts = line.Split('=')
    select new word
        {
            Word = parts[0].Trim(),
            Definition = parts[1].Trim()
        };
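
The query above throws an IndexOutOfRangeException on a line with no equal sign, and it drops everything after a second one. A hedged variant that skips malformed lines and keeps later equal signs in the definition might look like this; the WordQuery class and FromLines method are hypothetical names, and the word struct is the one from the question.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public struct word
{
    public string Word;
    public string Definition;
}

// Hypothetical helper, not part of any library.
public static class WordQuery
{
    public static List<word> FromLines(IEnumerable<string> lines)
    {
        var query =
            from line in lines
            // Split into at most two parts so '=' inside a definition survives.
            let parts = line.Split(new[] { '=' }, 2)
            // Skip lines that have no '=' at all.
            where parts.Length == 2
            select new word
            {
                Word = parts[0].Trim(),
                Definition = parts[1].Trim()
            };
        return query.ToList();
    }
}
```

It can be fed directly from a file, for example WordQuery.FromLines(File.ReadAllLines(file)).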
Ronald Wildenberg
A really concise solution, upvoted. (What do you think of the regexp I proposed?)
Marcello Faga
+1  A: 

Using a regular expression, you can proceed in two ways, depending on your source input.


Example 1

Assuming you have read your source and stored each line in an array or list:

string[] input = { "abc  = tamaz feeo maa roo key gaera porla", "Xyz = gippaza eka jaguar ammaz te sanna." };

Regex mySplit = new Regex("(\\w+)\\s*=\\s*((\\w+).*)");

List<word> mylist = new List<word>();

foreach (string wordDef in input)
{
    Match myMatch = mySplit.Match(wordDef);

    // Skip lines that do not have the "word = definition" shape
    if (!myMatch.Success)
        continue;

    word myWord;
    myWord.Word = myMatch.Groups[1].Captures[0].Value;
    myWord.Definition = myMatch.Groups[2].Captures[0].Value;

    mylist.Add(myWord);
}

Example 2

Assuming you have read your whole source into a single string (with every line terminated by the line break character '\n'), you can use the same regexp "(\w+)\s*=\s*((\w+).*)", but this way:

string inputs = "abc  = tamaz feeo maa roo, key gaera porla\r\nXyz = gippaza eka jaguar; ammaz: te sanna.";

MatchCollection myMatches = mySplit.Matches(inputs);

foreach (Match singleMatch in myMatches)
{

    word myWord;

    myWord.Word = singleMatch.Groups[1].Captures[0].Value;
    // '.' stops at '\n' but not at '\r', so trim the trailing '\r' of "\r\n" input
    myWord.Definition = singleMatch.Groups[2].Captures[0].Value.Trim();

    mylist.Add(myWord);
}

Lines that match or do not match the regexp "(\w+)\s*=\s*((\w+).*)":

  • "abc = tamaz feeo maa roo key gaera porla,qsdsdsqdqsd\n" --> Match!
  • "Xyz= gippaza eka jaguar ammaz te sanna. sdq=sqds \n" --> Match! The description may contain spaces (and further = signs) too.
  • "qsdqsd=\nsdsdsd\n" --> Match! A pair split across two lines works too.
  • "sdqsd=\n" --> DOES NOT match! (missing description)
  • "= sdq sqdqsd.\n" --> DOES NOT match! (missing word)
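
To try these cases out, here is a small self-contained sketch of the whole-string approach. It uses the same pattern written as a verbatim string and trims each definition, since '.' does not match '\n' but does match a '\r' left over from "\r\n" line endings. The WordRegex class name is hypothetical; the word struct is the one from the question.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public struct word
{
    public string Word;
    public string Definition;
}

// Hypothetical helper, not part of any library.
public static class WordRegex
{
    // Same pattern as above; @"..." avoids doubling the backslashes.
    private static readonly Regex Pattern =
        new Regex(@"(\w+)\s*=\s*((\w+).*)");

    public static List<word> Parse(string text)
    {
        var result = new List<word>();
        foreach (Match m in Pattern.Matches(text))
        {
            word w;
            w.Word = m.Groups[1].Value;
            // Trim removes a trailing '\r' and any other surrounding whitespace.
            w.Definition = m.Groups[2].Value.Trim();
            result.Add(w);
        }
        return result;
    }
}
```

With this sketch, WordRegex.Parse("sdqsd=\n") returns an empty list, while the two-line sample string from Example 2 yields two entries.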
Marcello Faga
I would use an @ to inhibit backslash expansion, which makes the regular expression easier to read: Regex mySplit = new Regex(@"(\w+)\s*=\s*((\w+).*)");
mico
You're right, Mico. I slipped up again; I wrote those lines in a hurry.
Marcello Faga