I have a block of text that I'm taking from a GEDCOM (Here and Here) file.

The text is flat and basically broken into "nodes"

I am splitting each node on the \r character, which subdivides it into its parts (the number of "lines" can vary).

I know the 0 address will always be the ID, but after that anything can be anywhere, so I want to test each cell of the array to see if it contains the correct tag for me to process.

Here is an example of what two nodes look like:

0 @ind23815@ INDI <<<<<<<<<<<<<<<<<<< Start of node 1
1 NAME Lawrence /Hucstepe/
2 DISPLAY Lawrence Hucstepe
2 GIVN Lawrence
2 SURN Hucstepe
1 POSITION -850,-210
2 BOUNDARY_RECT (-887,-177),(-813,-257)
1 SEX M
1 BIRT 
2 DATE 1521
1 DEAT Y
2 DATE 1559
1 NOTE     * Born: Abt 1521, Kent, England
2 CONT     * Marriage: Jane Pope 17 Aug 1546, Kent, England
2 CONT     * Died: Bef 1559, Kent, England
2 CONT 
1 FAMS @fam08318@
0 @ind23816@ INDI  <<<<<<<<<<<<<<<<<<<<<<< Start of Node 2
1 NAME Jane /Pope/
2 DISPLAY Jane Pope
2 GIVN Jane
2 SURN Pope
1 POSITION -750,-210
2 BOUNDARY_RECT (-787,-177),(-713,-257)
1 SEX F
1 BIRT 
2 DATE 1525
1 DEAT Y
2 DATE 1609
1 NOTE     * Born: Abt 1525, Tenterden, Kent, England
2 CONT     * Marriage: Lawrence Hucstepe 17 Aug 1546, Kent, England
2 CONT     * Died: 23 Oct 1609
2 CONT 
1 FAMS @fam08318@
0 @ind23817@ INDI  <<<<<<<<<<< start of Node 3

So when I'm done, I have an array that looks like this:

address , string
0 = "1 NAME Lawrence /Hucstepe/"
1 = "2 DISPLAY Lawrence Hucstepe"
2 = "2 GIVN Lawrence"
3 = "2 SURN Hucstepe"
4 = "1 POSITION -850,-210"
5 = "2 BOUNDARY_RECT (-887,-177),(-813,-257)"
6 = "1 SEX M"
7 = "1 BIRT "
8 = "1 FAMS @fam08318@"

So my question is: what is the best way to search the above array to see which cell has the SEX tag, the NAME tag, or the FAMS tag?

This is the code I have:

private int FindIndexinArray(string[] Arr, string search)
{
    int Val = -1;
    for (int i = 0; i < Arr.Length; i++)
    {
        if (Arr[i].Contains(search))
        {
            Val = i;
        }
    }
    return Val;
}

But it seems inefficient, because I end up calling it twice to make sure it doesn't return -1.

Like so:

if (FindIndexinArray(SubNode, "1 BIRT ") != -1)
{
    // add birthday to Struct
    I.BirthDay = SubNode[FindIndexinArray(SubNode, "1 BIRT ") + 1].Replace("2 DATE ", "").Trim();
}

Sorry this is a longer post, but hopefully you guys will have some expert advice.

+3  A: 

What about a simple regular expression?

^(\d)\s=\s\"\d\s(SEX|BIRT|FAMS){1}.*$

First group captures the address, second group the tag.

Also, it might be quicker to dump all array items into a string and do your regex on the whole lot at once.

Si
I tried something like this already by searching line by line, but the issue I ran into (I didn't touch on this in my post above) is that each line starts with a number, and that number refers to the line above it. So 0 name, 1 birth, 2 date means that the date of birth for that person is on the level-2 line, but once you get into multiple tags it got complicated quickly. I appreciate the help though; I'll see if I can work it in.
Crash893
I meant applying it to the array (0 = ..., 1 = ...), not the whole file. You could probably use regex to parse the whole file quite easily. Anyway, FindAll will also work, so glad you got it sorted, but quick question: what happens if the actual data contains your tag? ;-)
Si
Basically, what if someone's name is BIRT or their last name is FAMS? Well, then I might be screwed.
Crash893
And that's where the regex would work where Contains would fail ;-)
Si
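To illustrate the point made in the comments above, here is a rough sketch of a tag search anchored to the start of the line, so a tag name appearing inside the data is ignored. The helper name FindTagIndex and the sample name "John /BIRT/" are made up for illustration, and the pattern assumes each line has the form "level TAG value". It is written with top-level statements, so it assumes C# 9 or later.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper: index of the first line whose tag field equals `tag`, or -1.
// The pattern requires: start of line, level digits, whitespace, the tag, a word boundary.
static int FindTagIndex(string[] lines, string tag)
{
    var pattern = new Regex(@"^\d+\s+" + Regex.Escape(tag) + @"\b");
    return Array.FindIndex(lines, line => pattern.IsMatch(line));
}

// Made-up sample: the surname contains "BIRT", which fools a plain Contains("BIRT").
string[] subNode =
{
    "1 NAME John /BIRT/",
    "1 SEX M",
    "1 BIRT ",
    "2 DATE 1521"
};

Console.WriteLine(Array.FindIndex(subNode, line => line.Contains("BIRT"))); // 0, the NAME line
Console.WriteLine(FindTagIndex(subNode, "BIRT"));                            // 2, the BIRT line
```

Because the tag has to sit directly after the level number, a value such as a surname can no longer be mistaken for a tag.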
+2  A: 

You can use the static FindAll method of the Array class. It returns the matching strings themselves rather than their indexes, though, if that works for you.

string[] test = { "Sex", "Love", "Rock and Roll", "Drugs", "Computer"};
Array.FindAll(test, item => item.Contains("Sex") || item.Contains("Drugs") || item.Contains("Computer"));

The => indicates a lambda expression, which is basically an anonymous method written inline. If the lambda gives you the creeps, you can also do this:

//Declare a method 

     private bool HasTag(string s)
     {
         return s.Contains("Sex") || s.Contains("Drugs") || s.Contains("Computer");
     }

     string[] test = { "Sex", "Love", "Rock and Roll", "Drugs", "Computer"};
     Array.FindAll(test, HasTag);
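
Since the question asks for the position of the tag rather than the string, a possible variation is Array.FindIndex, which takes the same kind of predicate and returns the index of the first match, or -1 if there is none. The sketch below is only an illustration: IsSexLine is a made-up name, and StartsWith is used on the assumption that the tag always sits at the beginning of the line. It uses top-level statements, so it assumes C# 9 or later.

```csharp
using System;

// Named method usable wherever a Predicate<string> is expected.
static bool IsSexLine(string s)
{
    return s.StartsWith("1 SEX", StringComparison.Ordinal);
}

string[] subNode = { "1 NAME Lawrence /Hucstepe/", "2 GIVN Lawrence", "1 SEX M", "1 BIRT " };

// The same test written two ways: as a lambda and as a named method.
int viaLambda = Array.FindIndex(subNode, s => s.StartsWith("1 SEX", StringComparison.Ordinal));
int viaMethod = Array.FindIndex(subNode, IsSexLine);

Console.WriteLine(viaLambda); // 2
Console.WriteLine(viaMethod); // 2

// No line starts with "1 FAMS" in this sample, so the result is -1.
Console.WriteLine(Array.FindIndex(subNode, s => s.StartsWith("1 FAMS", StringComparison.Ordinal)));
```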
Pat
That just might do it. Could you explain the => syntax for me? I'm not familiar with that one.
Crash893
A: 

"But it seems inefficient because i end up calling it twice to make sure it doesnt return a -1"

Copy the returned value to a variable before you test to prevent multiple calls.

int IndexResults = FindIndexinArray(SubNode, "1 BIRT ");
if (IndexResults != -1)
{
    // add birthday to Struct; the date is on the line after the BIRT tag
    I.BirthDay = SubNode[IndexResults + 1].Replace("2 DATE ", "").Trim();
}
A: 

The for loop in the FindIndexinArray method should break once it finds a match, if you are only interested in the first match.
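
A minimal sketch of that change, assuming only the first match matters. FindFirstIndex is a made-up name for the modified method, and the sample lines are taken from the second node in the question. It uses top-level statements, so it assumes C# 9 or later.

```csharp
using System;

// Variant of the question's FindIndexinArray that returns as soon as a match is found.
static int FindFirstIndex(string[] arr, string search)
{
    for (int i = 0; i < arr.Length; i++)
    {
        if (arr[i].Contains(search))
        {
            return i; // stop at the first match
        }
    }
    return -1; // no match
}

string[] subNode = { "1 NAME Jane /Pope/", "1 BIRT ", "2 DATE 1525", "1 DEAT Y", "2 DATE 1609" };

int birt = FindFirstIndex(subNode, "1 BIRT ");
if (birt != -1 && birt + 1 < subNode.Length)
{
    Console.WriteLine(subNode[birt + 1].Replace("2 DATE ", "").Trim()); // 1525
}

// The loop in the question keeps going and would return 4 (the last "2 DATE" line).
Console.WriteLine(FindFirstIndex(subNode, "2 DATE")); // 2
```

Besides saving iterations, returning early changes the result when a tag appears more than once: the loop in the question returns the last matching index, not the first.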