Hey guys, I have a text file that I have divided into 4 parts. I want to search each part for the words that appear in it and give each word a score.

Example:

welcome to the national basketball finals,the basketball teams here today have come a long way. without much delay lets play basketball.

I want to return national = 1, since it appears in only one part, and so on for the other words.

I am working on determining text context using word position.

I am working with C# and am not very good at text processing. Basically, a word's score is the number of sections it appears in: 4 if it appears in all 4 sections, 3 if it appears in 3, 2 if it appears in 2, and 1 if it appears in only 1.

Thanks in advance.

So far I have this:

    var s = "welcome to the national basketball finals,the basketball teams here today have come a long way. without much delay lets play basketball. ";

    var numberOfParts = 4;
    var eachPartLength = s.Length / numberOfParts;
    var parts = new List<string>();

    // Split on non-word characters and drop empty strings.
    // ToList() stops the split from being re-run on every Count()/ElementAt() call.
    var words = Regex.Split(s, @"\W").Where(w => w.Length > 0).ToList();
    var wordsIndex = 0;

    for (int i = 0; i < numberOfParts; i++)
    {
        var sb = new StringBuilder();

        // Fill this part with whole words until it reaches the target length
        while (sb.Length < eachPartLength && wordsIndex < words.Count)
        {
            sb.AppendFormat("{0} ", words[wordsIndex]);
            wordsIndex++;
        }

        // here you have the part
        Response.Write(string.Format("[{0}] {1}", i, sb));
        parts.Add(sb.ToString().Trim());
    }

    // Once all parts are built: the distinct words of each part
    var allwords = parts.SelectMany(p => p.Split(' ').Distinct());

    // Words found in every part (whole-word match, not substring).
    // This does not yet give a per-word score of 1 to 4.
    var wordsInAllParts = allwords.Where(w => parts.All(p => p.Split(' ').Contains(w))).Distinct();
+2  A: 

This question is very difficult to interpret. I don't fully understand your goal and it is my suspicion that you might not either.

In the absence of a clear requirement, there is no way to give a specific answer, so I will give a generic one:

Try writing a test that clearly specifies the exact behavior you want. You've got the beginnings of one with your sample string and the result you want but it's not unambiguous what you are looking for.

Make a test that, when it passes, demonstrates that one of the required behaviors is there. If that doesn't help you get a solution to the problem, come back and edit this question or make a new one that includes the test.

At the very least, you will be able to harvest better answers from this site.
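
As a rough illustration of what such a test might look like, here is a minimal sketch. It uses plain checks rather than a test framework, so that it stays self-contained. `ScoreWord` and `Expect` are hypothetical names, and the division of the sample sentence into four parts is an assumption, since the question does not say exactly where the parts should break.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper: the number of parts in which `word` occurs as a whole word.
static int ScoreWord(string[] parts, string word) =>
    parts.Count(part => Regex.Split(part.ToLowerInvariant(), @"\W+")
                             .Contains(word.ToLowerInvariant()));

// Minimal stand-in for a test framework's assertion.
static void Expect(int expected, int actual, string label)
{
    if (expected != actual)
        throw new Exception($"{label}: expected {expected}, got {actual}");
    Console.WriteLine($"ok: {label} = {actual}");
}

// Assumed division of the sample sentence into 4 parts.
var parts = new[]
{
    "welcome to the national basketball",
    "finals,the basketball teams here today",
    "have come a long way. without much",
    "delay lets play basketball."
};

// Each line pins down one required behavior.
Expect(1, ScoreWord(parts, "national"), "national");     // appears in one part only
Expect(3, ScoreWord(parts, "basketball"), "basketball"); // appears in parts 1, 2 and 4
Expect(2, ScoreWord(parts, "the"), "the");               // appears in parts 1 and 2
Expect(0, ScoreWord(parts, "football"), "football");     // never appears
```

Writing the parts out by hand like this forces a decision about where each part ends, which is one of the things the question leaves open.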

@MaxGuernseyIII and Ahmad Mageed: thanks a lot guys, your comments got me thinking, and I solved my problem. I added the parts to an ArrayList, then used arraylist[i].ToString().IndexOf(string) to search each element of the ArrayList.
ryder1211212
@ryder1211212: I'm glad your problem got solved.
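
The fix mentioned in the comments is only described in passing, so the following is a guess at its general shape rather than the actual code. It keeps the parts in a list and counts, for every word, how many parts contain it. It deliberately swaps the `IndexOf` substring search for whole-word comparison, because `IndexOf("a")` would also match inside "national". `SplitIntoParts` and `ScoreWords` are hypothetical names, and splitting by approximate character length follows the loop in the question.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper: split text into parts of roughly equal character length,
// breaking only between words (the same idea as the loop in the question).
static List<string[]> SplitIntoParts(string text, int numberOfParts)
{
    var words = Regex.Split(text.ToLowerInvariant(), @"\W+")
                     .Where(w => w.Length > 0)
                     .ToArray();
    var targetLength = text.Length / numberOfParts;
    var parts = new List<string[]>();
    var index = 0;

    for (var i = 0; i < numberOfParts; i++)
    {
        var current = new List<string>();
        var length = 0;
        var isLastPart = i == numberOfParts - 1;

        // Add whole words until the target length is reached;
        // the last part takes whatever remains.
        while (index < words.Length && (length < targetLength || isLastPart))
        {
            current.Add(words[index]);
            length += words[index].Length + 1; // +1 for the separating space
            index++;
        }
        parts.Add(current.ToArray());
    }
    return parts;
}

// Hypothetical helper: a word's score is the number of parts that contain it.
static Dictionary<string, int> ScoreWords(List<string[]> parts) =>
    parts.SelectMany(part => part.Distinct()) // count each word once per part
         .GroupBy(word => word)
         .ToDictionary(group => group.Key, group => group.Count());

var s = "welcome to the national basketball finals,the basketball teams here today have come a long way. without much delay lets play basketball. ";
var scores = ScoreWords(SplitIntoParts(s, 4));

foreach (var pair in scores.OrderByDescending(p => p.Value).ThenBy(p => p.Key))
    Console.WriteLine($"{pair.Key} = {pair.Value}");
```

With the sample sentence and this particular way of splitting, basketball should score 3 and the should score 2, with every other word, including national, scoring 1. A different splitting rule could move words between parts and change those numbers.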