I'm trying to create a "snippet" from a paragraph. I have a long paragraph of text with a word highlighted in the middle. I want to get the line containing the word, the line before it, and the line after it.

I have the following pieces of information:

  • The text (in a string)
  • The lines are delimited by a NEWLINE character \n
  • I have the index into the string of the text I want to highlight

A couple other criteria:

  • If my word falls on the first line of the paragraph, it should show the first 3 lines
  • If my word falls on the last line of the paragraph, it should show the last 3 lines
  • It should show the entire paragraph in the degenerate cases (the paragraph only has 1 or 2 lines)

Here's an example:

This is the 1st line of CAT text in the paragraph
This is the 2nd line of BIRD text in the paragraph
This is the 3rd line of MOUSE text in the paragraph
This is the 4th line of DOG text in the paragraph
This is the 5th line of RABBIT text in the paragraph

For example, if my index points to BIRD, it should show lines 1, 2, & 3 as one complete string like this:

This is the 1st line of CAT text in the paragraph
This is the 2nd line of BIRD text in the paragraph
This is the 3rd line of MOUSE text in the paragraph

If my index points to DOG, it should show lines 3, 4, & 5 as one complete string like this:

This is the 3rd line of MOUSE text in the paragraph
This is the 4th line of DOG text in the paragraph
This is the 5th line of RABBIT text in the paragraph

etc.

Anybody want to help tackle this?

A: 

There are a few ways one can handle this:

First Method: Use String.IndexOf() and String.LastIndexOf().

You can find where the currently selected word is by using the TextBox.SelectionStart property. From that position, use LastIndexOf to search backward for '\n'. Don't stop at the first one you find: it only marks the start of the current line, so search again from that location to get the beginning of the previous line. Then do the same going forward from the selection point, using IndexOf to find the '\n' that ends the current line. Once again, don't use the first one you find; repeat the search starting from it to get the end of the following line. Then simply take the substring of the text between the two positions you found.
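As a rough illustration of this first method, here is a minimal sketch that works from a character index instead of a TextBox. The class name `IndexSnippet` and its helper methods are made up for this example, and it assumes the text has no trailing newline and that the index is a valid position in the string.

```csharp
using System;

public static class IndexSnippet
{
    // Hypothetical helper: returns the line containing text[index] together
    // with its neighbours, aiming for three lines in total.
    public static string Extract(string text, int index)
    {
        if (text == null) throw new ArgumentNullException(nameof(text));
        if (index < 0 || index >= text.Length)
            throw new ArgumentOutOfRangeException(nameof(index));

        // Start of the current line: one past the previous '\n' (or 0).
        int start = index > 0 ? text.LastIndexOf('\n', index - 1) + 1 : 0;

        // End of the current line: position of the next '\n' (or end of text).
        int end = text.IndexOf('\n', index);
        if (end == -1) end = text.Length;

        bool wentBack = TryExtendBack(text, ref start);
        bool wentForward = TryExtendForward(text, ref end);

        // On the first or last line, take a second line from the other side.
        if (!wentBack) TryExtendForward(text, ref end);
        if (!wentForward) TryExtendBack(text, ref start);

        return text.Substring(start, end - start);
    }

    // Moves start back to the beginning of the previous line, if there is one.
    private static bool TryExtendBack(string text, ref int start)
    {
        if (start == 0) return false;
        // text[start - 1] is the '\n' that ends the previous line, so the
        // backward search has to begin one character before it.
        start = start >= 2 ? text.LastIndexOf('\n', start - 2) + 1 : 0;
        return true;
    }

    // Moves end forward to the end of the next line, if there is one.
    private static bool TryExtendForward(string text, ref int end)
    {
        if (end >= text.Length) return false;
        // text[end] is a '\n'; the next line starts right after it.
        int next = text.IndexOf('\n', end + 1);
        end = next == -1 ? text.Length : next;
        return true;
    }
}
```

Calling `IndexSnippet.Extract(text, text.IndexOf("BIRD"))` on the example paragraph should give lines 1 to 3 as one string, without building an array of lines.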

Second Method: Use String.Split() on the '\n' character. This creates an array of strings, each one containing a different line of the text, in order. Find the index of the line the text is in, and then simply take the array elements for the line before it, the line itself, and the line after it. Hopefully these two methods are clear enough for you to work out the code. If you are still stuck, let me know.
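For the second method, the one step that isn't spelled out is getting from the character index to a line index. One way (a sketch only; `SplitSnippet` is a made-up name, and the index is assumed to be a valid position in the string) is to count the '\n' characters that come before the index:

```csharp
using System;

public static class SplitSnippet
{
    // Hypothetical helper: maps a character index to a line index, then
    // returns that line with the one before and the one after.
    public static string Extract(string text, int index)
    {
        if (text == null) throw new ArgumentNullException(nameof(text));
        if (index < 0 || index >= text.Length)
            throw new ArgumentOutOfRangeException(nameof(index));

        string[] lines = text.Split('\n');

        // The line number equals the number of '\n' characters before index.
        int lineIndex = 0;
        for (int i = 0; i < index; i++)
            if (text[i] == '\n') lineIndex++;

        int first = lineIndex - 1;
        int last = lineIndex + 1;

        // Match on the first line: show the first three lines instead.
        if (first < 0) { first = 0; last = 2; }
        // Match on the last line: show the last three lines instead.
        if (last > lines.Length - 1) { last = lines.Length - 1; first = last - 2; }
        // Paragraphs with only one or two lines: show everything.
        if (first < 0) first = 0;

        return string.Join("\n", lines, first, last - first + 1);
    }
}
```

The three `if` statements correspond to the first-line, last-line, and short-paragraph cases from the question.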

NebuSoft
A: 

Alright, lemme have a crack.

I think the first thing I would do is split everything into an array, simply because then we have an easy way to "count" the lines.

string[] lines = fullstring.Split('\n');

Once we have that, unfortunately I don't know of any IndexOf that goes through each element of an array. There probably is one, but without trawling through the internet, I would simply go:

int i = -1;
string animal = "BIRD";

for (int n = 0; n < lines.Length; n++)
{
    if (lines[n].IndexOf(animal) > -1)
    {
        i = n; // remember the line we found the animal on
        break;
    }
}
// if i is still -1 here, we didn't find the animal
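For what it's worth, there is a built-in that searches an array this way: Array.FindIndex takes a predicate and returns -1 when nothing matches. A minimal sketch (the `LineSearch` wrapper and its parameter names are just for illustration):

```csharp
using System;

public static class LineSearch
{
    // Returns the zero-based index of the first line containing the word,
    // or -1 when no line contains it. The comparison is case-sensitive.
    public static int FindLine(string[] lines, string word)
    {
        return Array.FindIndex(lines, line => line.Contains(word));
    }
}
```

With the example paragraph, `LineSearch.FindLine(fullstring.Split('\n'), "BIRD")` gives 1, and a word that isn't in the text gives -1.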

OK, so now we have the line. All we need to do is...

if (i == 0)
{
    Console.WriteLine(lines[0]);
    Console.WriteLine(lines[1]);
    // etc.
}
else if (i == lines.Length - 1)
{
    // this means the last array index
}
else
{
    // otherwise we are in the middle, so just write out i - 1, i, i + 1
}

I know that is messy as hell. But that's how I would solve the issue.

Pyronaut
+2  A: 

Using the LINQ extension methods to get the right strings:

string[] lines = text.Split('\n');

// Find the right line to work with
int position = 0;
for (int i = 0; i < lines.Count(); i++)
  if (lines[i].Contains(args[0]))
    position = i - 1;

// Get in range if we had a match in the first line
if (position == -1)
  position = 0;

// Adjust the line index so we have 3 lines to work with
if (position > lines.Count() - 3)
  position = lines.Count() - 3;

string result = String.Join("\n", lines.Skip(position).Take(3).ToArray());

This can of course be optimized a bit by quitting the for loop as soon as the index has been found, and probably in a number of other ways. You could probably even LINQify it so you never need to actually store that extra array, but I can't think of a good way to do that right now.

An alternative for the checks on position could be something like position = Math.Max(0,Math.Min(position, lines.Count() - 3)); - which would handle both of them at once.
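To make the clamping alternative concrete, here is a small sketch. The `ClampedWindow` class and the `matchLine` parameter are names invented for this example; `matchLine` is assumed to be the zero-based index of the line that contains the word.

```csharp
using System;
using System.Linq;

public static class ClampedWindow
{
    // Hypothetical helper: given the index of the matching line, returns
    // that line plus its neighbours, three lines in total where possible.
    public static string Take3(string[] lines, int matchLine)
    {
        // The ideal start is the line before the match. Math.Min keeps the
        // window from running past the end, Math.Max from starting before 0.
        int position = Math.Max(0, Math.Min(matchLine - 1, lines.Length - 3));

        return string.Join("\n", lines.Skip(position).Take(3));
    }
}
```

Because Math.Max is applied last, a paragraph with only one or two lines ends up with position 0, and Take(3) simply returns however many lines there are.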

Michael Madsen
This isn't precisely the answer I was looking for, but it's close enough that it got me in the right direction. I especially like this lines.Skip().Take(). That's great!
Keltex
+3  A: 

In my opinion this is an excellent opportunity to use the StringReader class:

  1. Read your text line by line.
  2. Keep your lines in some kind of buffer (e.g., a Queue<string>), dropping lines you don't need after a given number of lines have been read.
  3. Once your "needle" is found, read one more line (if possible) and then just return what's in your buffer.

I think this has some advantages over the other approaches suggested:

  1. Since it doesn't utilize String.Split, it doesn't do more work than you need -- i.e., reading the entire string looking for the characters to split on, and creating an array of the substrings.
  2. In fact, it doesn't necessarily read the entire string at all, since once it finds the text it's looking for it only goes as far as necessary to get the desired number of padding lines.
  3. It could even be refactored (very easily) to be able to deal with any textual input via a TextReader -- e.g., a StreamReader -- so it could even work with huge files, without having to load the entire contents of a given file into memory.

Imagine this scenario: you want to find an excerpt of text from a text file that contains the entire text from a novel. (Not that this is your scenario -- I'm just speaking hypothetically.) Using String.Split would require that the entire text of the novel be split according to the delimiter you specified, whereas using a StringReader (well, in this case, a StreamReader) would only require reading until the desired text was found, at which point the excerpt would be returned.

Again, I realize this isn't necessarily your scenario -- just suggesting that this approach provides scalability as one of its strengths.


Here's a quick implementation:

// rearranged code to avoid horizontal scrolling
public static string FindSurroundingLines
(string haystack, string needle, int paddingLines) {

    if (string.IsNullOrEmpty(haystack))
        throw new ArgumentException("haystack");
    else if (string.IsNullOrEmpty(needle))
        throw new ArgumentException("needle");
    else if (paddingLines < 0)
        throw new ArgumentOutOfRangeException("paddingLines");

    // buffer needs to accommodate paddingLines on each side
    // plus line containing the needle itself, so:
    // (paddingLines * 2) + 1
    int bufferSize = (paddingLines * 2) + 1;

    var buffer = new Queue<string>(/*capacity*/ bufferSize);

    using (var reader = new StringReader(haystack)) {
        bool needleFound = false;

        while (!needleFound && reader.Peek() != -1) {
            string line = reader.ReadLine();

            if (buffer.Count == bufferSize)
                buffer.Dequeue();

            buffer.Enqueue(line);

            needleFound = line.Contains(needle);
        }

        // at this point either the needle has been found,
        // or we've reached the end of the text (haystack);
        // all that's left to do is make sure the string returned
        // includes the specified number of padding lines
        // on either side
        int endingLinesRead = 0;
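        // stop at the end of the text so that ReadLine() never
        // enqueues null when the text is shorter than the buffer
        while (
            reader.Peek() != -1 &&
            (endingLinesRead++ < paddingLines || buffer.Count < bufferSize)
        ) {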
            if (buffer.Count == bufferSize)
                buffer.Dequeue();

            buffer.Enqueue(reader.ReadLine());
        }

        var resultBuilder = new StringBuilder();
        while (buffer.Count > 0)
            resultBuilder.AppendLine(buffer.Dequeue());

        return resultBuilder.ToString();
    }
}

Some example input/output (with text containing your example input):

Code:

Console.WriteLine(FindSurroundingLines(text, "MOUSE", 1));

Output:

This is the 2nd line of BIRD text in the paragraph
This is the 3rd line of MOUSE text in the paragraph
This is the 4th line of DOG text in the paragraph

Code:

Console.WriteLine(FindSurroundingLines(text, "BIRD", 1));

Output:

This is the 1st line of CAT text in the paragraph
This is the 2nd line of BIRD text in the paragraph
This is the 3rd line of MOUSE text in the paragraph

Code:

Console.WriteLine(FindSurroundingLines(text, "DOG", 0));

Output:

This is the 4th line of DOG text in the paragraph

Code:

Console.WriteLine(FindSurroundingLines(text, "This", 2));

Output:

This is the 1st line of CAT text in the paragraph
This is the 2nd line of BIRD text in the paragraph
This is the 3rd line of MOUSE text in the paragraph
This is the 4th line of DOG text in the paragraph
This is the 5th line of RABBIT text in the paragraph
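
The TextReader refactoring mentioned in the list of advantages might look roughly like the sketch below. It is a variation under a few assumptions, not a drop-in replacement: the `ExcerptFinder` class name is made up, it returns null when the needle is never found, and it joins the lines with "\n" without adding a trailing newline.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class ExcerptFinder
{
    // Reads from any TextReader (StringReader, StreamReader, ...) and stops
    // as soon as the needle and its trailing padding lines have been read.
    public static string FindSurroundingLines(
        TextReader reader, string needle, int paddingLines)
    {
        if (reader == null) throw new ArgumentNullException(nameof(reader));
        if (string.IsNullOrEmpty(needle))
            throw new ArgumentException("needle must not be empty", nameof(needle));
        if (paddingLines < 0)
            throw new ArgumentOutOfRangeException(nameof(paddingLines));

        int bufferSize = (paddingLines * 2) + 1;
        var buffer = new Queue<string>(bufferSize);
        string line = null;
        bool found = false;

        // Phase 1: slide the window forward until a line contains the needle.
        while (!found && (line = reader.ReadLine()) != null)
        {
            if (buffer.Count == bufferSize) buffer.Dequeue();
            buffer.Enqueue(line);
            found = line.Contains(needle);
        }

        // Assumption for this sketch: no match means no excerpt.
        if (!found) return null;

        // Phase 2: read the padding after the match, plus extra lines when
        // the match was too close to the start to fill the buffer.
        int linesAfter = 0;
        while ((linesAfter < paddingLines || buffer.Count < bufferSize)
               && (line = reader.ReadLine()) != null)
        {
            if (buffer.Count == bufferSize) buffer.Dequeue();
            buffer.Enqueue(line);
            linesAfter++;
        }

        return string.Join("\n", buffer);
    }
}
```

Passing a `new StringReader(text)` should behave like the string version for the examples above, while passing a StreamReader opened on a file reads only as far into the file as needed.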
Dan Tao