Here is my code:

// Import io so we can use file objects
import java.io.*;

public class SearchThe {
    public static void main(String args[]) {
        try {
            String stringSearch = "the";
            // Open test.txt (relative to the working directory) as a buffered reader
            BufferedReader bf = new BufferedReader(new FileReader("test.txt"));

            // Start a line count and declare a string to hold our current line.
            int linecount = 0;
            String line;

            // Let the user know what we are searching for
            System.out.println("Searching for " + stringSearch + " in file...");

            // Loop through each line, stashing the line into our line variable.
            while (( line = bf.readLine()) != null){
                // Increment the count and find the index of the word
                linecount++;
                int indexfound = line.indexOf(stringSearch);

                // If greater than -1, means we found the word
                if (indexfound > -1) {
                    System.out.println("Word was found at position " + indexfound + " on line " + linecount);
                }
            }

            // Close the file after done searching
            bf.close();
        }
        catch (IOException e) {
            System.out.println("IO Error Occurred: " + e.toString());
        }
    }
}

I want to find the word "the" in the file test.txt. The problem is that once my program finds the first "the", it stops searching.

It also treats a word like "then" as a match for "the".

How can I fix this?

Thank you.

+3  A: 

You shouldn't use indexOf here, because it matches any substring of the line, not just whole words. Since "then" contains the string "the", indexOf reports it as a match too.

More about indexOf:

public int indexOf(String str, int fromIndex) returns the index within this string of the first occurrence of the specified substring, starting at the specified index, or -1 if there is no such occurrence.

Instead, split each line into words, loop over the words, and compare each one to "the".

String [] words = line.split(" ");
for (String word : words) {
  if (word.equals("the")) {
    System.out.println("Found the word");
  }
}

The snippet above also visits every "the" on the line, whereas a single call to indexOf only ever returns the first occurrence.
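
As a variation on the snippet above, here is a minimal sketch that splits on runs of non-word characters instead of a single space and compares case-insensitively, so that "the." and "The" are counted as well. The class name SplitWordCount and the method countWord are made up for illustration, and the sketch assumes plain ASCII text.

```java
public class SplitWordCount {
    // Counts how many tokens on the line equal the target word, ignoring case.
    static int countWord(String line, String target) {
        int count = 0;
        // "\\W+" splits on runs of characters that are not ASCII letters, digits or '_'
        for (String word : line.split("\\W+")) {
            if (word.equalsIgnoreCase(target)) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // "The", "the" and "the." match; "then" does not
        System.out.println(countWord("The cat, then the dog; the.", "the"));
    }
}
```

This only counts matches; it does not report the position of each match within the line.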

vodkhang
This is not an answer. It's a criticism.
Asaph
First, I just tried to find the problem he had, and the indexOf method is the problem. Then I found another, better way to do what he wants. Is anything wrong with that?
vodkhang
Yes - you're point baiting. Write a complete answer before posting.
Michael Shimmins
yeah, sorry for that :)
vodkhang
Splitting on `" "` will not find instances of "the." or "The". You could do many `equals(...)` and `indexOf(...)` tests, but regexes are much more flexible for this.
Chadwick
Ah yeah, special cases like that are possible. For commercial code I think regex would be the best choice, but I am not sure this homework requires that much.
vodkhang
A: 

Regular expressions are the best tool for this kind of search. As a quick and dirty workaround, you could change your stringSearch from

String stringSearch = "the";

to

String stringSearch = " the ";
flash
Doesn't account for the start or the end of the line
Michael Shimmins
This will not work if "the" is at the beginning of the line, end of the line, just before a special character, or uppercase.
Thomas Mueller
A: 

Your current implementation will only find the first instance of 'the' per line.

Consider splitting each line into words, iterating over the list of words, and comparing each word to 'the' instead:

while (( line = bf.readLine()) != null)
{
    linecount++;
    String[] words = line.split(" ");

    for (String word : words)
    {
        if (word.equals(stringSearch))
            System.out.println("Word was found on line " + linecount);
    }
}
Michael Shimmins
+7  A: 

Use a regex, case-insensitively and with word boundaries, to find all instances and variations of "the".

indexOf("the") cannot distinguish between "the" and "then", since each starts with "the". Likewise, "the" is found in the middle of "anathema".

To avoid this, use a regex that searches for "the" with a word boundary (\b) on either side. Word boundaries are preferable to splitting on " " or to indexOf(" the ") (spaces on either side), neither of which finds "the." or other instances next to punctuation. You can also make the search case-insensitive so that it finds "The" as well.

// Requires: import java.util.regex.Matcher; import java.util.regex.Pattern;
Pattern p = Pattern.compile("\\bthe\\b", Pattern.CASE_INSENSITIVE);

while ( (line = bf.readLine()) != null) {
    linecount++;

    Matcher m = p.matcher(line);

    // indicate all matches on the line
    while (m.find()) {
        System.out.println("Word was found at position " + 
                       m.start() + " on line " + linecount);
    }
}
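
For anyone who wants to try the pattern without a file, here is a small self-contained sketch that runs the same regex against a sample string. The class name RegexWordSearch, the method positions and the sample sentence are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegexWordSearch {
    // Whole-word, case-insensitive match for "the"
    private static final Pattern THE =
            Pattern.compile("\\bthe\\b", Pattern.CASE_INSENSITIVE);

    // Returns the start index of every whole-word match on the line.
    static List<Integer> positions(String line) {
        List<Integer> result = new ArrayList<>();
        Matcher m = THE.matcher(line);
        while (m.find()) {
            result.add(m.start());
        }
        return result;
    }

    public static void main(String[] args) {
        // "anathema" and "then" are skipped; prints [0, 19]
        System.out.println(positions("The anathema, then the end."));
    }
}
```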
Chadwick
+1 for regex use, much better than the other 'split' options (including mine).
Michael Shimmins
A: 

It doesn't sound like the point of the exercise is to skill you up in regular expressions (it may be, I don't know, but it seems a little basic for that), even though regexes would indeed be the real-world solution to things like this.

My advice is to focus on the basics: use indexOf and substring to test the string. Think about how you could account for the fact that string comparison is case-sensitive. Also, does your reader always get closed (i.e. is there a way bf.close() wouldn't be executed)?
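
One possible way to follow that advice, as a rough sketch rather than the intended homework solution: call indexOf(str, fromIndex) repeatedly, check the characters on either side of each hit, lower-case both strings first, and let try-with-resources (Java 7 and later) close the reader even when readLine throws. The class name IndexOfWordSearch and the helpers isBoundary and findWord are made up for illustration; a StringReader stands in for the file so the example runs on its own, and the lower-casing assumes plain ASCII text.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

public class IndexOfWordSearch {
    // True if index i is outside the line or holds a character that is not a letter or digit.
    static boolean isBoundary(String line, int i) {
        return i < 0 || i >= line.length() || !Character.isLetterOrDigit(line.charAt(i));
    }

    // Returns the start index of every whole-word, case-insensitive match.
    static List<Integer> findWord(String line, String word) {
        List<Integer> positions = new ArrayList<>();
        if (word.isEmpty()) {
            return positions; // nothing to search for
        }
        String lowerLine = line.toLowerCase();
        String lowerWord = word.toLowerCase();
        int from = 0;
        int idx;
        // Keep searching from just past the previous hit until indexOf returns -1
        while ((idx = lowerLine.indexOf(lowerWord, from)) != -1) {
            if (isBoundary(lowerLine, idx - 1)
                    && isBoundary(lowerLine, idx + lowerWord.length())) {
                positions.add(idx);
            }
            from = idx + 1;
        }
        return positions;
    }

    public static void main(String[] args) throws IOException {
        // try-with-resources closes the reader on both normal and exceptional exit
        try (BufferedReader reader =
                     new BufferedReader(new StringReader("The cat\nthen the dog"))) {
            String line;
            int lineCount = 0;
            while ((line = reader.readLine()) != null) {
                lineCount++;
                for (int pos : findWord(line, "the")) {
                    System.out.println("Word was found at position " + pos
                            + " on line " + lineCount);
                }
            }
        }
    }
}
```

To read the real file, the StringReader could be swapped for new FileReader("test.txt"), as in the question.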

CurtainDog