I'm reading a file using a BufferedReader, so let's say I have

line = br.readLine();

I want to check whether this line contains one of many possible strings (which I have in an array). I would like to be able to write something like:

while (!line.matches(stringArray)) { // not sure how to write this conditional
  do something here;
  line = br.readLine();
}

I'm fairly new to programming and Java, am I going about this the right way?

+2  A: 

Copy all values into a Set<String> and then use contains():

Set<String> set = new HashSet<String> (Arrays.asList (stringArray));
while (!set.contains(line)) { ... }

[EDIT] If you want to find out if a part of the line contains a string from the set, you have to loop over the set. Replace set.contains(line) with a call to:

public boolean matches(Set<String> set, String line) {
    for (String check: set) {
        if (line.contains(check)) return true;
    }
    return false;
}

Adjust the check accordingly when you use regexp or a more complex method for matching.
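To show how the helper fits into the read loop from the question, here is a minimal sketch. The class name LineScanner, the method names, and the sample input are made up for illustration; a StringReader stands in for the real file so the example runs on its own.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.Set;

public class LineScanner {

    // Returns true if the line contains at least one of the candidate strings.
    public static boolean containsAny(Set<String> candidates, String line) {
        for (String candidate : candidates) {
            if (line.contains(candidate)) {
                return true;
            }
        }
        return false;
    }

    // Reads lines until one contains a candidate string. Returns that line,
    // or null if the input ends before any line matches.
    public static String firstMatchingLine(BufferedReader reader, Set<String> candidates)
            throws IOException {
        String line;
        while ((line = reader.readLine()) != null) {
            if (containsAny(candidates, line)) {
                return line;
            }
            // a non-matching line would be processed here
        }
        return null;
    }

    public static void main(String[] args) throws IOException {
        String[] stringArray = { "error", "fatal" }; // hypothetical search strings
        Set<String> candidates = new LinkedHashSet<String>(Arrays.asList(stringArray));
        BufferedReader reader = new BufferedReader(
                new StringReader("all good\nstill fine\nfatal: disk full\nignored"));
        System.out.println(firstMatchingLine(reader, candidates)); // prints "fatal: disk full"
    }
}
```

Checking readLine() for null inside the loop condition matters: without it, a file in which no line matches would end in a NullPointerException.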

[EDIT2] A third option is to join the elements of the array into one large regexp, separated by |:

Pattern p = Pattern.compile("str1|str2|str3");

while (!p.matcher(line).find()) { // or matches for a whole-string match
    ...
}

This can be cheaper if you have many elements in the array, since the regexp code will optimize the matching process.
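If the array is only known at runtime, the pattern has to be assembled in code. A rough sketch, assuming the array entries are meant as literal text rather than as regexps: each entry is passed through Pattern.quote() so that characters like . or | are not interpreted as regexp syntax. The class name AlternationPattern and the method anyOf are made up for this example.

```java
import java.util.regex.Pattern;

public class AlternationPattern {

    // Builds a pattern that finds any of the given strings as literal text.
    // Assumes the array is non-empty; an empty array yields an empty regexp,
    // which matches every line.
    public static Pattern anyOf(String[] literals) {
        StringBuilder regex = new StringBuilder();
        for (String literal : literals) {
            if (regex.length() > 0) {
                regex.append('|');
            }
            // Pattern.quote wraps the text so regexp metacharacters lose their meaning.
            regex.append(Pattern.quote(literal));
        }
        return Pattern.compile(regex.toString());
    }

    public static void main(String[] args) {
        String[] stringArray = { "a.b", "x|y" }; // hypothetical search strings
        Pattern p = anyOf(stringArray);
        System.out.println(p.matcher("found a.b here").find()); // true
        System.out.println(p.matcher("found aXb here").find()); // false: '.' is literal
        System.out.println(p.matcher("only x here").find());    // false: "x|y" is one literal
    }
}
```

Compile the pattern once, outside the read loop, and reuse it for every line.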

Aaron Digulla
Correct me if I'm wrong, please, but doesn't this check whether the array contains the line being read from the file, instead of checking whether the line contains one of the strings from the array?
karunga
You're absolutely right; see my edits for better solutions.
Aaron Digulla
Regex is then dependent on the contents of the String array. Could be considered a security flaw, depending on what that condition is you are checking.
Daniel Schneller
@Daniel: There is no information about where that data comes from, so I can't comment on that.
Aaron Digulla
Thanks guys for your input. I don't understand some of the syntax so I've got a bit of reading to do.
karunga
You could escape the literal strings with `Pattern.quote()` first.
wds
+1  A: 

It depends on what stringArray is. If it's a Collection then fine. If it's a true array, you should make it a Collection. The Collection interface has a method called contains() that will determine if a given Object is in the Collection.

Simple way to turn an array into a Collection:

String tokens[] = { ... };
List<String> list = Arrays.asList(tokens);

The problem with a List is that lookup is expensive (technically linear or O(n)). A better bet is to use a Set, which is unordered but has near-constant (O(1)) lookup. You can construct one like this:

From a Collection:

Set<String> set = new HashSet<String>(stringList);

From an array:

Set<String> set = new HashSet<String>(Arrays.asList(stringArray));

and then set.contains(line) will be a cheap operation.

Edit: Ok, I think your question wasn't clear. You want to see if the line contains any of the words in the array. What you want then is something like this:

BufferedReader in = null;
Set<String> words = ... // construct this as per above
try {
  in = ...
  String line; // must be declared outside the loop condition
  while ((line = in.readLine()) != null) {
    for (String word : words) {
      if (line.contains(word)) {
        // do whatever
      }
    }
  }
} catch (Exception e) {
  e.printStackTrace();
} finally {
  if (in != null) { try { in.close(); } catch (Exception e) { } }
}

This is quite a crude check, which is used surprisingly often and tends to give annoying false positives on words like "scrap". For a more sophisticated solution you will probably have to use a regular expression and look for word boundaries:

Pattern p = Pattern.compile("(?<=\\b)" + word + "(?=\\b)"); // note the doubled backslash in both places
Matcher m = p.matcher(line);
if (m.find()) {
  // word found
}

You will probably want to do this more efficiently (like not compiling the pattern with every line) but that's the basic tool to use.
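One way to avoid recompiling for every line is to build a single pattern for all the words up front. This is only a sketch under some assumptions: the words are literal text (hence Pattern.quote()), the array is non-empty, and each word starts and ends with a letter, digit or underscore so that \b behaves as expected. The names WordMatcher and wholeWords are made up for this example.

```java
import java.util.regex.Pattern;

public class WordMatcher {

    // Compiles one pattern that matches any of the words, but only as a whole word.
    public static Pattern wholeWords(String[] words) {
        StringBuilder alternatives = new StringBuilder();
        for (String word : words) {
            if (alternatives.length() > 0) {
                alternatives.append('|');
            }
            alternatives.append(Pattern.quote(word)); // treat each word as literal text
        }
        // \b on both sides of the group requires a word boundary around the match.
        return Pattern.compile("\\b(?:" + alternatives + ")\\b");
    }

    public static void main(String[] args) {
        Pattern p = wholeWords(new String[] { "cat", "dog" }); // hypothetical word list
        System.out.println(p.matcher("the cat sat").find());   // true
        System.out.println(p.matcher("concatenate").find());   // false: "cat" is inside a word
        System.out.println(p.matcher("hotdog stand").find());  // false: no boundary before "dog"
    }
}
```

The compiled Pattern can be kept in a field or local variable and reused with p.matcher(line) for each line read.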

cletus
Mind the security implications of the regex approach - depending on where the strings in the array come from, this might disrupt the regex and allow injection of arbitrary conditions.
Daniel Schneller
A: 

Using the String.matches(regex) function, what about creating a regular expression that matches any one of the strings in the string array? Something like

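String regex = ".*("; // matches() tests the whole line, so allow any text around the group
for (int i = 0; i < array.length - 1; ++i)
  regex += array[i] + "|";
regex += array[array.length - 1] + ").*"; // last valid index is length - 1
while (line.matches(regex))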
{
  //. . . 
}
Adrian Park
-1 This is error prone. Depending on the contents of the String array your regular expression changes. Could even be considered a security flaw.
Daniel Schneller
The array was never stated to be a static array. How do you suggest implementing a regular expression that matches a dynamic array, that "does not change"?
Adrian Park