OK, I've got a long list (about 43,800 lines) of entries and I want to select a subset of this list. The list is in a database and the program that reads the database can use one or more regular expressions.

Is it better to use many regular expressions (possibly up to 100 or so) in one call to the database program, so that the call itself selects essentially just the entries I want,

OR

would it be better to use a much smaller number of regular expressions (up to about 10) to select many more entries into an array, and then use a for loop and some simple if statements to filter out the data I'm not interested in?

I've read that it's best to avoid overusing regular expressions because they are computationally expensive. Is that true here?

+2  A: 

I would recommend the second approach, especially if you can find a way to fork the regex searches into different threads and get some concurrency going. Divide and conquer, baby!
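To make the second approach concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration: the sample entries, the two broad patterns, the `select_entries` helper, and the `keep` check are all hypothetical, and in practice the broad patterns would be handed to your database program rather than applied with `re`. It leaves out the threading part.

```python
import re

def select_entries(entries, broad_patterns, keep):
    """Coarse regex pass first, then a cheap plain-Python filter."""
    compiled = [re.compile(p) for p in broad_patterns]
    # Pass 1: a small number of broad patterns pulls in a superset.
    candidates = [e for e in entries if any(rx.search(e) for rx in compiled)]
    # Pass 2: simple checks discard the entries we are not interested in.
    return [e for e in candidates if keep(e)]

# Hypothetical sample data standing in for the real database entries.
entries = ["apple-001", "apple-002", "banana-001", "cherry-001"]
selected = select_entries(
    entries,
    broad_patterns=[r"^apple", r"^banana"],
    keep=lambda e: e.endswith("-001"),
)
print(selected)  # ['apple-001', 'banana-001']
```

The point of the split is that the second pass uses plain string tests, which are easier to read and maintain than another 90 patterns.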

KevDog
+2  A: 

KevDog's answer seems like the right way forward.

I just wanted to add that you might want to try both and measure the performance difference. That way, you'll eliminate all of the guesswork and it will naturally take into account any special cases that you might have that we can't possibly know.
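As a rough idea of what measuring could look like, here is a small Python sketch using `timeit`. The data is synthetic and the patterns are made up, so the numbers it prints say nothing about your database program; the function names `many_regexes` and `few_regexes_then_filter` are hypothetical. Swap in your real calls and real patterns before drawing any conclusion.

```python
import re
import timeit

# Synthetic stand-in for the ~43,800 entries (an assumption, not real data).
entries = [f"item-{i:05d}-{'a' if i % 3 == 0 else 'b'}" for i in range(43800)]

# Approach 1: several specific patterns that together select exactly what we want.
specific = [re.compile(rf"^item-{d}\d{{4}}-a$") for d in range(5)]

def many_regexes(data):
    return [e for e in data if any(rx.search(e) for rx in specific)]

# Approach 2: one broad pattern, then a simple check in a loop.
broad = re.compile(r"^item-\d{5}-")

def few_regexes_then_filter(data):
    result = []
    for e in data:
        if broad.search(e) and e.endswith("-a"):
            result.append(e)
    return result

by_regex = many_regexes(entries)
by_filter = few_regexes_then_filter(entries)
# Make sure both approaches agree before comparing their speed.
assert by_regex == by_filter

t_regex = timeit.timeit(lambda: many_regexes(entries), number=3)
t_filter = timeit.timeit(lambda: few_regexes_then_filter(entries), number=3)
print(f"many regexes:         {t_regex:.3f} s")
print(f"few regexes + filter: {t_filter:.3f} s")
```

Checking that both versions return the same entries first matters as much as the timing, since a faster approach that selects the wrong subset is no use.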

dominic hamon