ansaurus

Question

Is sequential strpos() faster than a function with one preg_match()?

Answer 1

+5 A:

The correct syntax is:

preg_match('/hello|i am|dumb/', $ohreally);

I doubt there's much in it either way, but it wouldn't surprise me if the strpos() method is faster, depending on the number of strings you're searching for. The performance of strpos() will degrade as the number of search terms increases. The regex's probably will too, but not as fast.

Obviously regular expressions are more powerful. For example, if you wanted to match the word "dumb" but not "dumber", then that's easily done with:

preg_match('/\b(hello|i am|dumb)\b/', $ohreally);

which is a lot harder to do with strpos().

Note: technically \b is a zero-width word boundary. "Zero-width" means it doesn't consume any part of the input string. "Word boundary" means it matches at a transition from word characters (digits, letters or underscore) to non-word characters, or from non-word to word characters; the start and the end of the string count as boundaries when the adjacent character is a word character. Very useful.
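To make the word-boundary behaviour concrete, here is a minimal sketch. The sample strings are made up for illustration and are not from the question.

```php
<?php
// Hypothetical sample inputs, chosen only to illustrate \b.
$bounded   = '/\b(hello|i am|dumb)\b/';
$unbounded = '/hello|i am|dumb/';

// "dumb" stands alone as a word, so the bounded pattern matches.
var_dump(preg_match($bounded, 'that was dumb'));

// "dumb" is followed by the word character "e", so \b fails after it.
var_dump(preg_match($bounded, 'even dumber'));

// Without \b the substring "dumb" inside "dumber" still matches.
var_dump(preg_match($unbounded, 'even dumber'));
```

preg_match() returns 1 on a match and 0 on no match, so the three calls should report a match, no match and a match respectively.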

Edit: it's also worth noting that your usage of strpos() is incorrect (but lots of people make this same mistake). Namely:

if (strpos ($ohreally, 'hello')) {
  ...
}

will not enter the condition block if the needle is at position 0 in the string. The correct usage is:

if (strpos ($ohreally, 'hello') !== false) {
  ...
}

This is because of type juggling: without the strict !== comparison, a return value of 0 (a match at the very start of the string) is converted to false.
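As a rough sketch of how the sequential check could be written with the strict comparison, consider the following. The helper name containsAny() and the sample strings are assumptions for illustration, not code from the question, and the type declarations assume PHP 7 or later.

```php
<?php
// containsAny() is a hypothetical helper name, not a PHP built-in.
function containsAny(string $haystack, array $needles): bool
{
    foreach ($needles as $needle) {
        // strpos() returns an integer offset or false, so compare strictly:
        // an offset of 0 must still count as "found".
        if (strpos($haystack, $needle) !== false) {
            return true; // stop at the first needle that is present
        }
    }
    return false;
}

$needles = ['hello', 'i am', 'dumb'];

// The pitfall: the needle is at offset 0, and 0 casts to false.
var_dump((bool) strpos('hello world', 'hello'));

// The strict check handles offset 0 correctly.
var_dump(containsAny('hello world', $needles));
var_dump(containsAny('nothing here', $needles));
```

Because the loop returns on the first hit, putting the most likely needle first in the array reduces the number of strpos() calls on average.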

cletus 2010-01-19 12:58:10

Thank you for this, I'm going to do a microtime() test on your corrected code and report back with the results : )

Mohammad 2010-01-19 13:19:08

Thank you for reminding me of the strpos() caveat, I had forgotten it! I'll have to look through my code and see if any fixes are needed. The preg_match() code you provided works fine! Thank you so much for all the information, it proved very useful :D I edited the answer above to reflect the test I did, which supports your statement about strpos() degrading as the number of search terms goes up.

Mohammad 2010-01-19 15:14:48

Answer 2

+2 A:

Crazy idea, but why not run both versions 'n' thousand times in two separate loops, each wrapped in microtime() calls, and print the timings?

Based on the above code (with a few corrections) for 1,000 iterations, I get something like:

strpos test: 0.003315
preg_match test: 0.014241

As such, in this instance (with the limitations outlined by others) strpos indeed seems faster, albeit by a largely meaningless amount. (The joy of pointless micro-optimisation, etc.)

Never estimate what you can measure.
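A minimal sketch of such a timing harness might look like the following. The test string and the iteration count are assumptions, and this is not necessarily the exact code that produced the figures above; use representative data of your own.

```php
<?php
// Hypothetical test string and iteration count; substitute representative data.
$ohreally   = 'well, i am not sure that was dumb';
$iterations = 1000;

// Time the sequential strpos() checks (with the strict !== false comparison).
$start = microtime(true);
for ($i = 0; $i < $iterations; $i++) {
    $found = strpos($ohreally, 'hello') !== false
        || strpos($ohreally, 'i am') !== false
        || strpos($ohreally, 'dumb') !== false;
}
$strposTime = microtime(true) - $start;

// Time the single preg_match() call with an alternation.
$start = microtime(true);
for ($i = 0; $i < $iterations; $i++) {
    $matched = preg_match('/hello|i am|dumb/', $ohreally) === 1;
}
$pregTime = microtime(true) - $start;

printf("strpos test: %f\n", $strposTime);
printf("preg_match test: %f\n", $pregTime);
```

Passing true to microtime() returns the time as a float, which keeps the subtraction simple. Timings this small vary from run to run, so repeat the test a few times before drawing conclusions.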

middaparka 2010-01-19 13:03:28

Thank you so much, you are correct. strpos() is usually the winner! Especially in my case, where I actually know how likely each needle is to occur in the string: by placing the most probable one first, I gain the big advantage of reducing the number of times strpos() is actually called :] Please read the update I have posted in the question, which reflects my test results. And thank you!

Mohammad 2010-01-19 15:17:16

Answer 3

+1 A:

It depends on the number of strings you want to look for and the length of the string you are searching.

You'd need to experiment with a representative data set to find out which is faster (repeat the operation, say, 1000 times and measure the elapsed time).

BTW - I think the regex you are looking for is '(hello|i am|dumb)'

Also, your code is more verbose than it needs to be:

return strpos($ohreally, 'hello') || strpos($ohreally, 'i am') || strpos($ohreally, 'dumb');

or

return preg_match('(hello|i am|dumb)',$ohreally);

Also, by all the usual coding standards, there should not be a space between the function name and the bracket.

C.

symcbean 2010-01-19 13:03:48

You have a couple of errors, namely the OP's same problem of checking `strpos()` results (see my answer) and you're not delimiting your regex in `preg_match()`.

cletus 2010-01-19 13:08:02
