Pretty simple, but I can't get the exact syntax working.

I just want a true or false check to see if a string begins with 'for the' (case insensitive).

Thanks everyone.

A: 

If you had read the first example in the documentation you would have seen the answer.

if ( preg_match('/^for the/i', $sentence) )
{
    // a match was found
}
TravisO
+10  A: 

If it's just that, then you could use plain text searching:

if (stripos($text, "for the") === 0) { // case-insensitive here
    // string starts with "for the"
}

Or,

if (strtolower(substr($text, 0, 7)) === "for the") // lowercased first, so this is case-insensitive too
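
Another prefix-only check, not part of the original answer, is strncasecmp, the case-insensitive sibling of the strncmp function benchmarked below. A minimal sketch, assuming a hypothetical helper name startsWithIgnoreCase:

```php
<?php
// Hypothetical helper; the name is an assumption, not from the thread.
// strncasecmp() compares at most $length bytes and ignores ASCII case,
// so it never reads past the prefix, however long $text is.
function startsWithIgnoreCase(string $text, string $prefix): bool
{
    return strncasecmp($text, $prefix, strlen($prefix)) === 0;
}

var_dump(startsWithIgnoreCase("For the love of Mike", "for the")); // bool(true)
var_dump(startsWithIgnoreCase("Not for the faint", "for the"));    // bool(false)
```

Because the comparison stops after strlen($prefix) bytes, the cost should not depend on the length of $text.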

The comments below got me wondering about which is actually faster, so I wrote some benchmarking.

Here's the TLDR version:

  • strpos is really fast if you're not working with large strings.
  • strncmp is reliable and fast.
  • preg_match is never a good option.

Here's the long version:

  • Two different "haystacks":
    1. 10000 characters of lipsum
    2. 83 characters of lipsum.
  • 5 different searching methods:
    1. strpos:
      return strpos($haystack, $needle) === 0
    2. preg_match
      return preg_match("/^$needle/", $haystack) === 1
    3. substr
      return substr($haystack, 0, strlen($needle)) === $needle
    4. strncmp
      return strncmp($needle, $haystack, strlen($needle)) === 0
    5. Manual looping:
      for ($i = 0, $l = strlen($needle); $i < $l; ++$i) {
          if ($needle[$i] !== $haystack[$i]) return false;
      }
      return true;
  • 7 different "needles"
    • 3 matching (lengths: 83, 5 and 1 character)
    • 4 non-matching (lengths: 83, 82, 5 and 1 characters). The 82-character needle doesn't match at all, and the 83-character needle matches all but the last character.
  • 100,000 iterations, per method per needle per haystack
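
The setup above can be approximated with a small harness. This is a sketch under stated assumptions, not the code used for the numbers below: the haystack and needles are made-up stand-ins for the lipsum text, timeIt is a hypothetical helper, and the preg_match variant adds preg_quote so that arbitrary needles are safe to embed in the pattern.

```php
<?php
// The five prefix checks described above, keyed by name.
$methods = [
    'strpos'     => fn(string $h, string $n): bool => strpos($h, $n) === 0,
    // preg_quote() is an addition here; the benchmarked version interpolated $needle directly.
    'preg_match' => fn(string $h, string $n): bool => preg_match('/^' . preg_quote($n, '/') . '/', $h) === 1,
    'substr'     => fn(string $h, string $n): bool => substr($h, 0, strlen($n)) === $n,
    'strncmp'    => fn(string $h, string $n): bool => strncmp($n, $h, strlen($n)) === 0,
    'manual'     => function (string $h, string $n): bool {
        for ($i = 0, $l = strlen($n); $i < $l; ++$i) {
            // isset() guards against a needle longer than the haystack.
            if (!isset($h[$i]) || $n[$i] !== $h[$i]) return false;
        }
        return true;
    },
];

// Assumed inputs: a repeated phrase instead of real lipsum text.
$haystack   = str_repeat('lorem ipsum dolor sit amet ', 400);
$needles    = ['match' => 'lorem', 'non-match' => 'ipsum'];
$iterations = 100000;

// Hypothetical helper: wall-clock seconds for $iterations calls of $fn.
function timeIt(callable $fn, string $haystack, string $needle, int $iterations): float
{
    $start = microtime(true);
    for ($i = 0; $i < $iterations; ++$i) {
        $fn($haystack, $needle);
    }
    return microtime(true) - $start;
}

foreach ($methods as $name => $fn) {
    foreach ($needles as $label => $needle) {
        printf("%-10s %-9s %.4f s\n", $name, $label, timeIt($fn, $haystack, $needle, $iterations));
    }
}
```

Absolute timings will vary with the PHP version and hardware, so only the relative ordering between methods is worth comparing.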

Interesting points:

  • The fastest individual test of all was strpos on the long, entirely non-matching needle against the short haystack.
    • In fact, of the 14 tests run per method, strpos recorded the 11 fastest times overall.
  • The slowest individual test was the manual method on the long needles, regardless of haystack size. Those four tests were 10-20 times slower than almost all the other tests.
  • Though strpos had the best performance, it was weighed down by the long non-matching needles on the long haystack. They were 5-10 times slower than most tests.
  • strncmp was fast and the most consistent.
  • preg_match was consistently about twice as slow as the other built-in functions.

Haystack: 83 characters
              ______________________________________________________________
 ____________|__________ non-matching ___________|_______  matching ________|
| function   |   1    |   5    |   82   |   83   |   1    |   5    |   83   |
|------------+--------+--------+--------+--------+--------+--------+--------|
| manual     | 0.2291 | 0.2222 | 0.2266 | 4.1523 | 0.2337 | 0.4263 | 4.1972 |
| preg_match | 0.3622 | 0.3792 | 0.4098 | 0.4656 | 0.3642 | 0.3694 | 0.4658 |
| strncmp    | 0.1860 | 0.1918 | 0.1881 | 0.1981 | 0.1841 | 0.1857 | 0.1980 |
| strpos     | 0.1596 | 0.1633 | 0.1537 | 0.1560 | 0.1571 | 0.1589 | 0.1681 |
| substr     | 0.2052 | 0.2066 | 0.2009 | 0.2166 | 0.2061 | 0.2017 | 0.2236 |
-----------------------------------------------------------------------------

Haystack: 10000 characters
              ______________________________________________________________ 
 ____________|__________ non-matching ___________|_______  matching ________|
| function   |   1    |   5    |   82   |   83   |   1    |   5    |   83   |
|------------+--------+--------+--------+--------+--------+--------+--------|
| manual     | 0.2275 | 0.2249 | 0.2278 | 4.1507 | 0.2315 | 0.4233 | 4.1834 |
| preg_match | 0.3597 | 0.3628 | 0.4147 | 0.4654 | 0.3662 | 0.3679 | 0.4684 |
| strncmp    | 0.1886 | 0.1914 | 0.1835 | 0.2014 | 0.1851 | 0.1854 | 0.1989 |
| strpos     | 0.1605 | 2.1877 | 2.3737 | 0.5933 | 0.1575 | 0.1597 | 0.1667 |
| substr     | 0.2073 | 0.2085 | 0.2017 | 0.2152 | 0.2036 | 0.2090 | 0.2183 |
-----------------------------------------------------------------------------
nickf
+1, regular expressions are generally more resource intensive than strpos() and related.
Adam Backstrom
This was my first instinct as well, but it's pretty inefficient for large strings. Even if $text doesn't start with "for the" it'll keep searching the rest of the string, which could be thousands of bytes.
Jordan
@Jordan, isn't the same true of regexes?
Andy E
+1 for reminding that, in simple cases, it's preferable to use the str_ functions rather than the preg_ ones. It's simpler, more readable, and more efficient.
Adriano Varoli Piazza
@Jordan: substr won't do that.
Adriano Varoli Piazza
@Andy E: It might not be true of regexes if the pattern is anchored to the beginning of the string with ^ - you would hope it would give up more quickly than stripos.
Tom Haigh
+1 for sheer effort, plus the information is good too!
Doug Neiner
+1 thorough answer nick!
alex
A: 

the regex is /^for the/i

ghostdog74
A: 

How about

if(preg_match("/^for the/i", $yourString))
{
    return true;
}
else
{
    return false;
}

Note that ^ matches the start of the string, and the i modifier makes the match case insensitive.

jakenoble
+3  A: 

You want to use ^ to signify the beginning of a string:

$string_one = "For the love of Mike";
$string_two = "for the amazing reason.";

$match = preg_match("/^for the/i", $string_one); // $match is 1
$match = preg_match("/^for the/i", $string_two); // $match is 1

The i after the closing delimiter is the modifier that makes the search case insensitive.
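
If the prefix comes from a variable rather than a literal, it may contain characters that are special in a regex. A minimal sketch of one way to handle that, assuming a hypothetical helper name regexStartsWith:

```php
<?php
// Hypothetical helper; the name is an assumption, not from the thread.
// preg_quote() escapes regex metacharacters in $prefix and, given '/' as
// its second argument, the pattern delimiter as well.
function regexStartsWith(string $subject, string $prefix): bool
{
    $pattern = '/^' . preg_quote($prefix, '/') . '/i';
    return preg_match($pattern, $subject) === 1;
}

var_dump(regexStartsWith("For the love of Mike", "for the")); // bool(true)
var_dump(regexStartsWith("a.b test", "A.B"));                  // bool(true)
var_dump(regexStartsWith("axb test", "a.b"));                  // bool(false)
```

Without the quoting step, the "." in the last call would match any character and the check would wrongly succeed.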

Doug Neiner