Hi, I have the following example string:

PASSWORD = dsl3£$Roid
dfdfdf
fgdgfdfg
fgdfgdfg
dfgdfg
dfgdfgdfg

It's stored in a MySQL TEXT field.

I need to get the "password", i.e. dsl3£$Roid

It's preceded by PASSWORD = and followed by a \n line break. I can use nl2br to make the string look like this:

PASSWORD = dsl3£$Roid<br>
dfdfdf<br>
fgdgfdfg<br>
fgdfgdfg<br>
dfgdfg<br>
dfgdfgdfg<br>

I've been fighting with preg_match all day with no luck. I currently have 10k+ rows, each with a unique password, and need some code to pull out just the password.

Many thanks for any help!

+1  A: 

Don't use nl2br; it's unnecessary.

$string = ...;
preg_match('/^PASSWORD = (.*)$/m', $string, $matches);

The result will be in $matches[1]. This allows the password to be on any line. If you only want to match it on the first line, use:

preg_match('/^PASSWORD = (.*)/', $string, $matches);

See preg_match and the meaning of the modifiers.
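For anyone who wants to try this, here is a minimal sketch of the multi-line variant run against the sample from the question. The `$string` value is a stand-in for the database field, and the `rtrim` call is an added precaution (not part of the answer) in case some rows use \r\n line endings.

```php
<?php
// Stand-in for the MySQL TEXT value shown in the question.
$string = 'PASSWORD = dsl3£$Roid' . "\n" . 'dfdfdf' . "\n" . 'fgdgfdfg' . "\n";

// With the m modifier, ^ and $ match at line boundaries, so the
// PASSWORD line may sit anywhere in the text.
$password = null;
if (preg_match('/^PASSWORD = (.*)$/m', $string, $matches) === 1) {
    // Assumption: strip a trailing "\r" in case the row uses "\r\n" endings.
    $password = rtrim($matches[1], "\r");
}

echo $password, "\n";
```

Checking the return value of preg_match before reading $matches[1] avoids a notice on rows that have no PASSWORD line.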

Artefacto
Thanks Artefacto, it gives me the same result I have had all day: `$string = htmlentities(nl2br($client_password_dirty)); preg_match_all('/^PASSWORD = (.*)$/im', $string, $matches); echo $matches[1];` The output just shows "Array"... any suggestions?
Nick
@Nick Do it before the htmlentities and nl2br; otherwise, even if you change the regex, you may not get the result you want. See http://codepad.viper-7.com/FAjOFq
Artefacto
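A rough sketch of what is probably going on in the snippet from the comment above: preg_match_all returns nested arrays, so `$matches[1]` is itself an array and echo prints "Array". The variable names below are made up for illustration, and the explicit `ENT_QUOTES` and `'UTF-8'` arguments are assumptions to keep the behaviour the same across PHP versions.

```php
<?php
// Stand-in for the raw field value.
$raw = 'PASSWORD = dsl3£$Roid' . "\n" . 'dfdfdf' . "\n";

// preg_match_all collects every match, so $all[1] is an array of captures.
preg_match_all('/^PASSWORD = (.*)$/im', $raw, $all);
$firstFromAll = $all[1][0];

// preg_match stops at the first match, so $one[1] is a plain string.
preg_match('/^PASSWORD = (.*)$/im', $raw, $one);
$fromOne = $one[1];

// Escaping before matching changes the text: nl2br inserts "<br />" and
// htmlentities then encodes it, so the capture is no longer the raw password.
$escaped = htmlentities(nl2br($raw), ENT_QUOTES, 'UTF-8');
preg_match('/^PASSWORD = (.*)$/im', $escaped, $dirty);
$fromEscaped = $dirty[1];

var_dump($firstFromAll, $fromOne, $fromEscaped);
```

Extracting from the raw value first and escaping only at output time keeps the stored password intact.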
+1  A: 

It's possible to do this entirely with MySQL:

SELECT SUBSTR(field, LOCATE('= ', field) + 2, LOCATE('\n', field) - LOCATE('= ', field) - 2) AS password FROM table;

You can also create a view that has the password as an additional column:

CREATE VIEW table2 AS SELECT *, SUBSTR(field, LOCATE('= ', field) + 2, LOCATE('\n', field) - LOCATE('= ', field) - 2) AS password FROM table;

Now you can SELECT password FROM table2; to get the field directly.

However, if you control the schema of that table, you should really reconsider how you're storing the data. You should almost never need to pull something from a database and then extract substrings from it; just store that data in separate fields.
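To make the offset arithmetic in the query easier to follow, here is the same calculation mirrored in PHP with strpos and substr. This is only an illustration: `$field` is a stand-in for the column value, and like the query it assumes the first "= " and the first "\n" both belong to the PASSWORD line.

```php
<?php
// Stand-in for the value of the column called `field` in the query.
$field = 'PASSWORD = dsl3£$Roid' . "\n" . 'dfdfdf' . "\n";

// strpos is 0-based while LOCATE is 1-based, but the length is a
// difference of two positions, so the offset cancels out.
$start = strpos($field, '= ');
$end   = strpos($field, "\n");

$password = null;
if ($start !== false && $end !== false && $end > $start) {
    $start += 2; // skip past "= ", like the "+ 2" in the query
    $password = substr($field, $start, $end - $start);
}

echo $password, "\n";
```

If the PASSWORD line is not the first line of the text, both this sketch and the query would pick up the wrong span.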

Michael Mrozek
Interesting. I had considered doing this in the query, but I want to keep DB calls to a minimum.
Nick
@Nick It should be the same number of queries; you're either selecting 'field' or 'password' (or both, but it's still one query).
Michael Mrozek
Aha, indeed, but we already have the data from our main query, which pulls other info.
Nick
+1  A: 

Not that hard.

/PASSWORD\s=\s(?P<PASSWORD>[^\n]+)?\n/msU

The match will contain 'PASSWORD' if it's found in the text:

$text = YOUR_BLOB_HERE;

$match = array();
$count = preg_match('/PASSWORD\s=\s(?P<PASSWORD>[^\n]+)?\n/msU', $text, $match);

print_r($match);

/*  output:

Array
(
    [0] => PASSWORD = dsl3£$Roid

    [PASSWORD] => dsl3£$Roid
    [1] => dsl3£$Roid
)

*/
Kris
No nl2br needed. You should do this in MySQL as Michael Mrozek suggests, but I did the regex thing because `regex` was in the actual question.
Kris
Yes! Thank you! FYI, here is my (now) working code: `$text = $password_dirty; $match = array(); $count = preg_match('/PASSWORD\s=\s(?P<PASSWORD>[^\n]+)?\n/msU', $text, $match); $clean_password = $match[1];`
Nick
There's a lot of excess baggage in that regex. The `?` after the capturing group makes it optional, and I don't see that among the requirements. The `m` modifier changes the meaning of the anchors, `^` and `$`, but you aren't using them. Similarly, `s` changes the meaning of `.`, but you aren't using that either. And `U` makes quantifiers non-greedy by default, which is not helpful in this case. The `U` modifier should be avoided in any case; it's supposed to make regexes more readable, but on the whole it has the opposite effect.
Alan Moore
To continue Alan Moore's comment... The matching of literal `PASSWORD` should be anchored to the beginning of the subject string, as per the question. The `\s` matches more than the space character. The named capturing group is entirely superfluous and may cloud the simple task that is being performed, particularly for folks not versed in its use. Matching non-newline characters followed by a new line might as well have been replaced by `.+` assuming the needless `s` modifier was dropped.
salathe
@Alan Moore: Technically you are correct. However, I've chosen to do the /msU thing for all my regexes to stay consistent and make writing them less error-prone (and, most of all, easier for others to read). @salathe: the spec does not state that all passwords will be at the beginning of the text. The user should check `isset($match['PASSWORD'])` before using it.
Kris
@Kris: (re consistency) I suspected that was the reason, but I still think it's a bad idea, especially in the case of the `U` modifier. It changes the meaning of the regex drastically, and even regex experts are likely to be thrown by it. I don't know of any other flavor that supports such a feature, and even in the PHP world it seems to be seldom used.
Alan Moore
@Alan Moore, unfortunately I never really had the pleasure of dealing with experts (nor do I claim to be one). On previous teams I've worked with it was decided to do it this way, and that just stuck.
Kris
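To illustrate the point raised in the comments, here is a small sketch comparing the posted pattern with a stripped-down one on the sample text. The stripped-down pattern is one possible reading of the suggestions above (no modifiers, no named group, no optional quantifier), not something anyone posted verbatim.

```php
<?php
// Stand-in for the text from the question.
$text = 'PASSWORD = dsl3£$Roid' . "\n" . 'dfdfdf' . "\n";

// The pattern as posted in the answer.
preg_match('/PASSWORD\s=\s(?P<PASSWORD>[^\n]+)?\n/msU', $text, $verbose);

// Without the s modifier "." stops at the newline, so ".+" captures
// the rest of the line and the trailing \n ends the match.
preg_match('/PASSWORD = (.+)\n/', $text, $lean);

var_dump($verbose[1] === $lean[1]);
```

On this input both patterns capture the same text. They can differ on other input, for example where `\s` would match a tab instead of a space.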
A: 

FYI, here is the working code:

$match = array();
preg_match('/PASSWORD\s=\s(?P<PASSWORD>[^\n]+)?\n/msU', $password_dirty, $match);
$password = $match[1];

The missing part was obviously to declare $match as an array at the start.
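A side note with a small sketch: as far as PHP itself is concerned, preg_match takes its third argument by reference and fills it in, so it also works when `$match` is not declared first. The sample value below is a stand-in, and the `isset` guard follows the suggestion from the comments on the answer above.

```php
<?php
// Stand-in for one row's field value.
$password_dirty = 'PASSWORD = dsl3£$Roid' . "\n" . 'dfdfdf' . "\n";

// $match is deliberately not declared beforehand; preg_match creates it.
$found = preg_match('/PASSWORD\s=\s(?P<PASSWORD>[^\n]+)?\n/msU', $password_dirty, $match);

// Guard against rows where nothing was captured, so no notice is raised.
$password = ($found === 1 && isset($match['PASSWORD'])) ? $match['PASSWORD'] : null;

var_dump($password);
```

With 10k+ rows, the null case gives a simple way to log or skip rows that have no password line.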

Nick