ansaurus

Question

php 5.3, regex to get specific data from a string based on preceding text and <br>

Answer 1

+1 A:

Don't use nl2br; it's unnecessary.

$string = ...;
preg_match('/^PASSWORD = (.*)$/m', $string, $matches);

The result will be in $matches[1]. This allows the password to be on any line. If you only want to match it on the first line, do:

preg_match('/^PASSWORD = (.*)/', $string, $matches);

See preg_match and the meaning of the modifiers.
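
As a rough illustration of the difference the m modifier makes, here is a minimal sketch. The sample string and its field layout are assumptions made up for this example, not the asker's real data.

```php
<?php
// Hypothetical sample data: the field names and values are assumptions.
$string = "USERNAME = nick\nPASSWORD = s3cret\nNOTES = none";

// With the m modifier, ^ and $ match at the start and end of every line,
// so PASSWORD may sit on any line of the subject.
$foundAnyLine = preg_match('/^PASSWORD = (.*)$/m', $string, $anyLine);

// Without it, ^ only matches at the very start of the subject,
// so this fails because the first line is USERNAME.
$foundFirstLine = preg_match('/^PASSWORD = (.*)/', $string, $firstLine);

var_dump($foundAnyLine);   // int(1)
var_dump($anyLine[1]);     // string(6) "s3cret"
var_dump($foundFirstLine); // int(0)
```

If the stored text uses \r\n line endings, the capture would probably include a trailing \r, so passing the result through trim() may be worth considering.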

Artefacto 2010-06-19 20:18:16

Thanks Artefacto, it gives me the same result I have had all day: $string = htmlentities(nl2br($client_password_dirty)); preg_match_all('/^PASSWORD = (.*)$/im', $string, $matches); echo $matches[1]; The output just shows "Array"... any suggestions?

Nick 2010-06-19 20:22:53

@Nick Do it before the htmlentities and nl2br; otherwise, even if you change the regex, you may not get the result you want. See http://codepad.viper-7.com/FAjOFq

Artefacto 2010-06-19 20:26:19
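
The two points in this exchange (match before escaping, and why echo printed "Array") can be sketched as follows. The $raw value is a hypothetical stand-in for the database field; the characters in it were chosen only to make the escaping visible.

```php
<?php
// Hypothetical raw value, standing in for the text pulled from the database.
$raw = "PASSWORD = a<b&c\nOTHER = x";

// Match on the untouched text first, then escape only when outputting.
preg_match('/^PASSWORD = (.*)$/m', $raw, $matches);
$password = $matches[1];          // a<b&c
$safe = htmlentities($password);  // a&lt;b&amp;c

// Matching after nl2br() and htmlentities() drags the escaped <br /> along.
$mangled = htmlentities(nl2br($raw));
preg_match('/^PASSWORD = (.*)$/m', $mangled, $late);
$polluted = $late[1];             // a&lt;b&amp;c&lt;br /&gt;

// preg_match_all() returns one array per group, so $all[1] is itself an
// array; echoing it prints "Array". The first hit is $all[1][0].
preg_match_all('/^PASSWORD = (.*)$/im', $raw, $all);
$first = $all[1][0];

var_dump($password, $safe, $polluted, $first);
```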

Answer 2

+1 A:

It's possible to do this entirely with MySQL:

SELECT SUBSTR(field, LOCATE('= ', field) + 2, LOCATE('\n', field) - LOCATE('= ', field) - 2) AS password FROM table;

You can also create a view that has the password as an additional column:

CREATE VIEW table2 AS SELECT *, SUBSTR(field, LOCATE('= ', field) + 2, LOCATE('\n', field) - LOCATE('= ', field) - 2) AS password FROM table;

Now you can SELECT password FROM table2; to get the field directly.

However, if you control the schema of that table, you should really reconsider how you're storing data. You should pretty much never need to pull something from a database and then extract substrings from it; just store that data as separate fields.

Michael Mrozek 2010-06-19 20:24:31

Interesting. Had considered trying this at query, but want to keep DB calls to a minimum.

Nick 2010-06-19 20:38:36

@Nick It should be the same number of queries; you're either selecting 'field' or 'password' (or both, but it's still one query).

Michael Mrozek 2010-06-19 20:49:52

aha, indeed but we already have the data in our main query that pulls other info

Nick 2010-06-20 16:35:00

Answer 3

+1 A:

Not that hard.

/PASSWORD\s=\s(?P<PASSWORD>[^\n]+)?\n/msU

The match will contain the key 'PASSWORD' if it's found in the text:

$text = YOUR_BLOB_HERE;

$match = array();
$count = preg_match('/PASSWORD\s=\s(?P<PASSWORD>[^\n]+)?\n/msU', $text, $match);

print_r($match);

/*  output:

Array
(
    [0] => PASSWORD = dsl3£$Roid

    [PASSWORD] => dsl3£$Roid
    [1] => dsl3£$Roid
)

*/

Kris 2010-06-19 20:24:45

No nl2br needed. You should do this in MySQL like Michael Mrozek suggests, but I did the regex thing because `regex` was in the actual question.

Kris 2010-06-19 20:26:18

Yes!!!! Thank you! FYI here is my (now) working code: $text = $password_dirty; $match = array(); $count = preg_match('/PASSWORD\s=\s(?P<PASSWORD>[^\n]+)?\n/msU', $text, $match); $clean_password = $match[1];

Nick 2010-06-19 20:29:22

There's a lot of excess baggage in that regex. The `?` after the capturing group makes it optional, and I don't see that among the requirements. The `m` modifier changes the meaning of the anchors, `^` and `$`, but you aren't using them. Similarly, `s` changes the meaning of `.`, but you aren't using that either. And `U` makes quantifiers non-greedy by default, which is not helpful in this case. The `U` modifier should be avoided in any case; it's supposed to make regexes more readable, but on the whole it has the opposite effect.

Alan Moore 2010-06-19 21:35:28

To continue Alan Moore's comment... The matching of literal `PASSWORD` should be anchored to the beginning of the subject string, as per the question. The `\s` matches more than the space character. The named capturing group is entirely superfluous and may cloud the simple task that is being performed, particularly for folks not versed in its use. Matching non-newline characters followed by a new line might as well have been replaced by `.+` assuming the needless `s` modifier was dropped.

salathe 2010-06-19 21:52:34
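
To make the two comments above concrete, here is a minimal sketch comparing the answer's pattern with a trimmed one that follows their suggestions. The sample text is hypothetical, and the trimmed pattern is only one possible reading of those suggestions.

```php
<?php
// Hypothetical sample text; the second line only shows where the match stops.
$text = "PASSWORD = s3cret\nEMAIL = someone@example.com\n";

// Pattern from the answer, with the optional named group and /msU.
preg_match('/PASSWORD\s=\s(?P<PASSWORD>[^\n]+)?\n/msU', $text, $original);

// Trimmed variant: anchored to the start of the subject, literal spaces,
// a plain capturing group and no modifiers. Without s, the dot stops at
// the newline, so .+ captures the rest of the first line.
preg_match('/^PASSWORD = (.+)/', $text, $trimmed);

var_dump($original[1]);                 // string(6) "s3cret"
var_dump($original[1] === $trimmed[1]); // bool(true)
```

The anchored variant only fits if PASSWORD really is the first thing in the text; if it can appear on a later line, adding the m modifier keeps the anchor but lets it match at any line start.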

@Alan Moore: Technically you are correct. However, I've chosen to do the /msU thing for all my regexes to keep them consistent and to make writing them (and most of all reading them by others) less error prone. @salathe: the spec does not state that all passwords will be at the beginning of the text. The user should check `isset($match['PASSWORD'])` before using it.

Kris 2010-06-19 22:25:34

@Kris: (re consistency) I suspected that was the reason, but I still think it's a bad idea, especially in the case of the `U` modifier. It changes the meaning of the regex drastically, and even regex experts are likely to be thrown by it. I don't know of any other flavor that supports such a feature, and even in the PHP world it seems to be seldom used.

Alan Moore 2010-06-20 00:54:04

@Alan Moore, unfortunately I never really had the pleasure of dealing with experts (nor do I claim to be one). Across previous teams I've worked with, it was decided to do it this way and that just stuck.

Kris 2010-06-20 10:19:29

Answer 4

A:

FYI here is the working code:

$match = array();
preg_match('/PASSWORD\s=\s(?P<PASSWORD>[^\n]+)?\n/msU', $password_dirty, $match);
$password = $match[1];

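Declaring $match as an array at the start is harmless but not strictly required, since preg_match() fills it by reference. The changes that mattered were calling preg_match() instead of preg_match_all() and running it on the raw string rather than on the nl2br/htmlentities output.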

Nick 2010-06-20 16:39:54
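
A slightly more defensive version of that snippet might look like the sketch below. It checks the return value and the named group before reading the result, along the lines of the isset() advice given in the comments on Answer 3. The input value is hypothetical; only the variable name $password_dirty comes from the thread.

```php
<?php
// Hypothetical input; only the variable name comes from the thread.
$password_dirty = "PASSWORD = s3cret\nOTHER = x";

// $match is deliberately not declared first: preg_match() receives it by
// reference and populates it.
$found = preg_match('/PASSWORD\s=\s(?P<PASSWORD>[^\n]+)?\n/msU', $password_dirty, $match);

// Only read the capture when the pattern matched and the group took part.
$password = ($found === 1 && isset($match['PASSWORD'])) ? $match['PASSWORD'] : null;

var_dump($password); // string(6) "s3cret"
```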
