I'm trying to get MySQL to return the number of times a regex matches.

Something like:

select 'aabbccaa' regexp 'a'

should return 4 (four matches of a), rather than just true (or 1).

Is there any way around this?

Thanks!

+1  A: 

I don't think any regex engine will do this directly: a match only tells you whether (and where) the pattern matched, not how many times. Of course, most regex libraries have some sort of findall() method, and you can then count the matches yourself.

MySQL, however, doesn't have this functionality. The LOCATE function only takes strings, not regexes - otherwise you could have worked with that.
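
MySQL has no findall(), but on MySQL 8.0 and later the built-in REGEXP_INSTR() accepts an occurrence argument, which is enough to count matches in a loop. The following is only a minimal sketch under that version assumption; the function name regexCount is made up for illustration, and it rescans the string for each occurrence, so it is not tuned for long inputs.

```sql
DELIMITER ||
DROP FUNCTION IF EXISTS regexCount||
-- Hypothetical helper: counts non-overlapping matches of pat in s.
-- Assumes MySQL 8.0+, where REGEXP_INSTR(expr, pat, pos, occurrence) exists.
CREATE FUNCTION regexCount(s TEXT, pat VARCHAR(255))
RETURNS INT UNSIGNED
DETERMINISTIC NO SQL
BEGIN
    DECLARE cnt INT UNSIGNED DEFAULT 0;

    -- REGEXP_INSTR returns 0 when the requested occurrence does not exist
    -- and NULL when s or pat is NULL; either result ends the loop.
    WHILE REGEXP_INSTR(s, pat, 1, cnt + 1) > 0 DO
        SET cnt = cnt + 1;
    END WHILE;

    RETURN cnt;
END||
DELIMITER ;

SELECT regexCount('aabbccaa', 'a') AS match_count;
```

For the example in the question this should return 4. On older servers that lack REGEXP_INSTR(), this sketch does not apply.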

Tim Pietzcker
I need the match count because I want to ORDER BY it and get the top 5 lines that matched the most times. Maybe a MySQL User Defined Function would do it?
azv
I understand your problem, but I don't know MySQL well enough to make a suggestion for a workaround, sorry.
Tim Pietzcker
+2  A: 

You could create a function that counts occurrences of a literal substring (not a regex):

delimiter ||
DROP FUNCTION IF EXISTS substrCount||
-- Counts occurrences of the literal substring ss in s (overlapping ones included).
CREATE FUNCTION substrCount(s VARCHAR(255), ss VARCHAR(255))
RETURNS INT UNSIGNED
DETERMINISTIC NO SQL
BEGIN
    DECLARE cnt INT UNSIGNED DEFAULT 0;
    DECLARE pos INT UNSIGNED DEFAULT 1;

    -- NULL or empty arguments: nothing to count (this also avoids an endless loop).
    IF s IS NULL OR ss IS NULL OR CHAR_LENGTH(ss) = 0 THEN
        RETURN 0;
    END IF;

    REPEAT
        -- LOCATE returns 0 once there is no occurrence at or after pos.
        SET pos = LOCATE(ss, s, pos);
        IF pos > 0 THEN
            SET cnt = cnt + 1;
            SET pos = pos + 1;  -- resume one character past the match start
        END IF;
    UNTIL pos = 0 END REPEAT;

    RETURN cnt;
END||
delimiter ;

which you can then call like this:

SELECT substrCount('aabbccaa', 'a') `count`;
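
The asker's stated goal is to order rows by the number of matches and keep the top five. Assuming the substrCount function above has been created, that could look roughly like the sketch below; the table log_lines, its column line, and the sample rows are hypothetical and exist only for illustration.

```sql
-- Hypothetical sample data, for illustration only.
CREATE TEMPORARY TABLE log_lines (line VARCHAR(255));
INSERT INTO log_lines (line) VALUES
    ('aabbccaa'), ('banana'), ('xyz'), ('a'), ('alpaca'), ('bbb');

-- Rank rows by how often the literal 'a' occurs and keep the five best.
SELECT line, substrCount(line, 'a') AS match_count
FROM log_lines
ORDER BY match_count DESC
LIMIT 5;
```

The function is evaluated once per row, so on a large table this means a full scan.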
JochenJung
Unfortunately, it looks like 'a' was just a test, and the real patterns would need to be genuine regular expressions.
zebediah49
Then it would be good to know what the regex should look like. Maybe its logic can be built into this function using LOCATE().
JochenJung