ansaurus

Question

Answer 1

+4 A:

If you know you will want all the matches, you could use re.findall. It will return a list of all the matches. Then you can just do len(result) for the number of matches.
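A minimal sketch of this, using a made-up pattern and sample string (both are assumptions for illustration, not from the question). One caveat: if the pattern contains capture groups, `re.findall` returns the groups rather than the whole matches, but the length of the list is still the number of matches.

```python
import re

# Hypothetical sample data, just for illustration.
text = "i like python jython and dython"

# findall returns a list of the matched strings.
result = re.findall(r".ython", text)

print(result)       # ['python', 'jython', 'dython']
print(len(result))  # 3
```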

JoshD 2010-10-09 04:02:02

It's important to note that `finditer` and `findall` don't always return the same thing for the same regular expression and the same data. `finditer` matches non-overlapping instances of the regex pattern, while `findall` matches all.

Rafe Kettler 2010-10-09 04:05:44

@Rafe Kettler: findall also finds only non-overlapping matches. From the documentation: "Return all non-overlapping matches of pattern in string, as a list of strings. The string is scanned left-to-right, and matches are returned in the order found."
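A quick way to check this yourself, as a sketch with a made-up input where overlapping matches would be possible (three overlapping occurrences of "aa", but only two non-overlapping ones):

```python
import re

# Hypothetical input chosen so that overlapping matches could exist.
text = "aaaa"

found = re.findall(r"aa", text)
iterated = [m.group() for m in re.finditer(r"aa", text)]

# Both functions report the same two non-overlapping matches.
print(found)     # ['aa', 'aa']
print(iterated)  # ['aa', 'aa']
```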

JoshD 2010-10-09 04:08:08

dutt 2010-10-09 04:11:09

Okay, I posted my answer anyway. Happy trails.

Rafe Kettler 2010-10-09 04:13:37

Answer 2

+1 A:

If you always need to know the length, and you just need the content of the match rather than the other info, you might as well use re.findall. Otherwise, if you only need the length sometimes, you can use e.g.

matches = re.finditer(...)
...
matches = tuple(matches)

to store the iteration of the matches in a reusable tuple. Then just do len(matches).
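For instance, a minimal sketch with a made-up pattern and string (both are assumptions, since the original call is elided):

```python
import re

# Hypothetical sample data, just for illustration.
text = "i like python jython and dython"

matches = re.finditer(r".ython", text)
# Later, once the count is needed, freeze the iterator into a tuple.
matches = tuple(matches)

print(len(matches))                  # 3
# The match objects are still available for reuse.
print([m.start() for m in matches])  # [7, 14, 25]
```

Keep in mind that the tuple holds every match object in memory at once, which only matters for very large inputs.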

Another option, if you just need to know the total count after doing whatever with the match objects, is to use

matches = enumerate(re.finditer(...))

which will return an (index, match) pair for each of the original matches. So then you can just store the first element of each tuple in some variable; after the loop, the last index stored plus one is the total count.
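A rough sketch of that idea, again with a made-up pattern and string. The name `last_index` is only illustrative, and it starts at -1 so that the count comes out as zero when nothing matches:

```python
import re

# Hypothetical sample data, just for illustration.
text = "i like python jython and dython"

last_index = -1  # stays -1 if there are no matches at all
for last_index, match in enumerate(re.finditer(r".ython", text)):
    # Do whatever is needed with the match object here.
    pass

# enumerate counts from 0, so the total is the last index plus one.
count = last_index + 1
print(count)  # 3
```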

But if you need the length first of all, and you need match objects as opposed to just the strings, you should just do

matches = tuple(re.finditer(...))

intuited 2010-10-09 04:05:47

Answer 3

A:

If you find you need to stick with finditer(), you can simply use a counter while you iterate through the iterator.

Example:

>>> import re
>>> pattern = re.compile(r'.ython')
>>> text = 'i like python jython and dython (whatever that is)'
>>> iterator = pattern.finditer(text)
>>> count = 0
>>> for match in iterator:
...     count += 1
...
>>> count
3

If you need the features of finditer() (match objects rather than plain strings, produced lazily), use this method.
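If the loop body does nothing except count, the same counter can be written more compactly with a generator expression. This is just a sketch of an alternative spelling; it consumes the iterator without storing the matches, so it only fits when the match objects are not needed afterwards.

```python
import re

# Hypothetical sample data, just for illustration.
text = "i like python jython and dython (whatever that is)"

# Add 1 for every match the iterator yields; nothing is kept in memory.
count = sum(1 for _ in re.finditer(r".ython", text))
print(count)  # 3
```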

Rafe Kettler 2010-10-09 04:13:12

Yeah, I thought of doing that, but because of things in my "doStuff" code it won't work without adding a lot of extra code in various places. Thanks for the tip anyway :)

dutt 2010-10-09 04:38:29

I would use `for count, match in enumerate(iterator):` in the case of Rafe's code.

Tony Veijalainen 2010-10-09 16:47:02

@Tony: thanks, forgot about enumerate. If you do use enumerate, though, it will give you the highest *index*, not the actual number of matches; for that, you'd have to add 1.
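One way around the off-by-one, sketched here with a made-up pattern and string, is to have `enumerate` start counting at 1. Initialising `count` to 0 beforehand covers the case where there are no matches and the loop body never runs.

```python
import re

# Hypothetical sample data, just for illustration.
text = "i like python jython and dython (whatever that is)"

count = 0  # remains 0 when nothing matches
for count, match in enumerate(re.finditer(r".ython", text), start=1):
    pass  # work with the match object here

print(count)  # 3
```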

Rafe Kettler 2010-10-09 18:20:45


Number of regex matches
