I have a string like this that I need to parse into a 2D array:
str = "'813702104[813702106]','813702141[813702143]','813702172[813702174]'"
The equivalent array would be:
arr[0][0] = 813702104
arr[0][1] = 813702106
arr[1][0] = 813702141
arr[1][1] = 813702143
#... etc ...
I'm trying to do this with a regex. The string above is buried in an HTML page, but I can be certain it's the only string matching that pattern on the page. I'm not sure if this is the best way, but it's all I've got right now.
imgRegex = re.compile(r"(?:'(?P<main>\d+)\[(?P<thumb>\d+)\]',?)+")
If I run imgRegex.match(str).groups()
I only get one result (the first couplet). How do I get either multiple matches back or a 2D match object (if such a thing exists)?
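A minimal reproduction of the above, assuming Python's standard `re` module. The names `text`, `img_regex`, `whole` and `groups` are only for this sketch (`text` is used instead of `str` so the built-in is not shadowed):

```python
import re

text = "'813702104[813702106]','813702141[813702143]','813702172[813702174]'"
img_regex = re.compile(r"(?:'(?P<main>\d+)\[(?P<thumb>\d+)\]',?)+")

match = img_regex.match(text)

# The repeated outer group consumes the entire string...
whole = match.group(0)

# ...but each named group holds only a single value, so groups()
# exposes one (main, thumb) couplet rather than all three.
groups = match.groups()
print(len(groups), groups)
```

So the match itself covers every couplet, but the match object only exposes one value per capturing group.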
Note: Contrary to how it might look, this is not homework.
Note part deux: The real string is embedded in a large HTML file and therefore splitting does not appear to be an option.
I'm still getting answers for this, so I thought I'd better edit it to show why I'm not changing the accepted answer. Splitting, though more efficient on this test string, isn't going to extract the parts from a whole HTML file. I could combine a regex and splitting, but that seems silly.
If you do have a better way to find the parts from a load of HTML (the pattern \d+\[\d+\]
is unique to this string in the source), I'll happily change accepted answers. Anything else is academic.
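For illustration, here is a rough sketch of pulling the parts straight out of a page with that pattern, assuming `re.findall` with two capturing groups is acceptable. The `html` fragment and the names `pair_regex` and `arr` are hypothetical stand-ins; the real page is much larger:

```python
import re

# Hypothetical page fragment standing in for the real HTML file.
html = """<html><body><script>
var imgs = ['813702104[813702106]','813702141[813702143]','813702172[813702174]'];
</script></body></html>"""

# No outer repetition: each match is one main[thumb] couplet.
pair_regex = re.compile(r"(\d+)\[(\d+)\]")

# findall returns one (main, thumb) tuple per non-overlapping match,
# scanning the whole string, so no splitting is needed.
arr = [[int(main), int(thumb)] for main, thumb in pair_regex.findall(html)]
print(arr)
```

This relies on the pattern being unique on the page, as stated above; any other `digits[digits]` text in the HTML would be picked up as well.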