ansaurus

Question

Python MediaWiki table regex (find strings of a particular format, then extract substrings within)

Answer 1

+1 A:

import re
re.match(r'^{{[^|]+\|([^|]+)\|[^|]+\|([^|]+)\|[^|]+\|[^|]+\|[^|]+\}}$', S).groups()
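A minimal sketch of how the one-liner above might be used, assuming S holds a single row like the sample string that appears elsewhere in this thread. The names ROW and extract are hypothetical, and the braces are escaped here purely for readability. Because re.match returns None when the row does not fit the pattern, the sketch checks for that before calling .groups().

```python
import re

# Assumed sample row, taken from the other answers in this thread.
S = "{{rdex|001|001|Bulbasaur|2|Grass|Poison}}"

# Seven pipe-separated fields between {{ and }}; capture the 2nd and 4th.
ROW = re.compile(r'^\{\{[^|]+\|([^|]+)\|[^|]+\|([^|]+)\|[^|]+\|[^|]+\|[^|]+\}\}$')

def extract(row):
    """Return (number, name) for a matching row, or None if it does not match."""
    m = ROW.match(row)
    return m.groups() if m else None

print(extract(S))                  # ('001', 'Bulbasaur')
print(extract("not a table row"))  # None
```

Since the pattern is anchored with ^ and $, it only accepts a string that is exactly one row with seven fields.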

Ignacio Vazquez-Abrams 2010-02-12 02:46:15

Answer 2

+1 A:

import re
text = "{{rdex|001|001|Bulbasaur|2|Grass|Poison}}"
# Raw string so the backslashes reach the regex engine unchanged
print(re.findall(r"\{\{[^|]+\|(\d+)\|\d+\|([^|]+)", text))
# [('001', 'Bulbasaur')]
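Because findall is not anchored, the same pattern can presumably be run over a whole block of wikitext to collect every row at once. The sketch below assumes a made-up two-row input (only the first row comes from this thread) and the hypothetical names wikitext, PATTERN, and rows.

```python
import re

# Illustrative input: the second row and the plain-text line are made up.
wikitext = """{{rdex|001|001|Bulbasaur|2|Grass|Poison}}
{{rdex|002|002|Ivysaur|2|Grass|Poison}}
Some unrelated line of text
"""

# Same pattern as in the answer: first number and the name after the second number.
PATTERN = re.compile(r"\{\{[^|]+\|(\d+)\|\d+\|([^|]+)")

rows = PATTERN.findall(wikitext)
print(rows)  # [('001', 'Bulbasaur'), ('002', 'Ivysaur')]
```

Lines that do not start a {{...|digits|digits|name template are simply skipped.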

S.Mark 2010-02-12 02:49:27

That is some fly regex right there. Might I ask where you learned it? Was it from a book, an internet tutorial, or a divine gift? Many thanks!

Beau Martínez 2010-02-12 03:07:08

MSDN's regular expression syntax page was my first introduction to regex: http://msdn.microsoft.com/en-us/library/1400241x(VS.85).aspx

S.Mark 2010-02-12 03:24:23

Answer 3

A:

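line = "{{rdex|001|001|Bulbasaur|2|Grass|Poison}}"
s = line.find("{{")  # index of the opening braces, -1 if absent
e = line.find("}}")  # index of the closing braces, -1 if absent
if s != -1 and e != -1:
    # Drop the braces and split the fields on "|"
    sub = line[s+2:e].split("|")
    print(sub[1], sub[3])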

output

$ ./python.py
001 Bulbasaur

ghostdog74 2010-02-12 02:58:58
