Hi All

I'm trying to extract/match data from a string using a regular expression, but I can't seem to get it right.

I want to extract the i386 from the following string (the text between the last - and .iso):

/xubuntu/daily/current/lucid-alternate-i386.iso

This should also work in case of:

/xubuntu/daily/current/lucid-alternate-amd64.iso

And the result should be either i386 or amd64 given the case.

Thanks a lot for your help.

+1  A: 
r"/([^-]*)\.iso/"

The bit you want will be in the first capture group.

Amber
Thanks!! But it doesn't work :(
Were you trying to use `match()` or `search()`? Since this is a partial-match pattern, it should be used with `search()`, not `match()` (since `match()` only attempts to match at the beginning of the string, not anywhere within it).
Amber
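To illustrate the difference, here is a minimal sketch (variable names are only for illustration). It assumes the pattern is written without the surrounding slashes, since Python regex strings do not use `/.../` delimiters.

```python
import re

path = "/xubuntu/daily/current/lucid-alternate-i386.iso"
pattern = r"([^-]*)\.iso"  # no leading/trailing slashes in Python

# match() only tries the pattern at the start of the string, so it fails here
m_match = re.match(pattern, path)

# search() scans the string for the first position where the pattern matches
m_search = re.search(pattern, path)
arch = m_search.group(1) if m_search else None
```

Here `m_match` is `None`, while `arch` holds the captured text.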
+1  A: 

First off, let's make our life simpler and only get the file name.

>>> import os
>>> os.path.split("/xubuntu/daily/current/lucid-alternate-i386.iso")
('/xubuntu/daily/current', 'lucid-alternate-i386.iso')

Now it's just a matter of catching all the letters between the last dash and the '.iso'.
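One possible way to finish that step, as a sketch rather than the only option: search the file name for a dash, then a run of non-dash characters, then a trailing `.iso`. The pattern and variable names below are illustrative assumptions.

```python
import os
import re

path = "/xubuntu/daily/current/lucid-alternate-i386.iso"
filename = os.path.split(path)[1]  # 'lucid-alternate-i386.iso'

# Capture the non-dash characters that sit right before a trailing '.iso'
m = re.search(r"-([^-]+)\.iso$", filename)
arch = m.group(1) if m else None
```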

badp
I still can't manage to extract the desired text :( (I've never been good with regexps)
+3  A: 

You could also use split in this case (instead of regex):

>>> str = "/xubuntu/daily/current/lucid-alternate-i386.iso"
>>> str.split(".iso")[0].split("-")[-1]
'i386'

split gives you a list of the pieces your string was split into. Then, using Python's indexing syntax, you can pick out the appropriate part.
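Spelled out step by step, the one-liner above does roughly the following (intermediate names are just for illustration):

```python
path = "/xubuntu/daily/current/lucid-alternate-i386.iso"

pieces = path.split(".iso")  # ['/xubuntu/daily/current/lucid-alternate-i386', '']
stem = pieces[0]             # everything before '.iso'
parts = stem.split("-")      # ['/xubuntu/daily/current/lucid', 'alternate', 'i386']
arch = parts[-1]             # last element: 'i386'
```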

ChristopheD
Awesome!! Thank you.
`str.rsplit('.iso', 1)[0].rsplit('-', 1)[-1]`
J.F. Sebastian
`str.rpartition('.iso')[0].rpartition('-')[-1]`
J.F. Sebastian
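A small sketch of why the right-hand variants can matter. The path below is hypothetical (not from the question) and has `.iso` inside a directory name; in that case plain `split` cuts at the first `.iso`, while `rsplit(..., 1)` and `rpartition` cut at the last one.

```python
# Hypothetical path with '.iso' appearing in a directory name as well
path = "/mirror/foo.iso-images/lucid-alternate-amd64.iso"

with_split = path.split(".iso")[0].split("-")[-1]
with_rsplit = path.rsplit(".iso", 1)[0].rsplit("-", 1)[-1]
with_rpartition = path.rpartition(".iso")[0].rpartition("-")[-1]
```

For the paths in the question all three give the same answer; they only diverge on inputs like this one.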
+1  A: 

If you will be matching several of these lines, using re.compile() and saving the resulting regular expression object for reuse is more efficient.

import re

s1 = "/xubuntu/daily/current/lucid-alternate-i386.iso"
s2 = "/xubuntu/daily/current/lucid-alternate-amd64.iso"

# Compile once, reuse for every line
pattern = re.compile(r'^.+-(.+)\..+$')

m = pattern.match(s1)
m.group(1)  # 'i386'

m = pattern.match(s2)
m.group(1)  # 'amd64'
Peter McGrattan
You don't need regexes for this: http://stackoverflow.com/questions/2925306/python-matching-some-characters-into-a-string/2925399#2925399
J.F. Sebastian
I know but it is tagged python and regex
Peter McGrattan
A: 

The expression should be written without the leading and trailing slashes.

import re

line = '/xubuntu/daily/current/lucid-alternate-i386.iso'
rex = re.compile(r"([^-]*)\.iso")
m = rex.search(line)
print(m.group(1))

Yields 'i386'

koblas
A: 
import re

reobj = re.compile(r"(\w+)\.iso$")
match = reobj.search(subject)
if match:
    result = match.group(1)
else:
    result = ""

Subject contains the filename and path.
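As a usage sketch, the same pattern can be wrapped in a small helper and tried on both paths from the question. The names `ISO_ARCH` and `extract_arch` and the `README.txt` path are made up for illustration.

```python
import re

ISO_ARCH = re.compile(r"(\w+)\.iso$")  # word characters right before a trailing '.iso'

def extract_arch(subject):
    """Return the word immediately before a trailing '.iso', or '' if there is none."""
    match = ISO_ARCH.search(subject)
    return match.group(1) if match else ""

results = [extract_arch(p) for p in (
    "/xubuntu/daily/current/lucid-alternate-i386.iso",
    "/xubuntu/daily/current/lucid-alternate-amd64.iso",
    "/xubuntu/daily/current/README.txt",  # hypothetical non-matching path
)]
```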

Turtle
A: 
>>> import os
>>> path = "/xubuntu/daily/current/lucid-alternate-i386.iso"
>>> file, ext = os.path.splitext(os.path.split(path)[1])
>>> processor = file[file.rfind("-") + 1:]
>>> processor
'i386'
manifest