ansaurus

Question

Product codes look like abcd2343; how do I split them into letters and numbers?

Answer 1

+3 A:

In [32]: import re

In [33]: s='abcd2343 abw34324 abc3243-23A'

In [34]: re.split(r'(\d+)',s)
Out[34]: ['abcd', '2343', ' abw', '34324', ' abc', '3243', '-', '23', 'A']

Or, if you want to split just before each run of digits:

In [43]: re.findall(r'\d*\D+',s)
Out[43]: ['abcd', '2343 abw', '34324 abc', '3243-', '23A']

unutbu 2010-07-27 01:18:14

Answer 2

A:

import re

def firstIntIndex(string):
    # Return the index of the first digit in string, or -1 if there is none.
    m = re.search(r'\d', string)
    return m.start() if m else -1
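
The index on its own does not split anything; it still has to be used to slice the code. A minimal sketch of that step, assuming a code with no digit should come back whole with an empty second part (the name split_at_first_digit is hypothetical, not from the answer):

```python
import re

def split_at_first_digit(code):
    # Hypothetical helper: locate the first digit and slice around it.
    m = re.search(r'\d', code)
    if m is None:
        # Assumption: with no digit, everything counts as the letter part.
        return code, ''
    return code[:m.start()], code[m.start():]

print(split_at_first_digit('abcd2343'))     # ('abcd', '2343')
print(split_at_first_digit('abc3243-23A'))  # ('abc', '3243-23A')
```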

Mike 2010-07-27 01:20:08

Answer 3

A:

import re

m = re.match(r"(?P<letters>[a-zA-Z]+)(?P<the_rest>.+)$", line)   # line holds one product code

m.group('letters')
m.group('the_rest')

This covers your corner case of abc3243-23A: it gives abc for the letters group and 3243-23A for the_rest. Note that m is None if the line does not start with a letter.

Since you said the codes are each on their own line, you'll need to pass one line at a time as line.
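
A minimal sketch of feeding one line at a time to that pattern, assuming the codes arrive as a single multi-line string. The sample text and the names CODE_RE and split_codes are made up for illustration; lines that do not match are simply skipped here, which may or may not be what you want:

```python
import re

# Same pattern as in the answer, compiled once for reuse.
CODE_RE = re.compile(r"(?P<letters>[a-zA-Z]+)(?P<the_rest>.+)$")

def split_codes(text):
    pairs = []
    for line in text.splitlines():
        m = CODE_RE.match(line.strip())
        if m:  # re.match returns None when the line does not fit the pattern
            pairs.append((m.group('letters'), m.group('the_rest')))
    return pairs

text = "abcd2343\nabw34324\nabc3243-23A\n"
print(split_codes(text))
# [('abcd', '2343'), ('abw', '34324'), ('abc', '3243-23A')]
```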

jwsample 2010-07-27 01:30:11

Answer 4

A:

To partition on the first digit:

parts = re.split(r'(\d.*)','abcd2343')      # => ['abcd', '2343', '']
parts = re.split(r'(\d.*)','abc3243-23A')   # => ['abc', '3243-23A', '']

So the two parts are always parts[0] and parts[1].

Of course, you can apply this to multiple codes:

>>> s = "abcd2343 abw34324 abc3243-23A"
>>> results = [re.split(r'(\d.*)', pcode) for pcode in s.split(' ')]
>>> results
[['abcd', '2343', ''], ['abw', '34324', ''], ['abc', '3243-23A', '']]

If each code is on its own line, use s.splitlines() instead of s.split(' ').
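
A sketch of the splitlines() variant, assuming one code per line. Slicing with [:2] to drop the trailing empty string is an addition here, not part of the approach above; a code with no digit at all would yield a one-element list:

```python
import re

s = "abcd2343\nabw34324\nabc3243-23A"

# Split each line at its first digit and keep only the two meaningful parts.
results = [re.split(r'(\d.*)', pcode)[:2] for pcode in s.splitlines()]
print(results)
# [['abcd', '2343'], ['abw', '34324'], ['abc', '3243-23A']]
```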

Muhammad Alkarouri 2010-07-27 01:33:39
