For example, if I had the following string:

"this-is-a-string"

Could I split it by every 2nd "-" rather than every "-" so that it returns two values ("this-is" and "a-string") rather than returning four?

A: 
l = 'this-is-a-string'.split('-')
nl = []
ss = ""
c = 0
for s in l:
    c += 1
    if c % 2 == 1:
        ss = s                      # odd-numbered part starts a new pair
    else:
        ss = "%s-%s" % (ss, s)      # even-numbered part completes the pair
        nl.append(ss)

print(nl)
SpliFF
What's n? I get a name error as it's not defined.
Gnuffo1
Sorry, I misread your question the first time and rewrote my answer; n was a leftover from the previous version. Now it gives a list of strings.
SpliFF
This is very complicated (long to read/decipher), compared to many of the other solutions proposed here…
EOL
Rubbish. It's actually much easier to decipher. Length is largely irrelevant, and it could be shortened by making it less readable. It should have good performance since the loop only has a simple test condition to deal with. It also has the most flexibility for handling other processing inside the loop. Also, the winning answer will crash on a string with an odd number of hyphens. Iter and list ops might be pythonic but that doesn't necessarily make them 'better'.
SpliFF
A: 

EDIT: The original code I posted didn't work. This version does:

I don't think you can split on every other one, but you could split on every - and join every pair.

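chunks = []
content = "this-is-a-string"
split_string = content.split('-')

# Step through the parts two at a time. The range has to reach the last
# index, otherwise a trailing unpaired part is silently skipped.
for i in range(0, len(split_string), 2):
    if i < len(split_string) - 1:
        chunks.append("-".join([split_string[i], split_string[i+1]]))
    else:
        chunks.append(split_string[i])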
EmFi
This doesn't work, I get `["-", "-", "-", "-"]`
Jed Smith
This does not work. The output consists of a list of 1 character strings containing a hyphen.
recursive
@Jed His idea is good; you could write the implementation on your own.
ReDAeR
Yeah. Splice didn't work the way I thought it did; I've fixed the implementation.
EmFi
Downvote removed. You might as well just do split_string[i:i+2] rather than creating a list literal, since you know the size already.
recursive
+14  A: 

Here’s another solution:

span = 2
words = "this-is-a-string".split("-")
print ["-".join(words[i:i+span]) for i in range(0, len(words), span)]
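The same slicing idea can be wrapped in a small function, which also makes the behaviour for a leftover part explicit. This is only a sketch: the name `split_every_nth` and its parameters are made up for illustration, and it is written so that it also runs on Python 3.

```python
def split_every_nth(text, sep="-", n=2):
    """Split text on every n-th occurrence of sep (hypothetical helper)."""
    words = text.split(sep)
    # Each slice holds at most n words; the last slice may be shorter,
    # so a leftover part is kept rather than dropped.
    return [sep.join(words[i:i + n]) for i in range(0, len(words), n)]


print(split_every_nth("this-is-a-string"))       # ['this-is', 'a-string']
print(split_every_nth("this-is-a-long-string"))  # ['this-is', 'a-long', 'string']
```

Because the slice `words[i:i + n]` simply stops at the end of the list, no special case is needed for an odd number of words.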
Gumbo
Thanks, Nick D.
Gumbo
Why the down vote? What’s wrong with this answer?
Gumbo
This seems the simplest way to handle a variable length between separations.
Gnuffo1
+8  A: 

Regular expressions handle this easily:

import re
s = "aaaa-aa-bbbb-bb-c-ccccc-d-ddddd"
print re.findall("[^-]+-[^-]+", s)

Output:

['aaaa-aa', 'bbbb-bb', 'c-ccccc', 'd-ddddd']

Update for Nick D:

n = 3
print re.findall("-".join(["[^-]+"] * n), s)

Output:

['aaaa-aa-bbbb', 'bb-c-ccccc']
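If a shorter final group should be kept instead of dropped, one possible change is to make the later fields optional. The following is a sketch rather than the only way to do it; the function name `findall_every_nth` is invented here, and it is written for Python 3. Note that, unlike `split`, this pattern skips empty fields such as the one in "a--b".

```python
import re


def findall_every_nth(s, n=2):
    """Group '-'-separated fields n at a time (hypothetical helper)."""
    # One field, then up to n-1 further "-field" repetitions.
    # The group is non-capturing so findall returns the whole match.
    pattern = "[^-]+(?:-[^-]+){0,%d}" % (n - 1)
    return re.findall(pattern, s)


s = "aaaa-aa-bbbb-bb-c-ccccc-d-ddddd"
print(findall_every_nth(s, 3))  # ['aaaa-aa-bbbb', 'bb-c-ccccc', 'd-ddddd']
```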
recursive
Probably the most elegant solution which is still readable, the rest are stretching it.
Jed Smith
good answer but it's only for every 2nd separator.
Nick D
… and only for an even number of words.
Gumbo
Nick: Not so. See my update. Gumbo: Also not so. Just a simple change to the regex will handle that case as well, if it is desired.
recursive
@recursive, ok but I don't see the `d-ddddd` in the output ;-)
Nick D
sorry, have to -1, too complicated, uses regex,
hasen j
+15  A: 
>>> s="a-b-c-d-e-f-g-h-i-j-k-l"         # use zip(*[i]*n)
>>> i=iter(s.split('-'))                # for the nth case    
>>> map("-".join,zip(i,i))    
['a-b', 'c-d', 'e-f', 'g-h', 'i-j', 'k-l']

>>> i=iter(s.split('-'))
>>> map("-".join,zip(*[i]*3))
['a-b-c', 'd-e-f', 'g-h-i', 'j-k-l']
>>> i=iter(s.split('-'))
>>> map("-".join,zip(*[i]*4))
['a-b-c-d', 'e-f-g-h', 'i-j-k-l']

Sometimes itertools.izip is faster as you can see in the results

>>> from itertools import izip
>>> s="a-b-c-d-e-f-g-h-i-j-k-l"
>>> i=iter(s.split("-"))
>>> ["-".join(x) for x in izip(i,i)]
['a-b', 'c-d', 'e-f', 'g-h', 'i-j', 'k-l']

Here is a version that sort of works with an odd number of parts, depending on what output you desire in that case. You might prefer to trim the '-' off the end of the last element with .rstrip('-'), for example.

>>> from itertools import izip_longest
>>> s="a-b-c-d-e-f-g-h-i-j-k-l-m"
>>> i=iter(s.split('-'))
>>> map("-".join,izip_longest(i,i,fillvalue=""))
['a-b', 'c-d', 'e-f', 'g-h', 'i-j', 'k-l', 'm-']
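To avoid the trailing '-' in the last element without a separate `.rstrip('-')`, the padding can be filtered out before joining. This is a sketch assuming Python 3, where `izip_longest` is called `itertools.zip_longest`; the helper name `join_groups` and the `missing` sentinel are made up for the example.

```python
from itertools import zip_longest  # named izip_longest on Python 2


def join_groups(s, n=2, sep="-"):
    """Join every n parts of s, keeping a shorter final group (hypothetical helper)."""
    parts = iter(s.split(sep))
    missing = object()  # sentinel that cannot collide with a real part
    groups = zip_longest(*[parts] * n, fillvalue=missing)
    # Drop the padding before joining so no dangling separator appears.
    return [sep.join(p for p in group if p is not missing) for group in groups]


print(join_groups("a-b-c-d-e-f-g-h-i-j-k-l-m"))
# ['a-b', 'c-d', 'e-f', 'g-h', 'i-j', 'k-l', 'm']
```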

Here are some timings

$ python -m timeit -s 'import re;r=re.compile("[^-]+-[^-]+");s="a-b-c-d-e-f-g-h-i-j-k-l"' 'r.findall(s)'
100000 loops, best of 3: 4.31 usec per loop

$ python -m timeit -s 'from itertools import izip;s="a-b-c-d-e-f-g-h-i-j-k-l"' 'i=iter(s.split("-"));["-".join(x) for x in izip(i,i)]'
100000 loops, best of 3: 5.41 usec per loop

$ python -m timeit -s 's="a-b-c-d-e-f-g-h-i-j-k-l"' 'i=iter(s.split("-"));["-".join(x) for x in zip(i,i)]'
100000 loops, best of 3: 7.3 usec per loop

$ python -m timeit -s 's="a-b-c-d-e-f-g-h-i-j-k-l"' 't=s.split("-");["-".join(t[i:i+2]) for i in range(0, len(t), 2)]'
100000 loops, best of 3: 7.49 usec per loop

$ python -m timeit -s 's="a-b-c-d-e-f-g-h-i-j-k-l"' '["-".join([x,y]) for x,y in zip(s.split("-")[::2], s.split("-")[1::2])]'
100000 loops, best of 3: 9.51 usec per loop
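The timings above come from Python 2 command lines. To rerun a comparison on a current interpreter, something along these lines could be used. The function names are invented for this sketch, and the numbers will differ by machine and Python version, so no output is shown.

```python
import re
import timeit

S = "a-b-c-d-e-f-g-h-i-j-k-l"
PAIR = re.compile("[^-]+-[^-]+")


def with_regex():
    return PAIR.findall(S)


def with_zip():
    i = iter(S.split("-"))
    return ["-".join(x) for x in zip(i, i)]


def with_slices():
    t = S.split("-")
    return ["-".join(t[i:i + 2]) for i in range(0, len(t), 2)]


for func in (with_regex, with_zip, with_slices):
    seconds = timeit.timeit(func, number=100000)
    # Total seconds for 100000 calls, converted to microseconds per call.
    print("%-12s %.2f usec per loop" % (func.__name__, seconds * 1e6 / 100000))
```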
gnibbler
+1 Nice, clean solution...
ChristopheD
Wow, that's great!
unutbu
pythonic elegance
elzapp
You’re using the wrong code for my proposal. I’m operating on the words and not the string. `python -m timeit -s 's="a-b-c-d-e-f-g-h-i-j-k-l".split("-")' '["-".join(s[i:i+2]) for i in range(0, len(s), 2)]'`
Gumbo
Nicely done, but fails for an odd number of elements. It shouldn't be too hard to overcome though.
RedGlyph
@Gumbo, sorry, I fixed it to match your comment, I've just moved the `split()` out of the setup clause and used `t` as a temporary variable
gnibbler
A: 

I think several of the already given solutions are good enough, but just for fun, I did this version:

def twosplit(s,sep):
  # Find the first two separators; cut at the second one and recurse on the rest.
  first=s.find(sep)
  if first>=0:
    second=s.find(sep,first+len(sep))
    if second>=0:
      return [s[0:second]] + twosplit(s[second+len(sep):],sep)
    else:
      return [s]
  else:
    return [s]

print(twosplit("this-is-a-string","-"))
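For long inputs the recursive version could run into Python's recursion limit, since it recurses once per pair. An iterative variant of the same `find`-based idea might look like this; the name `nth_split` and the extra `n` parameter are made up for this sketch, which is written for Python 3.

```python
def nth_split(s, sep="-", n=2):
    """Cut s at every n-th occurrence of sep without recursion (hypothetical helper)."""
    pieces = []
    start = 0
    while True:
        # Start searching at `start`; the offset is undone by the first find call.
        pos = start - len(sep)
        for _ in range(n):
            pos = s.find(sep, pos + len(sep))
            if pos < 0:
                # Fewer than n separators remain: the rest is the last piece.
                pieces.append(s[start:])
                return pieces
        pieces.append(s[start:pos])
        start = pos + len(sep)


print(nth_split("this-is-a-string"))  # ['this-is', 'a-string']
```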
elzapp