I have an arbitrary string, for example:

s = "This string has some verylongwordsneededtosplit"

I'm trying to write a function trunc_string(string, len) that takes the string to operate on and 'len', the number of characters after which long words are split.

The result should look like this:

str = trunc_string(s, 10)
str = "This string has some verylongwo rdsneededt osplit"

So far I have this:

def truncate_long_words(s, num):
    """Splits long words in string"""
    words = s.split()
    for word in words:
        if len(word) > num:
            split_words = list(word)

After this part I have the long word as a list of chars. Now I need to:

  • join 'num' chars together in a temporary word_part list
  • join all the word_parts into one word
  • join this word with the rest of the words, which weren't long enough to be split.

Should I do it in a way similar to this?

counter = 0
for char in split_words:
    word_part.append(char)
    counter = counter+1
    if counter == num

At this point I should somehow join the word_part pieces together into a word, and then carry on with the remaining characters.

+3  A: 

Why not:

def truncate_long_words(s, num):
    """Splits long words in string"""
    words = s.split()
    for word in words:
        if len(word) > num:
            # Yield the long word in consecutive pieces of at most num characters
            for i in range(0, len(word), num):
                yield word[i:i+num]
        else:
            yield word

for t in truncate_long_words(s, 10):
    print(t)
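
The generator yields the pieces one by one, while the question asks for a single string. A minimal sketch of how the pieces might be joined back together, assuming Python 3 and assuming that collapsing runs of whitespace into single spaces is acceptable (s.split() discards the original spacing). The wrapper name trunc_string is taken from the question; the generator is restated in shortened form so the block runs on its own:

```python
def truncate_long_words(s, num):
    """Yield each word of s, cutting words longer than num into pieces."""
    for word in s.split():
        # For a short word the range produces a single slice: the whole word
        for i in range(0, len(word), num):
            yield word[i:i + num]


def trunc_string(s, num):
    """Join the pieces into one space-separated string."""
    return " ".join(truncate_long_words(s, num))


s = "This string has some verylongwordsneededtosplit"
print(trunc_string(s, 10))
# This string has some verylongwo rdsneededt osplit
```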
Alexander Gessler
+2  A: 

Abusing regex:

import re
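def trunc_string(s, num):
    # Insert a space after every run of num word characters that is followed by another word character
    return re.sub("(\\w{%d}\\B)" % num, "\\1 ", s)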

assert "This string has some verylongwo rdsneededt osplit" == trunc_string("This string has some verylongwordsneededtosplit", 10)

(Edit: adopted simplification by Brian. Thanks. But I kept the \B to avoid adding a space when the word is exactly 10 characters long.)
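
To make the effect of the \B concrete, here is a small comparison, written as a sketch. Both function names are made up for illustration, and the second pattern is a hypothetical variant with the \B simply removed, not anyone's posted solution:

```python
import re


def trunc_with_boundary(s, num):
    # \B only matches where another word character follows the run
    return re.sub(r"(\w{%d}\B)" % num, r"\1 ", s)


def trunc_without_boundary(s, num):
    # Hypothetical variant without \B, shown for comparison only
    return re.sub(r"(\w{%d})" % num, r"\1 ", s)


text = "abcdefghij end"  # the first word is exactly 10 characters long
print(repr(trunc_with_boundary(text, 10)))     # 'abcdefghij end'
print(repr(trunc_without_boundary(text, 10)))  # 'abcdefghij  end'
```

Without the \B, a word of exactly num characters picks up an extra space: doubled in the middle of the string, trailing at the end of it.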

KennyTM
Simpler: return re.sub('([a-zA-Z]{%d})' % num,'\\1 ',s)
Brian
+2  A: 
def split_word(word, length=10):
    return (word[n:n+length] for n in range(0, len(word), length))

string = "This string has some verylongwordsneededtosplit"

print([item for word in string.split() for item in split_word(word)])
# ['This', 'string', 'has', 'some', 'verylongwo', 'rdsneededt', 'osplit']

Note: it's a bad idea to name your string str. It shadows the built-in type.
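
A minimal sketch of what the shadowing looks like in practice (the function name shadowing_demo is made up for illustration). Once a variable called str exists in a scope, the built-in type can no longer be called under that name there:

```python
def shadowing_demo():
    str = "some text"  # shadows the built-in type inside this function
    try:
        # str now refers to the string above, so calling it fails
        return str(42)
    except TypeError as exc:
        return "TypeError: %s" % exc


print(shadowing_demo())
```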

Matt Anderson
Very clean solution, I like it a lot.
Adrien Plisson