I would like to split a string into a list of all its words, keeping hyphenated words intact. My current code is:
import re
s = '-this is. A - sentence;one-word'
re.compile(r"\W+", re.UNICODE).split(s)
returns:
['', 'this', 'is', 'A', 'sentence', 'one', 'word']
and I would like it to return:
['', 'this', 'is', 'A', 'sentence', 'one-word']
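One possible direction, as a minimal sketch rather than a definitive solution: treat a hyphen as a separator only when it does not sit between two word characters. The helper name `split_keep_hyphens` and the separator pattern are assumptions made up for illustration. They rely only on `re.compile`, `split`, and the lookahead/lookbehind assertions of the standard `re` module.

```python
import re

# Hypothetical separator pattern (an assumption, not from the original code).
# A run of separators is built from any of:
#   [^\w-]    a character that is neither a word character nor a hyphen
#   (?<!\w)-  a hyphen NOT preceded by a word character
#   -(?!\w)   a hyphen NOT followed by a word character
# A hyphen with word characters on both sides matches none of these,
# so it stays inside the word.
SEPARATOR = re.compile(r"(?:[^\w-]|(?<!\w)-|-(?!\w))+", re.UNICODE)


def split_keep_hyphens(text):
    """Split text on non-word characters, keeping hyphenated words whole."""
    return SEPARATOR.split(text)


s = '-this is. A - sentence;one-word'
print(split_keep_hyphens(s))
# ['', 'this', 'is', 'A', 'sentence', 'one-word']
```

The leading empty string is kept because the string starts with a separator, which matches the desired output above. If the empty string is not needed, `re.findall(r"\w+(?:-\w+)*", s)` would collect the words directly instead of splitting around them.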