Is there a regular expression that matches either of the following (but not both)?

  • a continuous string of letters and/or digits
  • a string of any characters enclosed in a pair of quotation marks (either " or '), possibly including nested quotations

Examples:

  • dgsdggsgggdggsggsd
  • 'dsfsasf .asgafaasfafw rq'
  • "sadas fa fasfa "
A: 

Maybe you can try this:

>>> import re
>>> message = "blabla df qdsf dqsf \"fqdfdqsfsdf  fdqs fqdsf\""
>>> pattern = r"(\w+|'.*[^']'|\".*[^\"]\")"
>>> re.findall(pattern, message)
['blabla', 'df', 'qdsf', 'dqsf', '"fqdfdqsfsdf  fdqs fqdsf"']
Antoine Pelisse
It doesn't actually behave as you said: the last quotation mark is omitted from the result. Also, I would like not to get the quotation marks in the result, e.g. "safasfaf" becomes safasfaf.
nikita.utiu
Never mind, I got it working. I changed the pattern to (\w+|'.*[^']'|\".*[^\"]\")
nikita.utiu
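A possible variation, sketched here for the request in the comment above to get the text without the quotation marks. It puts a capturing group inside each pair of quotes and uses [^']* and [^"]* instead of the greedy .*, so two quoted strings on the same line are not merged into one match. The names TOKEN and tokenize are hypothetical, chosen only for this illustration. It handles the other quote type nested inside a quoted string, but not same-type nesting, which a plain regular expression cannot track to arbitrary depth.

```python
import re

# Hypothetical pattern for illustration: quoted alternatives come first so
# that a quote character is never skipped over in favour of a bare word.
TOKEN = re.compile(r"""'([^']*)'|"([^"]*)"|(\w+)""")

def tokenize(text):
    """Return bare words and quoted strings, with the outer quotes removed."""
    tokens = []
    for match in TOKEN.finditer(text):
        # Exactly one of the three groups takes part in each match.
        single, double, word = match.groups()
        if single is not None:
            tokens.append(single)
        elif double is not None:
            tokens.append(double)
        else:
            tokens.append(word)
    return tokens

print(tokenize('blabla df "fqdf  fdqs" \'a "b" c\''))
# ['blabla', 'df', 'fqdf  fdqs', 'a "b" c']
```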
+1  A: 

Perhaps relevant: do you know about the shlex module?

ΤΖΩΤΖΙΟΥ
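For reference, a minimal sketch of what the shlex suggestion could look like on the examples from the question. shlex.split is part of the standard library and, in its default POSIX mode, removes the surrounding quotes. Note that it splits on whitespace using shell-like rules rather than on \w boundaries, so it may not fit if bare words must be limited to letters and digits. The variable names are illustrative only.

```python
import shlex

# The three example strings from the question, joined into one line.
line = "dgsdggsgggdggsggsd 'dsfsasf .asgafaasfafw rq' \"sadas fa fasfa \""

# shlex.split applies shell-like quoting rules and strips the outer quotes.
tokens = shlex.split(line)
print(tokens)
# ['dgsdggsgggdggsggsd', 'dsfsasf .asgafaasfafw rq', 'sadas fa fasfa ']

# A quote of the other type inside a quoted string is kept as ordinary text.
nested = shlex.split("\"it's here\" word")
print(nested)
# ["it's here", 'word']
```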