ansaurus

Question

How to properly add quotes to a string using python?

Answer 1

A:

You have three cases:

String is less than two characters long: add quotes
String has quotes at s[0] and at s[-1]: don't add quotes
Add quotes

And by "add quotes" I mean simply construct '"'+string+'"' and return it.

Translate to if-statements, and you're done.
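A minimal sketch of those three cases, assuming the second case means that the first and last characters are both quotes. The name `add_quotes_naive` is made up for illustration.

```python
def add_quotes_naive(s):
    """Hypothetical helper: apply the three cases literally."""
    Q = '"'
    # Case 1: too short to be a quoted string, so quote it.
    if len(s) < 2:
        return Q + s + Q
    # Case 2: already starts and ends with a quote, so leave it alone.
    if s[0] == Q and s[-1] == Q:
        return s
    # Case 3: everything else gets quoted.
    return Q + s + Q


print(add_quotes_naive(''))         # ""
print(add_quotes_naive('a'))        # "a"
print(add_quotes_naive('"a"'))      # "a"
print(add_quotes_naive('"a" "b"'))  # "a" "b"
```

Note that this leaves `"a" "b"` unchanged, whereas the test data in a later answer expects it to be wrapped as `""a" "b""`, so the approach is only a first approximation.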

Ian 2010-08-27 12:22:25

Also fails test #5

delnan 2010-08-27 12:27:43

Answer 2

+8 A:

Your problem is inconsistent.

Consider the two cases

""a" b"

"a" "b"

The former is interpreted as a pre-quoted string with 'nested quotes', but the latter is interpreted as separately-quoted strings. Here are some examples that highlight the issue.

" "a" "b" "

" "a" b"

"a ""b"

How should they be treated?
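One way to see the collision, sketched under the assumption that every run of non-quote characters is treated alike: collapse each run to a placeholder and compare the results. The helper name `shape` is invented for illustration.

```python
import re


def shape(s):
    """Hypothetical helper: replace each run of non-quote characters with 'x'."""
    return re.sub(r'[^"]+', 'x', s)


# Separately quoted strings and a pre-quoted string with a nested quote
# end up with the same shape, so quote positions alone cannot tell them apart.
print(shape('"a" "b"'))   # "x"x"x"
print(shape('" "a" b"'))  # "x"x"x"
```

Any rule that distinguishes these two inputs has to look at what lies between the quotes, for example whether it is only whitespace.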

katrielalex 2010-08-27 13:02:58

All the time, I felt like something was wrong with this question... now I know. +1 for bringing it up.

delnan 2010-08-27 13:07:16

I was just typing the same thing. Instead I'll upvote your answer. Another input example: `" a " foo " b "`. Minor clarification: not sure that the question is *inconsistent* so much as *under-specified*.

FM 2010-08-27 13:11:46

@bogdan: if you remove the spaces, that is, consider only the quotes, the two are equivalent.

delnan 2010-08-27 13:12:12

@FM: Well, it's inconsistent under the assumption that all non-quote characters are treated equally. Adding more specification changes the problem.

katrielalex 2010-08-27 13:16:07

@bogdan: unfortunately that doesn't help: the commands `"a "foo" b"` and `"a" foo "b"` seem indistinguishable but require different rules. Maybe try one and catch the resulting error?

katrielalex 2010-08-27 13:35:55

I made a mistake by trying to simplify the problem too much. I have now updated the question with new examples. It looks like you cannot ignore the spaces if you want to properly detect whether the string is already quoted.

bogdan 2010-08-27 13:39:53

Answer 3

+2 A:

I think this is a difficult question to specify in a precise way, but perhaps this strategy will approximate your goal.

The basic idea is to create a copy of the original string with the internally quoted items removed. An internally quoted item is defined here as a quoted span that contains at least one non-whitespace character.

After the internally quoted items have been removed, you then check whether the entire string needs surrounding quotes or not.

import re

tests = [
    # Test data in original question.
    ( '',                '""'                ),
    ( 'a',               '"a"'               ),
    ( '"a"',             '"a"'               ), # No change.
    ( '""a" b"',         '""a" b"'           ), # No change.
    ( '"a" b',           '""a" b"'           ),
    ( '"a" "b"',         '""a" "b""'         ),
    ( 'a "b" c',         '"a "b" c"'         ),

    # Test data in latest edits.
    ( 'type',            '"type"'         ),    # Quote these.
    ( '"type" /?',       '""type" /?"'    ),
    ( '"type" "/?"',     '""type" "/?""'  ),
    ( 'type "a a" b',    '"type "a a" b"' ),
    ( 'type "" b',       '"type "" b"'    ),
    ( '"type"',          '"type"'         ),    # Don't quote.
    ( '""type" /?"',     '""type" /?"'    ),

    # Some more tests.
    ( '"a b" "c d"',     '""a b" "c d""'     ),
    ( '" a " foo " b "', '"" a " foo " b ""' ),
]

Q = '"'
re_quoted_items = re.compile(r'" \s* [^"\s] [^"]* \"', re.VERBOSE)

for orig, expected in tests:
    # The orig string w/o the internally quoted items.
    woqi = re_quoted_items.sub('', orig)

    if len(orig) == 0:
        orig_quoted = Q + orig + Q
    elif len(woqi) > 0 and not (woqi[0] == Q and woqi[-1] == Q):
        orig_quoted = Q + orig + Q    
    else:
        orig_quoted = orig

    print(orig_quoted == expected)  # Parenthesized form works in Python 2 and 3.
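To make the removal step easier to follow, here is a small sketch that applies the same pattern to a few of the test inputs and shows what is left over. The helper name `strip_quoted_items` is an assumption for illustration and is not part of the code above.

```python
import re

# Same pattern as above: a quote, optional whitespace, one character that is
# neither a quote nor whitespace, any further non-quote characters, a quote.
RE_QUOTED_ITEM = re.compile(r'" \s* [^"\s] [^"]* "', re.VERBOSE)


def strip_quoted_items(s):
    """Hypothetical helper: drop the internally quoted items from s."""
    return RE_QUOTED_ITEM.sub('', s)


print(repr(strip_quoted_items('type "a a" b')))  # 'type  b'
print(repr(strip_quoted_items('type "" b')))     # 'type "" b'
print(repr(strip_quoted_items('""type" /?"')))   # '" /?"'
```

Only the last remainder both starts and ends with a quote, which is why that input is treated as already quoted while the first two get wrapped. The empty pair `""` is not removed because it contains no non-whitespace character.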

FM 2010-08-27 13:57:22

Answer 4

+1 A:

I wrote a simple state machine that tracks whether we are inside a word. If a non-quote character is ever seen while the quote depth is zero, then we need quotes:

def quotify(s):
    if s == "":
        return '""'

    depth = 0
    in_word = False
    needs_quotes = False
    for c in s:
        if c == '"':
            if in_word:
                depth -= 1
            else:
                depth += 1
        else:
            if depth == 0:
                needs_quotes = True
                break
            in_word = not c.isspace()

    if needs_quotes:
        return '"' + s + '"'
    else:
        return s

assert quotify('') == '""'
assert quotify('''type''') == '''"type"'''
assert quotify('''"type" /?''') == '''""type" /?"'''
assert quotify('''"type" "/?"''') == '''""type" "/?""'''
assert quotify('''type "a a" b''') == '''"type "a a" b"'''
assert quotify('''type "" b''') == '''"type "" b"'''
assert quotify('''"type"''') == '''"type"'''
assert quotify('''""type" /?"''') == '''""type" /?"'''

Ned Batchelder 2010-08-27 14:43:41

I like your approach because it does not require `re`, but it does fail on the last test made by FM - his `re` solution does pass all tests.

bogdan 2010-08-27 14:54:28
