Here's a simple algorithm:
- Determine if the string begins with a '"' character.
- Split the string into an array delimited by the '"' character.
- Mark the quoted commas with a placeholder #COMMA#:
  - If the input starts with a '"', mark those items in the array where the index % 2 == 0.
  - Otherwise mark those items in the array where the index % 2 == 1.
- Concatenate the items in the array to form a modified input string.
- Split the string into an array delimited by the ',' character.
- Replace all instances in the array of #COMMA# placeholders with the ',' character.
- The array is your output.
Here's the Python implementation:
(fixed to handle '"a,b",c,"d,e,f,h","i,j,k"')
    def parse_input(text):
        # Quoted pieces land on even indices when the input starts with a
        # quote (once the leading empty piece is dropped), otherwise on odd ones.
        quote_mod = int(not text.startswith('"'))
        pieces = text.split('"')
        # A leading quote produces an empty first piece; drop it.
        if pieces[0] == '':
            pieces = pieces[1:]
        # Hide the commas inside quoted pieces behind a placeholder.
        for i in range(len(pieces)):
            if i % 2 == quote_mod:
                pieces[i] = pieces[i].replace(",", "#COMMA#")
        # Split on the remaining (unquoted) commas, discarding empty items.
        items = [item for item in "".join(pieces).split(",") if item != '']
        # Restore the quoted commas.
        return [item.replace("#COMMA#", ",") for item in items]

    # parse_input('a,"string, with",various,"values, and some",quoted')
    # -> ['a', 'string, with', 'various', 'values, and some', 'quoted']
    # parse_input('"a,b",c,"d,e,f,h","i,j,k"')
    # -> ['a,b', 'c', 'd,e,f,h', 'i,j,k']
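
As a cross-check (a different technique, not the placeholder approach described above): Python's standard csv module already understands double-quoted fields, so it can be used to confirm the expected results. This is a minimal sketch, assuming the input is a single line that uses '"' as the quote character; the helper name parse_with_csv is made up for this example.

```python
import csv


def parse_with_csv(text):
    # Hypothetical helper, named for illustration only.
    # csv.reader accepts any iterable of lines, so wrap the single
    # input string in a list and take the first (only) parsed row.
    return next(csv.reader([text]))


print(parse_with_csv('a,"string, with",various,"values, and some",quoted'))
# -> ['a', 'string, with', 'various', 'values, and some', 'quoted']
print(parse_with_csv('"a,b",c,"d,e,f,h","i,j,k"'))
# -> ['a,b', 'c', 'd,e,f,h', 'i,j,k']
```

One difference to keep in mind: csv keeps empty fields, so 'a,,b' yields ['a', '', 'b'], whereas the function above discards empty items.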