EDIT: Two additional token types added.
Hi,
I am trying to parse a line in an mmCIF protein file into separate tokens using Excel 2000/2003. In the worst case, a line COULD look something like this:
token1 token2 "token's 1a',1b'" 'token4"5"' 12 23.2 ? . 'token' tok'en to"ken
Which should become the following tokens:
token1
token2
token's 1a',1b' (note: the double quotes have disappeared)
token4"5" (note: the single quotes have disappeared)
12
23.2
?
.
token (note: the single quotes have disappeared)
tok'en
to"ken
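The rules implied by the example can be pinned down with a small hand-written scanner. This is only a sketch in Python, not Excel VBA and not a RegEx. It assumes that a quote opens a token only when it is the first character of the token, and closes it only when followed by whitespace or the end of the line. The function name split_cif_line is made up for illustration.

```python
def split_cif_line(line):
    """Split one line into tokens (hypothetical helper, not a library function)."""
    tokens = []
    i, n = 0, len(line)
    while i < n:
        # Skip whitespace between tokens.
        if line[i].isspace():
            i += 1
            continue
        if line[i] in "'\"":
            quote = line[i]
            j = i + 1
            # Assumption: the closing quote is the same quote character
            # followed by whitespace or the end of the line.
            while j < n and not (
                line[j] == quote and (j + 1 == n or line[j + 1].isspace())
            ):
                j += 1
            if j < n:
                tokens.append(line[i + 1:j])  # drop the enclosing quotes
                i = j + 1
                continue
            # No closing quote found: fall through and treat as a bare token.
        # Bare token: runs to the next whitespace; embedded quotes are kept.
        j = i
        while j < n and not line[j].isspace():
            j += 1
        tokens.append(line[i:j])
        i = j
    return tokens
```

Called on the worst-case line above, this returns eleven tokens, with the enclosing quotes stripped and the embedded ones kept.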
Is it even possible to split this kind of line into tokens with a RegEx?
Thanks!
Paul