Hi all,
I've got a (proprietary) output from a software that I need to parse. Sadly, there are unescaped user names and I'm scratching my hairs trying to know if I can, or not, describe the files I need to parse using a BNF (or EBNF or ABNF).
The problem, oversimplified (it's really just an example), may look like this:
(data) ::= <username>
<username> ::= (other type of data)
And in some case, instead of appearing at the left or at the right, the username can also appear in the middle of a line.
The problem is that the username is unescaped and there are not enough restrictions on user names (they're printable ASCII, max 20 chars and they can't contain line break). So "=" would be a perfectly valid username, for example. And so would "= 1 = john = 2" (because user, at sign-on, where allowed to choose any user name they wanted and these appear unescaped in the output I've got).
I'm asking because my parser chocked on some very creative usernames (once again, not in my control, they're "weird" and I need to deal with it) and I cannot find an easy way to deal with this. Also note that I do not know in advance the user names (for example I don't have access to a database that would contain all the user names that the users created).
So are unrestricted and unescaped user names incompatibles with BNF?
P.S: be cool with me if I made mistakes, it's my first post on stackoverflow :)