I'm trying to create a discriminated union for part of speech tags and other labels returned by a natural language parser.
It's common to use either strings or enums for these in C#/Java, but discriminated unions seem more appropriate in F# because these are distinct, read-only values.
In the language reference, I found that this symbol
``...``
can be used to delimit keywords/reserved words. This works for
type ArgumentType =
| A0 // subject
| A1 // indirect object
| A2 // direct object
| A3 //
| A4 //
| A5 //
| AA //
| ``AM-ADV``
However, the tags contain symbols like $, e.g.
type PosTag =
| CC // Coordinating conjunction
| CD // Cardinal Number
| DT // Determiner
| EX // Existential there
| FW // Foreign Word
| IN // Preposision or subordinating conjunction
| JJ // Adjective
| JJR // Adjective, comparative
| JJS // Adjective, superlative
| LS // List Item Marker
| MD // Modal
| NN // Noun, singular or mass
| NNP // Proper Noun, singular
| NNPS // Proper Noun, plural
| NNS // Noun, plural
| PDT // Predeterminer
| POS // Possessive Ending
| PRP // Personal Pronoun
| PRP$ //$ Possessive Pronoun
| RB // Adverb
| RBR // Adverb, comparative
| RBS // Adverb, superlative
| RP // Particle
| SYM // Symbol
| TO // to
| UH // Interjection
| VB // Verb, base form
| VBD // Verb, past tense
| VBG // Verb, gerund or persent participle
| VBN // Verb, past participle
| VBP // Verb, non-3rd person singular present
| VBZ // Verb, 3rd person singular present
| WDT // Wh-determiner
| WP // Wh-pronoun
| WP$ //$ Possessive wh-pronoun
| WRB // Wh-adverb
| ``#``
| ``$``
| ``''``
| ``(``
| ``)``
| ``,``
| ``.``
| ``:``
| `` //not sure how to escape/delimit this
``...``
isn't working for WP$ or symbols like (
Also, I have the interesting problem that the parser returns `` as a meaningful symbol, so I need to escape it as well.
Is there some other way to do this, or is this just not possible with a discriminated union?
Right now I'm getting errors like
- Invalid namespace, module, type or union case name
- Discriminated union cases and exception labels must be uppercase identifiers
I suppose I could somehow override toString for these goofy cases and replace the symbols with some alphanumeric equivalent?