For a systems software development course, I'm writing a complete assembler for an instructor-invented assembly language, and I'm currently working on the tokenizer. While searching, I came across the Java StringTokenizer class, but I see that it has been essentially deprecated. It seems far easier to use, however, than the String.split method with regular expressions.
Is there some reason that I should avoid using it? Is there perhaps something else within the typical Java libraries that would suit this task well that I am not aware of?
EDIT: Giving more detail.
The reason I consider String.split complicated is that my knowledge of regular expressions amounts to knowing that they exist. While learning them would help my general knowledge as a software developer, I'm not sure that I want to invest the time right now, especially if an easier alternative is available.
In terms of how I will use the tokenizer: it will go through a text file containing assembly code and break it into tokens, passing each token's text and type to a parser. Delimiters include whitespace (spaces, tabs, newlines), the comment-start character '|' (which can occur on its own line or after other text), and the comma that separates operands in an instruction.
I would write that more mathematically, but my knowledge of formal languages is a bit rusty.
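For concreteness, here is a minimal sketch of the delimiter rules described above, written with StringTokenizer. The names AsmTokenizerSketch, Token, TokenType and tokenizeLine are hypothetical and only for illustration. The sketch also assumes that a comment runs from '|' to the end of the line, that the input is processed one line at a time, and that the parser wants commas reported as their own tokens; none of that is fixed by the course's language as far as I have described it here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

public class AsmTokenizerSketch {
    // Hypothetical token categories; a real assembler would likely need more.
    enum TokenType { WORD, COMMA }

    // Hypothetical text/type pair that would be handed to the parser.
    static final class Token {
        final String text;
        final TokenType type;

        Token(String text, TokenType type) {
            this.text = text;
            this.type = type;
        }
    }

    static List<Token> tokenizeLine(String line) {
        // Assumption: '|' starts a comment that runs to the end of the line.
        int comment = line.indexOf('|');
        String code = (comment >= 0) ? line.substring(0, comment) : line;

        List<Token> tokens = new ArrayList<>();
        // returnDelims = true makes each delimiter come back as a
        // one-character token, so commas are not silently dropped.
        StringTokenizer st = new StringTokenizer(code, " \t\r\n,", true);
        while (st.hasMoreTokens()) {
            String t = st.nextToken();
            if (t.equals(",")) {
                tokens.add(new Token(t, TokenType.COMMA));
            } else if (t.length() == 1 && Character.isWhitespace(t.charAt(0))) {
                // Whitespace delimiter: skip it.
                continue;
            } else {
                tokens.add(new Token(t, TokenType.WORD));
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        for (Token tok : tokenizeLine("  ADD R1, R2   | add two registers")) {
            System.out.println(tok.type + " " + tok.text);
        }
    }
}
```

With the sample line in main, this yields four tokens: ADD, R1, the comma, and R2. The comment text is discarded before tokenizing.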
EDIT 2: Asking the question more clearly
I have seen the documentation on the StringTokenizer class. It would have suited my purposes well, but its use is discouraged. Other than String.split, is there something within the standard Java libraries that would be helpful?