ansaurus

Question

Answer 1

+2 A:

From the documentation:

StringTokenizer is a legacy class that is retained for compatibility reasons although its use is discouraged in new code. It is recommended that anyone seeking this functionality use the split method of String or the java.util.regex package instead.

The following example illustrates how the String.split method can be used to break up a string into its basic tokens:

     String[] result = "this is a test".split("\\s");
     for (int x=0; x<result.length; x++)
         System.out.println(result[x]);

prints the following output:

     this
     is
     a
     test
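One caveat about the pattern in the documentation example: `"\\s"` matches a single whitespace character, so runs of spaces or tabs (common in source lines) produce empty tokens. The sketch below compares it with `"\\s+"`. The class name `SplitDemo` and the helper `splitOn` are made up for illustration and are not part of the quoted documentation.

```java
import java.util.Arrays;

class SplitDemo {
    // Hypothetical helper: thin wrapper around String.split so the two patterns can be compared.
    static String[] splitOn(String text, String regex) {
        return text.split(regex);
    }

    public static void main(String[] args) {
        String line = "this  is a test"; // note the two spaces after "this"

        // "\\s" splits at every single whitespace character,
        // so the double space yields an empty token between "this" and "is".
        System.out.println(Arrays.toString(splitOn(line, "\\s")));

        // "\\s+" treats a run of whitespace as one delimiter.
        System.out.println(Arrays.toString(splitOn(line, "\\s+")));
    }
}
```

If a line can start with whitespace, calling `trim()` before splitting avoids a leading empty token.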

Zak 2010-10-05 19:23:36

+1 for documentation

eiefai 2010-10-05 19:30:40

Right, I came across this as well. I should have noted it more clearly, but this is what I was referring to when I said "essentially deprecated".

Ryan 2010-10-05 19:41:42

Answer 2

+3 A:

I believe that the java.util.Scanner class has replaced StringTokenizer. Scanner lets you handle tokens one at a time, whereas String.split() splits the entire string at once and returns every token in a single array (which could be large, if you're parsing a source code file). Using Scanner, you can examine each token, decide what action to take, then discard that token.
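A minimal sketch of that token-at-a-time style, assuming whitespace-separated tokens (Scanner's default delimiter). The class name `ScannerTokens` and the helper `tokenize` are hypothetical, and the list is only there so the result can be inspected; a real parser would act on each token inside the loop instead of storing it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

class ScannerTokens {
    // Hypothetical helper: walks the input one token at a time.
    static List<String> tokenize(String source) {
        List<String> tokens = new ArrayList<>();
        // Scanner uses whitespace as its default delimiter.
        try (Scanner scanner = new Scanner(source)) {
            while (scanner.hasNext()) {
                String token = scanner.next();
                // Examine the token and decide what to do with it here.
                tokens.add(token);
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        for (String token : tokenize("LOAD  R1, 42")) {
            System.out.println(token);
        }
    }
}
```

Scanner can also wrap a File or an InputStream, in which case the input is read incrementally rather than held in one string.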

Outlaw Programmer 2010-10-05 19:29:12

Generally, you shouldn't be parsing an entire source file at once, but a single source line at a time. It's easier on memory, and it makes it easier to keep track of line numbers for issuing error messages.
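A rough sketch of the line-at-a-time approach, assuming whitespace-separated tokens. The class `LineTokenizer` and the method `tokenizeLines` are hypothetical names; a StringReader stands in for the source file so the example is self-contained, and a FileReader could be swapped in for real input.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

class LineTokenizer {
    // Hypothetical helper: tags every token with the number of the line it came from.
    static List<String> tokenizeLines(String source) {
        List<String> tagged = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(new StringReader(source))) {
            String line;
            int lineNumber = 0;
            // Only the current line is held as a string at any time.
            while ((line = reader.readLine()) != null) {
                lineNumber++;
                String trimmed = line.trim();
                if (trimmed.isEmpty()) {
                    continue; // blank lines still count toward line numbers
                }
                for (String token : trimmed.split("\\s+")) {
                    // The line number is available here for error messages.
                    tagged.add(lineNumber + ": " + token);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return tagged;
    }

    public static void main(String[] args) {
        tokenizeLines("LOAD R1\n\nADD R1 R2").forEach(System.out::println);
    }
}
```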

Loadmaster 2010-10-05 19:33:44

Answer 3

A:

Something is deprecated when there is a better alternative, or those methods are dangerous in some situations. So the answer is - Yep, you can use it, but there is a better way to achieve what you need.

Btw, what is complicated about split?

Klark 2010-10-05 19:29:17

Answer 4

+2 A:

If what you're building is an assembler, I would use JavaCC for building the parser/compiler.

Kdeveloper 2010-10-05 19:50:20

This would have been an extremely helpful tool, but we were explicitly forbidden to use tools like this. Thank you, though - this is pretty cool!

Ryan 2010-10-05 19:58:02

Answer 5

+1 A:

Don't fear the regex. Get yourself a regex editor such as the following Eclipse plugin, http://brosinski.com/regex/update, and you'll be able to test the expressions without compiling, or even before writing your program.

If you need more reference material, here are some very useful sites:

That said, I think the suggestion above of using JavaCC sounds like the right approach.
Another option would be ANTLR.

Here's a post comparing the experience of ANTLR vs JavaCC.

crowne 2010-10-05 21:11:01

I second this. It won't take you more than 30 minutes to learn enough about regex to use String.split or Scanner effectively. For a programmer, learning to write basic regexes is easy and takes very little time. Becoming a master will take you the rest of your career.

Mike Deck 2010-10-05 21:34:14

Although I do still chuckle at the 1997 quote from Jamie Zawinski, one of the founders of Netscape and Mozilla.org: "Some people, when confronted with a problem, think 'I know, I'll use regular expressions.' Now they have two problems."

crowne 2010-10-05 22:07:06

@crowne That quote is a favorite of mine as well.

gregcase 2010-10-06 00:00:28

Tokenizing source code in Java
