Consider the following as tokens:

  1. +, -, ), (
  2. alpha characters and underscore
  3. integer

Implement:

  1. getToken() - returns a string corresponding to the next token
  2. getTokPos() - returns the position of the current token in the input string

Example input: (a+b)-21)
Output: (| a| +| b| )| -| 21| )|

Note: The Java StringTokenizer class cannot be used.

Work in progress - Successfully tokenized +,-,),(. Need to figure out characters and numbers:

OUTPUT: +|-|+|-|(|(|)|)|)|(| |

A: 

If it's not a homework assignment, use String.split(). If it is a homework assignment, say so and tag it so that we can give the appropriate level of help (I did so for you, just in case...).

Bill K
It is a homework assignment. It says: "You may not use jlex or the java string tokenizer class." Therefore, I assume we can use the split function. Thank you very much for the tips.
Tom
+3  A: 

java.util.StringTokenizer is a legacy class; it is not formally deprecated, but its use is discouraged in new code.

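Tokenizing Strings in Java has been much easier with String.split() since Java 1.4 (note that the hyphen goes first in the character class so that it is read literally and not as a range):

String[] tokens = "(a+b)-21)".split("[-+()]");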

If it is homework, you probably have to reimplement a "split" method yourself:

  • read the String character by character
  • if the character is not a special char, add it to a buffer
  • when you encounter a special char, add the buffer content to a list and clear the buffer

Since it is (probably) homework, I will let you implement it.

Benoit Courtine
I think I understand your implementation. However, this will give a, b, 21 instead of (, a, +, b, ), -, 21, ). I still can't figure it out... Thank you for the replies.
Tom
Since the characters used as the split pattern are the delimiters of the split, they will not be returned in the results array, so String.split unfortunately cannot return everything: if you split on any of those characters, you lose them. If you split on "", then 21 becomes two tokens, which is still not right (you end up with the same problem as before, since you have just reduced the string to an array of characters).
Zoe Gagnon
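
To make the two failure modes above concrete, here is a small sketch. The class and method names (SplitDemo, splitOnOperators, splitOnEmpty) are made up for illustration; only String.split itself is a real API.

```java
import java.util.Arrays;

// Hypothetical demo class; the names are illustrative only.
class SplitDemo {
    // Splits on the four operator characters; the operators themselves are dropped.
    static String[] splitOnOperators(String input) {
        return input.split("[-+()]");
    }

    // Splits between every character, so multi-digit integers fall apart.
    static String[] splitOnEmpty(String input) {
        return input.split("");
    }

    public static void main(String[] args) {
        String input = "(a+b)-21)";
        // The operators are missing from this array.
        System.out.println(Arrays.toString(splitOnOperators(input)));
        // "21" appears here as two separate elements, "2" and "1".
        System.out.println(Arrays.toString(splitOnEmpty(input)));
    }
}
```

Neither result contains the full token list the assignment asks for, which is why a hand-written scan is needed.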
If you also need the delimiters, you just have to adapt my algorithm slightly: "if the character is a special char, add the buffer content to a list, add the 'special char' as the next element of the list, and only then clear the buffer".
Benoit Courtine
A: 

Because the string needs to be cut in several different ways, not just on whitespace or parens, using the String.split method with any of the symbols there will not work: split removes the character used as a separator. You could try to split on the empty string, but this wouldn't keep compound symbols, like 21, together. To correctly parse this string, you will need to effectively implement your own tokenizer. Try thinking about how you could tell you had a complete token if you looked at the string one character at a time. You could start a string that collects characters until you have identified a complete token, then remove those characters from the original and return the string. Starting from this point, you can probably make a basic tokenizer.
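
One possible shape for such a character-by-character tokenizer is sketched below. It is only an illustration, not the expected homework solution: the class name SimpleTokenizer, returning null at end of input, skipping whitespace, and throwing on unknown characters are all assumptions the assignment does not specify. Instead of removing characters from the original string, it keeps an index into it, which also makes getTokPos() easy.

```java
// Hypothetical hand-written tokenizer; names and error handling are assumptions.
class SimpleTokenizer {
    private final String input;
    private int pos = 0;      // index of the next unread character
    private int tokPos = -1;  // start index of the most recently returned token

    SimpleTokenizer(String input) {
        this.input = input;
    }

    /** Returns the next token, or null when the input is exhausted. */
    String getToken() {
        // Skip whitespace between tokens (an assumption; the assignment does not say).
        while (pos < input.length() && Character.isWhitespace(input.charAt(pos))) {
            pos++;
        }
        if (pos >= input.length()) {
            return null;
        }
        tokPos = pos;
        char c = input.charAt(pos);
        if (Character.isDigit(c)) {
            // Integer: consume the whole run of digits.
            while (pos < input.length() && Character.isDigit(input.charAt(pos))) {
                pos++;
            }
        } else if (Character.isLetter(c) || c == '_') {
            // Alpha token: consume the whole run of letters and underscores.
            while (pos < input.length()
                    && (Character.isLetter(input.charAt(pos)) || input.charAt(pos) == '_')) {
                pos++;
            }
        } else if (c == '+' || c == '-' || c == '(' || c == ')') {
            pos++; // single-character token
        } else {
            throw new IllegalArgumentException("Unexpected character '" + c + "' at " + pos);
        }
        return input.substring(tokPos, pos);
    }

    /** Returns the start position of the token last returned by getToken(). */
    int getTokPos() {
        return tokPos;
    }

    public static void main(String[] args) {
        SimpleTokenizer t = new SimpleTokenizer("(a+b)-21)");
        StringBuilder out = new StringBuilder();
        for (String tok = t.getToken(); tok != null; tok = t.getToken()) {
            out.append(tok).append("| ");
        }
        System.out.println(out.toString().trim()); // (| a| +| b| )| -| 21| )|
    }
}
```

The key idea is that the first character of a token decides its class, and the inner loops then consume characters for as long as they still belong to that class.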

If you'd rather learn how to make a full-strength tokenizer, most of them are defined by writing a regular expression that matches only the valid tokens.
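
As a rough sketch of that regular-expression approach, using java.util.regex: the class name RegexTokenizer and the exact pattern are assumptions chosen to fit the three token classes in the question, and the assignment may or may not permit regular expressions.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical regex-driven tokenizer; the pattern is an assumption based on the question.
class RegexTokenizer {
    // One alternative per token class: operators/parens, letters and underscores, integers.
    private static final Pattern TOKEN = Pattern.compile("[-+()]|[A-Za-z_]+|[0-9]+");

    private final Matcher matcher;
    private int tokPos = -1; // start index of the most recently returned token

    RegexTokenizer(String input) {
        this.matcher = TOKEN.matcher(input);
    }

    /** Returns the next token, or null when no further token matches. */
    String getToken() {
        // find() silently skips characters that match no alternative, e.g. spaces.
        if (!matcher.find()) {
            return null;
        }
        tokPos = matcher.start();
        return matcher.group();
    }

    /** Returns the start position of the token last returned by getToken(). */
    int getTokPos() {
        return tokPos;
    }
}
```

Unlike split, Matcher.find() returns the text that matched, so the operators are kept and a run of digits such as 21 stays together as one token.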

Zoe Gagnon
+1  A: 

Java lets you examine the characters in a String one by one with the charAt method. So use that in a for loop and examine each character. When you encounter a token, wrap it in pipes; append any other character to the output unchanged.

// one constant per single-character token
public static final char PLUS_TOKEN = '+';
public static final char MINUS_TOKEN = '-';
public static final char OPEN_PAREN_TOKEN = '(';
public static final char CLOSE_PAREN_TOKEN = ')';

public String doStuff(String input)
{
    StringBuilder output = new StringBuilder();
    for (int index = 0; index < input.length(); index++)
    {
        char current = input.charAt(index);
        // compare the current character with all tokens
        if (current == PLUS_TOKEN || current == MINUS_TOKEN
                || current == OPEN_PAREN_TOKEN || current == CLOSE_PAREN_TOKEN)
        {
            // when you see a token you need to append the pipes (|) around it
            output.append('|');
            output.append(current);
            output.append('|');
        }
        else
        {
            // just add to new output
            output.append(current);
        }
    }
    return output.toString();
}
willcodejavaforfood
You're amazing. You included the pieces I need for my assignment, and now I have all the tools to get it right. My prof expects us to learn Java on our own, so writing my first program is harder.
Tom
@Tom - All you needed was a nudge in the right direction, I hope it was not too big of a nudge. :)
willcodejavaforfood
@willcodejavaforfood - Is there a syntax for integer numeric constants and characters? I need to define them as tokens too. Thanks!!
Tom
@Tom - Of course there is, but you have to do this one on your own. Hint: look at the Character class http://download-llnw.oracle.com/javase/6/docs/api/java/lang/Character.html
willcodejavaforfood
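
For readers following the Character hint, a minimal sketch of how its static methods can sort a character into the question's three token classes. The helper name classify and the returned labels are invented for illustration; Character.isDigit and Character.isLetter are real methods, and isLetter also accepts non-ASCII letters, which may or may not be what the assignment intends.

```java
// Hypothetical helper; the method name and labels are illustrative only.
class CharClassDemo {
    /** Names the token class that a character would start. */
    static String classify(char c) {
        if (Character.isDigit(c)) {
            return "integer";
        }
        if (Character.isLetter(c) || c == '_') {
            return "alpha";
        }
        if (c == '+' || c == '-' || c == '(' || c == ')') {
            return "operator";
        }
        return "unknown";
    }

    public static void main(String[] args) {
        for (char c : "(a_1 ".toCharArray()) {
            System.out.println("'" + c + "' -> " + classify(c));
        }
    }
}
```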