I have the following string which will probably contain ~100 entries:

String foo = "{k1=v1,k2=v2,...}";

and am looking to write the following function:

String getValue(String key){
    // return the value associated with this key
}

I would like to do this without using any parsing library. Any ideas for something speedy?

+3  A: 

If you know your string will always look like this, try something like:

import java.util.HashMap;
import java.util.Map;
import java.util.StringTokenizer;

Map<String, String> map = new HashMap<String, String>();

public void parse(String foo) {
  String foo2 = foo.substring(1, foo.length() - 1);  // hack off braces
  StringTokenizer st = new StringTokenizer(foo2, ",");
  while (st.hasMoreTokens()) {
    String thisToken = st.nextToken();
    StringTokenizer st2 = new StringTokenizer(thisToken, "=");

    // first token is the key, second is the value
    map.put(st2.nextToken(), st2.nextToken());
  }
}

String getValue(String key) {
  return map.get(key);  // null if the key is absent
}

Warning: I didn't actually try this; there might be minor syntax errors but the logic should be sound. Note that I also did exactly zero error checking, so you might want to make what I did more robust.

Tenner
A shortcut would be to use `",={}"` as the delimiter set. No hacking off braces or a second tokenizer needed :)
rsp
@rsp: Good point!
Tenner
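The single-tokenizer shortcut from the comment above could look roughly like this. It is only a sketch: the class name `SingleTokenizerParser` is made up for illustration, and it assumes every key and value is non-empty and contains none of the characters `,`, `=`, `{`, `}` (an empty value would shift the key/value pairing).

```java
import java.util.HashMap;
import java.util.Map;
import java.util.StringTokenizer;

// Hypothetical helper class, not from the thread.
final class SingleTokenizerParser {

    // Parses a string such as "{k1=v1,k2=v2}" into a map.
    // Assumption: keys and values are non-empty and contain none of , = { }
    static Map<String, String> parse(String foo) {
        Map<String, String> map = new HashMap<String, String>();
        // All four characters act as delimiters, so the tokens simply
        // alternate: key, value, key, value, ...
        StringTokenizer st = new StringTokenizer(foo, ",={}");
        while (st.hasMoreTokens()) {
            String key = st.nextToken();
            if (!st.hasMoreTokens()) {
                break;  // dangling key without a value: ignore it
            }
            map.put(key, st.nextToken());
        }
        return map;
    }
}
```

Calling `SingleTokenizerParser.parse(foo)` once and keeping the resulting map means each later `getValue` is a plain `map.get(key)`. Note that whitespace around keys and values is not trimmed here.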
+2  A: 

The speediest, but ugliest answer I can think of is parsing it character by character using a state machine. It's very fast, but very specific and quite complex. The way I see it, you could have several states:

  • Parsing Key
  • Parsing Value
  • Ready

Example:

int length = foo.length();
int state = READY;
for (int i=0; i<length; ++i) {
   switch (state) {
      case READY:
        //Skip commas and brackets
        //Transition to the KEY state if you find a letter
        break;
      case KEY:
        //Read until you hit a = then transition to the value state
        //append each letter to a StringBuilder and track the name
        //Store the name when you transition to the value state
        break;
      case VALUE:
        //Read until you hit a , then transition to the ready state
        //Remember to save the built-key and built-value somewhere
        break;
   }
}

Alternatively, you can write this much more quickly using StringTokenizers (which run fast) or regexes (which run slower). But overall, individual character parsing is most likely the fastest at run time.
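One way the skeleton above might be filled in is sketched below. The class name `StateMachineParser`, the enum, and the decisions to trim whitespace and to start a key on any non-separator character (rather than strictly a letter) are assumptions made for illustration; values are assumed not to contain `,` or `}`.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical class name; a sketch of the three-state parser described above.
final class StateMachineParser {

    private enum State { READY, KEY, VALUE }

    static Map<String, String> parse(String foo) {
        Map<String, String> map = new HashMap<String, String>();
        StringBuilder key = new StringBuilder();
        StringBuilder value = new StringBuilder();
        State state = State.READY;

        for (int i = 0; i < foo.length(); i++) {
            char c = foo.charAt(i);
            switch (state) {
                case READY:
                    // Skip commas, braces and whitespace; anything else starts a key.
                    if (c != ',' && c != '{' && c != '}' && !Character.isWhitespace(c)) {
                        key.append(c);
                        state = State.KEY;
                    }
                    break;
                case KEY:
                    if (c == '=') {
                        state = State.VALUE;       // key complete
                    } else if (c == ',' || c == '}') {
                        key.setLength(0);          // key without '=': discard it
                        state = State.READY;
                    } else {
                        key.append(c);
                    }
                    break;
                case VALUE:
                    if (c == ',' || c == '}') {
                        // Entry complete: store it and reset the buffers.
                        map.put(key.toString().trim(), value.toString().trim());
                        key.setLength(0);
                        value.setLength(0);
                        state = State.READY;
                    } else {
                        value.append(c);
                    }
                    break;
            }
        }
        // Input ended while reading a value (no closing brace): keep what was read.
        if (state == State.VALUE) {
            map.put(key.toString().trim(), value.toString().trim());
        }
        return map;
    }
}
```

The string is scanned exactly once and no intermediate substrings are created until an entry is stored, which is where any speed advantage would come from; whether it actually beats the tokenizer version for ~100 entries would need measuring.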

Malaxeur
For raw speed, use the char array to avoid synchronization. Well, that's an old-timer reflex since modern JVMs coarsen the locks :-)
cadrian
Oh, good call. I actually completely forgot to drop in how to actually access the characters...
Malaxeur
A: 

Written without testing:

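String result = null;
String needle = key + "=";
int i = foo.indexOf(needle);
// Only accept a match that starts an entry, i.e. one preceded by '{' or ','
while (i > 0 && foo.charAt(i-1) != '{' && foo.charAt(i-1) != ',') {
    i = foo.indexOf(needle, i + 1);
}
if (i > 0) {
    int j = foo.indexOf(',', i);
    if (j == -1) j = foo.length() - 1;  // last entry: stop before the closing '}'
    result = foo.substring(i+key.length()+1, j);
}
return result;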

Yes, it's ugly :-)

cadrian
A: 

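Well, assuming no '=' or ',' in values, the simplest (if shabby) method is:

int keyPos = foo.indexOf(key+'=');
if (keyPos == -1) return null;              // key not present
int start = keyPos + key.length() + 1;      // first character of the value
int end = foo.indexOf(',', start);          // substring() excludes the end index
if (end == -1) end = foo.indexOf('}', start);
return (start<end)?foo.substring(start,end):null;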

Yeah, not recommended :)

Sinuhe
Don't think I'll be using this one, but interesting answer!
yankee2905
Oh, I know it's not a good way :) I just wanted to point out that this is a fast method. But some users were faster than me and posted similar solutions first. I don't see good solutions in the other answers either; a proper solution would mean using an AST parser or something similar.
Sinuhe
+1  A: 

If the string has many entries you might be better off parsing manually without a StringTokenizer to save some memory (in case you have to parse thousands of these strings, it's worth the extra code):


import java.util.HashMap;
import java.util.Map;

public static Map<String, String> parse(String s) {
    Map<String, String> map = new HashMap<String, String>();
    s = s.substring(1, s.length() - 1).trim(); //get rid of the brackets
    int kpos = 0; //the starting position of the key
    int eqpos = s.indexOf('='); //the position of the key/value separator
    boolean more = eqpos > 0;
    while (more) {
        int cmpos = s.indexOf(',', eqpos + 1); //position of the entry separator
        String key = s.substring(kpos, eqpos).trim();
        if (cmpos > 0) {
            map.put(key, s.substring(eqpos + 1, cmpos).trim());
            eqpos = s.indexOf('=', cmpos + 1);
            more = eqpos > 0;
            if (more) {
                kpos = cmpos + 1;
            }
        } else {
            //last entry: the value runs to the end of the string
            map.put(key, s.substring(eqpos + 1).trim());
            more = false;
        }
    }
    return map;
}

I tested this code with these strings and it works fine:

{k1=v1}

{k1=v1, k2 = v2, k3= v3,k4 =v4}

{k1= v1,}

Chochos
A: 

Adding code to check for the existence of the key in foo is left as an exercise for the reader :-)

String foo = "{k1=v1,k2=v2,...}";

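String getValue(String key){
    int offset = foo.indexOf(key+'=') + key.length() + 1; // first character of the value
    int end = foo.indexOf(',', offset);
    if (end == -1) end = foo.length() - 1; // last entry: stop before the closing '}'
    return foo.substring(offset, end);
}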
rsp