Hello guys,

I'm trying to find a way to map a String document to a HashMap, as follows.

The String contains key/value pairs:

$key1=value1
$key2=value2 value21
value22
$key3=value3

What I want to end up with is:

key1, value1
key2, value2 value21\nvalue22
key3, value3

Is there a pattern I can use for this? It looks like an interesting puzzle. So far I have come up with split("[$]{1}[A-Za-z]+[=]{1}") to separate the different values, but identifying the keys then needs a second pass, so I'm looking for a more elegant solution.

Thanks for your time.

+1  A: 

You may want to take a look at Properties files.

You can use Properties.load(InputStream) to read entries from a file:

Properties properties = new Properties();
properties.load(this.getClass().getClassLoader()
     .getResourceAsStream("file.properties"));

or even from a String, using Properties.load(Reader) with a java.io.StringReader (StringBufferInputStream is deprecated):

String myString = "first=1";
Properties properties = new Properties();
properties.load(new StringReader(myString));

You can find more information regarding formatting in the Properties.load(InputStream) doc.
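Before relying on this for the input in the question, it may be worth checking how Properties treats that exact text. The following is a minimal sketch (the class name PropertiesFormatCheck and the helper load are made up for illustration): it suggests that the '$' stays part of the key and that an unescaped continuation line is read as a separate key with an empty value, so the input would need preprocessing first.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical helper class, not part of the original answer.
class PropertiesFormatCheck {

    // Loads the given text with java.util.Properties and returns the result.
    static Properties load(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    public static void main(String[] args) {
        String input = "$key1=value1\n$key2=value2 value21\nvalue22\n$key3=value3";
        Properties props = load(input);

        // The key keeps its '$' prefix and the value stops at the end of the line.
        System.out.println(props.getProperty("$key2"));

        // The continuation line is treated as a key of its own (with an empty value).
        System.out.println(props.containsKey("value22"));
    }
}
```

In the Properties format a value only continues onto the next line when the line ends with a backslash.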

Alexander
A: 

Assuming the complete text is in a single string (i.e. not read line by line), you can work with multiple splits:

import java.util.HashMap;
import java.util.Map;

String input = "$key1=value1\n$key2=value2 value21\nvalue22\n$key3=value3";
String[] lines = input.split("\n");
Map<String, String> map = new HashMap<String, String>();

StringBuilder value = null;
String key = null;
for (String line : lines) {
  if (line.startsWith("$") && line.contains("=")) {
    if (key != null) {
      // store the previous k/v pair before starting a new one
      map.put(key, value.toString());
    }
    // start a new k/v pair; the limit of 2 keeps any '=' inside the value
    String[] temp = line.split("=", 2);
    key = temp[0].substring(1); // drop the leading '$'
    value = new StringBuilder(temp[1]);
  } else if (value != null) {
    // this line belongs to the value of the current k/v pair
    value.append("\n").append(line);
  }
}
// store the last k/v pair
if (key != null) {
  map.put(key, value.toString());
}

The solution might need some fine-tuning but should work in principle.
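If the text arrives as a stream rather than as one string, the same "accumulate until the next key line" idea can be applied line by line. This is only a sketch under a few assumptions: a key line starts with '$' and contains '=', any other line continues the previous value, and the names StreamingKeyValueReader and read are invented for the example. A LinkedHashMap is used so the entries keep their input order.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper class, not part of the original answer.
class StreamingKeyValueReader {

    static Map<String, String> read(Reader source) {
        Map<String, String> map = new LinkedHashMap<>();
        String key = null;
        StringBuilder value = null;
        try (BufferedReader reader = new BufferedReader(source)) {
            String line;
            while ((line = reader.readLine()) != null) {
                int eq = line.indexOf('=');
                if (line.startsWith("$") && eq > 1) {
                    // A new key line: store the pair collected so far, then start over.
                    if (key != null) {
                        map.put(key, value.toString());
                    }
                    key = line.substring(1, eq);
                    value = new StringBuilder(line.substring(eq + 1));
                } else if (value != null) {
                    // Continuation line: append it to the current value.
                    value.append('\n').append(line);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // Store the last pair, if any key was seen at all.
        if (key != null) {
            map.put(key, value.toString());
        }
        return map;
    }

    public static void main(String[] args) {
        String input = "$key1=value1\n$key2=value2 value21\nvalue22\n$key3=value3";
        System.out.println(read(new StringReader(input)));
    }
}
```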

Andreas_D
This is pretty much what I have at the moment and it _doesn't_ work. In the above example keyValuePairs[2] does not contain a $key=. In any case, I appreciate your reply.
heeboir
... sorry, my mistake. I'll find the correct solution, hold on, please.
Andreas_D
... that's better. We store a k/v pair once we're sure we've read all the characters of its value. A new line containing a key triggers the storing of the previous pair.
Andreas_D
A: 

You can use two regexes here:
\$(\w+)=((\w+\s*)+) separates the keys from the values
\s+ splits the values.

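import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

String input = "$key1=value1\n" +
        "$key2=value2 value21\n" +
        "value22\n" +
        "$key3=value3";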

Pattern keyValuePattern = Pattern.compile("\\$(\\w+)=((\\w+\\s*)+)");
Matcher keyValueMatcher = keyValuePattern.matcher(input); // get a matcher object
Map<String, List<String>> map = new HashMap<String, List<String>>();
while (keyValueMatcher.find()) {
    String key = keyValueMatcher.group(1);
    // Arrays.asList returns a fixed-size list. If you want to add or remove
    // values later, wrap it: new ArrayList<String>(Arrays.asList(...))
    List<String> values = Arrays.asList(keyValueMatcher.group(2).split("\\s+"));

    map.put(key, values);
}

System.out.println(map); // e.g. {key3=[value3], key2=[value2, value21, value22], key1=[value1]} (HashMap iteration order is not guaranteed)

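NB: You could use \$(\w+)=(.*) as the regex too; it depends on what you want to match. The pattern above matches every word/number separated by whitespace, while this one matches anything up to the end of the line. Note that . does not match line terminators by default, so a continuation line such as value22 would not be captured.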
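To illustrate how much (.*) captures, here is a small sketch (the class DotMatchCheck and its helper firstValue are made up for this example). It compares the default behaviour of . with Pattern.DOTALL, where . also matches line terminators and the greedy .* therefore runs on into the following keys.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, not part of the original answer.
class DotMatchCheck {

    // Returns what group 2 captures for the first match, or null if nothing matches.
    static String firstValue(String regex, int flags, String input) {
        Matcher m = Pattern.compile(regex, flags).matcher(input);
        return m.find() ? m.group(2) : null;
    }

    public static void main(String[] args) {
        String input = "$key2=value2 value21\nvalue22\n$key3=value3";
        String regex = "\\$(\\w+)=(.*)";

        // Default flags: '.' stops at the line break, so only the first line is captured.
        System.out.println(firstValue(regex, 0, input));

        // DOTALL: '.' matches '\n' too, so the greedy '.*' captures the rest of the input.
        System.out.println(firstValue(regex, Pattern.DOTALL, input));
    }
}
```

Neither variant stops exactly at the next $key= line, which is why a lookahead or a line-based loop is needed for multi-line values.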

Colin Hebert
A: 

You need to be clear about exactly what your boundaries and delimiters are. In your example, I'm guessing the delimiter is a $ at the start of a line, followed by a "word" (composed of \w characters) and then an = symbol.

If we make a simplifying assumption and say that a $ at the start of the line is the delimiter (regardless of what follows) then we can do something like this:

(?xms:    # Switch on "x" (comment-mode), "m" (Allow "^" to mean start of line rather than start-of-input), "s" (dot-all)
   ^ \$   # Match $ at start of line only
   ( [^=]+ )   # Then capture everything until the next '=' (you may want to use \w+ here)
   =           # Skip the =
   (
     (?: 
        (?! \n (?: ^ \$ | \z ) )  # Stop if we reach the \n before the next $ at the start of a line or the \n at the end of the input
        .           # Otherwise accept any character (including \n thanks to dot-all mode)
     )  *
   )    # This will capture everything (excluding the trailing newline) up to the next $ at the start of the line (or all the way to the end of the input)
)

You can then use this in a loop:

static final Pattern pattern = Pattern.compile( " <the above RE> ");
  . . .
Matcher matcher = pattern.matcher(myInputString);
while (matcher.find()) {
    map.put(matcher.group(1), matcher.group(2));
}

(I haven't tested any of this at all but it should get you started)
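For reference, one way to write this out in full is sketched below. It is not the answerer's tested code: the pattern is the commented regex above with whitespace and comments stripped and the flags reduced to (?ms), and the class name DollarKeyRegexParser and method parse are assumptions made for the example.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, not part of the original answer.
class DollarKeyRegexParser {

    // (?ms): '^' matches at line starts, '.' also matches '\n'.
    // Group 1 is the key; group 2 is everything up to (but excluding) the
    // newline before the next "$" at a line start, or before the end of input.
    private static final Pattern ENTRY = Pattern.compile(
            "(?ms)^\\$([^=]+)=((?:(?!\\n(?:^\\$|\\z)).)*)");

    static Map<String, String> parse(String input) {
        Map<String, String> map = new LinkedHashMap<>(); // keeps input order
        Matcher matcher = ENTRY.matcher(input);
        while (matcher.find()) {
            map.put(matcher.group(1), matcher.group(2));
        }
        return map;
    }

    public static void main(String[] args) {
        String input = "$key1=value1\n$key2=value2 value21\nvalue22\n$key3=value3";
        for (Map.Entry<String, String> e : parse(input).entrySet()) {
            // Show embedded newlines as a visible \n for readability.
            System.out.println(e.getKey() + ", " + e.getValue().replace("\n", "\\n"));
        }
    }
}
```

As in the original pattern, a value line that itself begins with $ would be taken as the start of a new entry.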

Adrian Pronk