Using Java, I am writing a script to add anchor links to an HTML bibliography. That is, going from: [1,2] to: <a href="o100701.html#bib1">[1, 2]</a>

I think I have found the right regex: \[.*?\]

What I am having trouble with is writing the code that retains the values inside the match while surrounding it with the link tags.

This is the most I have come up with:

while(myScanner.hasNext())
{
 line = myScanner.nextLine();
 myMatcher = myPattern.matcher(line);
 ...
 outputBufferedWritter.write(line+"\n");
}

The files aren't very large and there are almost always fewer than 100 matches, so I don't care about performance.

+2  A: 

First of all, I think a better pattern to match the [tag] content is [^\[\]]* instead of .*? (i.e. anything but opening and closing brackets).

For the replacement, if the URL varies depending on the [tag] content, you need an explicit Matcher.find() loop combined with appendReplacement/appendTail.
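If the URL does not vary, no loop is needed: a plain replaceAll with a $0 back-reference (the whole match) keeps the bracketed text and wraps it in the link. A minimal sketch, assuming every citation points at the same anchor; the class name FixedLink and the helper link are made up for illustration, and the href is the one from the question:

```java
import java.util.regex.Pattern;

public class FixedLink {
    // Matches a bracketed citation such as [1,2]; brackets may not nest.
    private static final Pattern CITATION = Pattern.compile("\\[[^\\[\\]]*\\]");

    // Wraps every citation in the same link; $0 stands for the whole matched text.
    static String link(String line) {
        return CITATION.matcher(line).replaceAll("<a href=\"o100701.html#bib1\">$0</a>");
    }

    public static void main(String[] args) {
        System.out.println(link("as shown in [1,2] and [3]"));
    }
}
```

Inside the question's loop this would amount to writing link(line) instead of line.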

Here's an example that sets up a Map<String,String> of the URLs and a Matcher.find() loop for the replacement:

    Map<String,String> hrefs = new HashMap<String,String>();
    hrefs.put("[1,2]", "one-two");
    hrefs.put("[3,4]", "three-four");
    hrefs.put("[5,6]", "five-six");

    String text = "p [1,2] \nq [3,4] \nr [5,6] \ns";

    Matcher m = Pattern.compile("\\[[^\\[\\]]*\\]").matcher(text);
    StringBuffer sb = new StringBuffer();
    while (m.find()) {
        String section = m.group(0);
        String url = String.format("<a href='%s'>%s</a>",
            hrefs.get(section),
            section
        );
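        // quoteReplacement keeps any '$' or '\' in the URL from being read as a group reference
        m.appendReplacement(sb, Matcher.quoteReplacement(url));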
    }
    m.appendTail(sb);

    System.out.println(sb.toString());

This prints:

p <a href='one-two'>[1,2]</a> 
q <a href='three-four'>[3,4]</a> 
r <a href='five-six'>[5,6]</a> 
s

Note that before Java 9, appendReplacement/appendTail have no StringBuilder overload, so StringBuffer must be used. Java 9 added StringBuilder overloads for both.
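On Java 9 or later, the same per-match replacement can also be written with Matcher.replaceAll(Function<MatchResult,String>), which removes the explicit loop. The sketch below is closer to the format in the question, under one loud assumption: that the anchor name is "bib" plus the first number inside the brackets. The question does not spell that rule out, and the names BibLinker, link and page are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class BibLinker {
    // Group 1 captures the first number inside the brackets, e.g. "1" in [1,2].
    private static final Pattern CITATION = Pattern.compile("\\[\\s*(\\d+)[^\\[\\]]*\\]");

    // Assumption: the anchor name is "bib" plus the first cited number.
    static String link(String line, String page) {
        return CITATION.matcher(line).replaceAll(mr -> {
            String html = "<a href=\"" + page + "#bib" + mr.group(1) + "\">" + mr.group() + "</a>";
            // Escape '$' and '\' so the result is inserted literally.
            return Matcher.quoteReplacement(html);
        });
    }

    public static void main(String[] args) {
        System.out.println(link("See [1,2] and [7].", "o100701.html"));
    }
}
```

Bracketed text that does not start with a number, such as [note], is left untouched by this pattern.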

polygenelubricants