Given the following:

  > This is level 1
  > This is level 2
  >> This is level 2.1
  >> This is level 2.2
  >>> This is level 2.2.1
  >>> This is level 2.2.2
  > This is level 3

How would you convert that text to XHTML, without a parser library such as ANTLR? That is:

  <ul>
  <li>This is level 1</li>
  <li>This is level 2
    <ul>
    <li>This is level 2.1</li>
    <li>This is level 2.2
      <ul>
      <li>This is level 2.2.1</li>
      <li>This is level 2.2.2</li>
      </ul>
    </li>
    </ul>
  </li>
  <li>This is level 3</li>
  </ul>

I have tried both recursive and iterative algorithms. The troubling part is closing the right number of ul tags when the input drops from depth 3 (2.2.2) back to depth 1 (3).

Solution

The following code solves the problem. The answer marked as correct worked when each level held a single number rather than a line of text. Newlines in the output only aid human readability; since (X)HTML is read by machines, the code below omits them.

public String transform( String source ) {
  // Level 0 means no >, level 1 for one >, etc.
  //
  int currentLevel = 0;
  int nextLevel = 0;

  StringBuilder sb = new StringBuilder( 512 );

  // Split source on newlines.
  //
  String[] lines = source.split( "\\r?\\n" );

  for( String line: lines ) {
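    // Count the leading '>' characters. (lastIndexOf would misread a '>'
    // inside the content, or leading whitespace, as extra depth.)
    //
    String trimmed = line.trim();
    int indents = 0;

    while( indents < trimmed.length() && trimmed.charAt( indents ) == '>' ) {
      indents++;
    }

    if( indents == 0 ) {
      continue;
    }

    String content = trimmed.substring( indents ).trim();

    nextLevel = indents;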

    if( nextLevel == currentLevel ) {
      sb.append( "</li><li>" );
    }
    else if( nextLevel > currentLevel ) {
      sb.append( "<ul><li>" );
    }
    else if( nextLevel < currentLevel ) {
      for( int i = 0; i < currentLevel - nextLevel; i++ ) {
        sb.append( "</li></ul>" );
      }
      sb.append( "</li><li>" );
    }

    sb.append( content );

    currentLevel = nextLevel;
  }

  // Close the remaining levels.
  //
  for( int i = 0; i < currentLevel; i++ ) {
    sb.append( "</li></ul>" );
  }

  return sb.toString();
}
+2  A: 

I would use a simple Perl script for this.

The algorithm is as follows: keep track of the nesting level of the previous line (nprev, 0 at the beginning) and calculate the nesting level of the current line (ncur). Iterate over the lines; on each iteration there are three options:

  1. nprev == ncur. Just close the </li> tag (you certainly have an open one here), open an <li> for the current line's element, and print the value on the current line to the output.

  2. nprev < ncur. This means you are inside an open <li> tag (or in the global scope) and the value on the previous line (the parent value) has already been printed. So open the <ul> and <li> tags and print the value on the current line.

  3. nprev > ncur. Run a small inner loop that decreases nprev by one until it equals ncur. Each time you decrease the value, close the </li> and </ul> tags. After the loop is done, close the </li> of the previous item at this level, open another <li> tag, print the value on the current line, and continue the outer loop.

  4. Once you have iterated over all lines, assume there is one bogus line at the end of the input for which ncur equals 0. Run step 3 once more, but only its inner loop; skip everything after the loop. To clarify: if step 3's condition (nprev > ncur) is not met (that is the case when your input contains no lines), do nothing.

You're done.
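
The four steps above, including the bogus last line from step 4, could be sketched in Java roughly as follows. This is only an illustration under a few assumptions: the class name SentinelListSketch and its methods are made up for this example, lines without a leading > are skipped, and a jump of more than one level opens one list per skipped level so that the tags stay balanced.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical name; a sketch of the steps above, not a reference implementation.
class SentinelListSketch {

    // Nesting level = number of leading '>' characters (0 if there are none).
    static int level(String line) {
        int n = 0;
        while (n < line.length() && line.charAt(n) == '>') {
            n++;
        }
        return n;
    }

    static String toXhtml(String source) {
        List<String> lines = new ArrayList<>(Arrays.asList(source.split("\\r?\\n")));
        lines.add(""); // step 4: the bogus last line, whose level is 0

        StringBuilder sb = new StringBuilder();
        int nprev = 0;

        for (int i = 0; i < lines.size(); i++) {
            boolean sentinel = (i == lines.size() - 1);
            String line = lines.get(i).trim();
            int ncur = level(line);

            if (ncur == 0 && !sentinel) {
                continue; // assumption: real lines without '>' are ignored
            }

            if (ncur > nprev) {
                // Step 2. Assumption: open one list per level jumped.
                for (; nprev < ncur; nprev++) {
                    sb.append("<ul><li>");
                }
            } else {
                // Steps 1 and 3: the inner loop runs zero times when nprev == ncur.
                for (; nprev > ncur; nprev--) {
                    sb.append("</li></ul>");
                }
                if (!sentinel) {
                    sb.append("</li><li>");
                }
            }

            if (!sentinel) {
                sb.append(line.substring(ncur).trim());
            }
        }
        return sb.toString();
    }
}
```

Calling SentinelListSketch.toXhtml( source ) on text in the question's format returns the markup as a single string with no newlines; empty input returns an empty string because the bogus line then has nothing to close.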

P.S. Parsing and transforming text is a tedious task that becomes fun when you try to make it as succinct as possible.

Pavel Shved
A: 

Try this. I haven't had time to test it, but it should work. Also, a request: I'm a noob, so could someone please point me to a resource on how to format answers here?

yourFunction() {
    //Split text into lines
    String[] lines = text.split("\n");

    System.out.println("<ul>");
    getHTML(lines, 0, 1);
    System.out.println("</ul>");
}

getHTML(String[] lines, int index, int level) {
    int thisLevel = (lines[index].lastIndexOf(">") + 1);

    if(thisLevel == level) {
        System.out.println("<li>" + lines[index].replaceAll(">", "").trim() + "</li>");
        getHTML(lines, (index + 1), thisLevel);
        return;
    } else if(thisLevel > level) {
        System.out.println("<ul>");
        System.out.println("<li>" + lines[index].replaceAll(">", "").trim() + "</li>");
        getHTML(lines, (index + 1), thisLevel);
        return;
    } else if(thisLevel < level) {
        System.out.println("</ul>");
        System.out.println("<li>" + lines[index].replaceAll(">", "").trim() + "</li>");
        getHTML(lines, (index + 1), thisLevel);
        return;
    }
}
Chintan
Highlight the code and press control-K. You can use <pre> and <code> tags as well.
Dave Jarvis
`replaceAll(">", "<TAB>")` would make more sense :-). Other than that, a perfect example of recursion where you don't need it, not to mention the missing termination condition and the mistakes in advancing the recursion variable index.
Pavel Shved
And again, closing multiple tags doesn't work here.
Pavel Shved
Oops! I kind of rushed into the code and didn't notice the multiple tags. Thanks, Pavel, for pointing out the mistake.
Chintan
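
For comparison, recursion can be made to work here if each call has a termination condition and only returns after closing the list it opened. The following is a minimal sketch, not taken from any answer in this thread: the class RecursiveListSketch and its method names are hypothetical, and it assumes that lines without a leading > are skipped.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical name; a sketch of a recursive variant with explicit termination.
class RecursiveListSketch {
    private final List<String> lines = new ArrayList<>();
    private final StringBuilder out = new StringBuilder();
    private int pos = 0; // index of the next unconsumed line

    private RecursiveListSketch(String source) {
        for (String raw : source.split("\\r?\\n")) {
            String line = raw.trim();
            if (line.startsWith(">")) { // assumption: other lines are ignored
                lines.add(line);
            }
        }
    }

    // Number of leading '>' characters.
    static int depthOf(String line) {
        int n = 0;
        while (n < line.length() && line.charAt(n) == '>') {
            n++;
        }
        return n;
    }

    static String toXhtml(String source) {
        RecursiveListSketch r = new RecursiveListSketch(source);
        if (r.lines.isEmpty()) {
            return ""; // avoid emitting an empty <ul></ul>
        }
        r.list(1);
        return r.out.toString();
    }

    // Emits one <ul> containing every consecutive line at `depth` or deeper.
    // Terminates because each pass of the while loop consumes at least one
    // line, either directly or through the nested call.
    private void list(int depth) {
        out.append("<ul>");
        while (pos < lines.size() && depthOf(lines.get(pos)) >= depth) {
            out.append("<li>");
            if (depthOf(lines.get(pos)) == depth) {
                out.append(lines.get(pos).substring(depth).trim());
                pos++;
            }
            // Deeper lines become a nested list inside the still-open <li>.
            if (pos < lines.size() && depthOf(lines.get(pos)) > depth) {
                list(depth + 1);
            }
            out.append("</li>");
        }
        out.append("</ul>");
    }
}
```

Because every call appends its own closing </ul> before returning, a drop from depth 3 to depth 1 unwinds two calls and closes both nested lists without any counting.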
+2  A: 

Here is a sample implementation based on Pavel's algorithm:

class listCreator {

    public String createList(String source) {
        int currentLevel = 0; //Level 0 means beginning, level 1 means a single > was present and so on
        int nextLevel = 0;
        StringBuilder sb = new StringBuilder();
        //Assumes source is to be split on newlines
        String[] tmp = source.split("\n");
        for (String t: tmp) {
            //Needs validation, if source is not what we expect it'll blow up...
            //We are expecting a number of > followed by a space
            String[] levelContent = t.split(" ");
            nextLevel = levelContent[0].lastIndexOf(">") + 1;

            if (nextLevel == currentLevel) {
                sb.append("</li>\n<li>");
                sb.append(levelContent[1]);
            } else if (nextLevel > currentLevel) {
                sb.append("<ul>\n<li>");
                sb.append(levelContent[1]);
            } else if (nextLevel < currentLevel) {
                for (int i = 0; i < currentLevel-nextLevel; i++) {
                    sb.append("</li>\n</ul>\n");
                }
                sb.append("</li>\n<li>");
                sb.append(levelContent[1]);
            }

            currentLevel = nextLevel;
        }
        //Close up remaining levels
        for (int i=0; i < currentLevel; i++) {
            sb.append("</li>\n</ul>\n");
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String source1 = "> 1\n> 2\n>> 2.1\n>> 2.2\n>>> 2.2.1\n>>> 2.2.2\n> 3\n";
        String source2 = "> 1\n> 2\n>> 2.1\n>> 2.0.1\n>>> 2.0.1.2\n>> 2.2\n>>> 2.2.1\n>>> 2.2.2\n> 3\n";
        listCreator lc = new listCreator();
        System.out.println(lc.createList(source1));
        System.out.println(lc.createList(source2));
    }

}
Vinko Vrsalovic
Cool, thanks. Does it really work? :-O
Pavel Shved
Heh, weren't you sure? It makes sense and it seems to work, given the two test cases.
Vinko Vrsalovic
Thank you, Vinko. Had to make a few minor modifications, but this answer was correct with respect to the original question.
Dave Jarvis