tags:

views:

385

answers:

6

I did this once before and it was very helpful for me. I am an amateur programmer and don't have more seasoned programmers around me to assist and help me grow. Any feedback on this critique is greatly appreciated!

What I've got is a C# class which takes some basic markup from a user (usually from a basic online textbox) and outputs a nicely formatted HTML table. I've included the markup, HTML output, and source code below. Again, all feedback\constructive criticism is greatly appreciated.

I am also curious how much time this would take a seasoned programmer? It took me about 6-6.5 hours.

The Markup

[grid]
border=0, align=center, width=95%
|(c=vdsa) VDS-A|(c=vdsb) VDS-B|(c=vdsc featured) VDS-C
RAM|1GB|2GB|(c=featured) 4GB
CPU|(c=nofeature)|(c=nofeature)
<a href="http://www.google.com"&gt;Google&lt;/a&gt;|(c=yesfeature)|(c=nofeature)|(C=nofeature)
[/grid]

The Output HTML

<table align="center" border="0" width="95%">
    <tr>
            <th></th><th class="vdsa">VDS-A</th><th class="vdsb">VDS-B</th><th class="vdsc featured">VDS-C</th>
    </tr><tr>
            <td>RAM</td><td>1GB</td><td>2GB</td><td class="featured">4GB</td>
    </tr><tr>
            <td>CPU</td><td class="nofeature"></td><td class="nofeature" colspan="2"></td>
    </tr><tr>
            <td><a href="http://www.google.com"&gt;Google&lt;/a&gt;&lt;/td&gt;&lt;td class="yesfeature"></td><td class="nofeature"></td><td class="nofeature"></td>
    </tr>
</table>

The C# Class

using System;
using System.IO;
using System.Linq;
using System.Text;
using System.Web.UI;

namespace GridMarkup
{
    public class GridProcessor
    {
        public string MarkupOpenTag { get; set; }
        public string MarkupCloseTag { get; set; }
        public string[] AllowedTableOptions { get; set; }

        private string[] _rowSplitter = { "\r\n" };
        private char _columnSplitter = '|';
        private string _optionAssignmentOperator = "=";
        private char[] _optionEndOperators = { ',', '\r', '\n', ' ' };
        private int _columnsInGrid = -1;
        private string _cssAssignmentOperator = "(c=";
        private string _cssEndOperator = ")";

        public GridProcessor() {
            MarkupOpenTag = "[grid]";
            MarkupCloseTag = "[/grid]";
            AllowedTableOptions = new string[] { "align", "border", "cellpadding", "cellspacing", "width" };
        }

        public string Transform(string text)
        {
            if (string.IsNullOrEmpty(text)) return "";

            string output = string.Empty;
            int currentCurserLocation = 0;

            while (currentCurserLocation < text.Length)
            {
                // try to find an opening grid tag.
                int gridStart = text.ToLower().IndexOf(MarkupOpenTag.ToLower(), currentCurserLocation);
                if (gridStart > -1)
                {
                    // we have our opening tag, try to find an end tag.
                    int gridEnd = text.ToLower().IndexOf(MarkupCloseTag.ToLower(), currentCurserLocation);
                    if (gridEnd > -1)
                    {
                        // we have ourselves a grid! check for data between grids.
                        if (gridStart > currentCurserLocation)
                        {
                            output += text.Substring(currentCurserLocation, gridStart - currentCurserLocation);
                        }

                        // let's get the data out of the grid.
                        string gridText = text.Substring(gridStart + MarkupOpenTag.Length, gridEnd - gridStart - MarkupOpenTag.Length);
                        output += ProcessGrid(gridText);

                        // we are done with this grid, move the curser forward.
                        currentCurserLocation = gridEnd + MarkupCloseTag.Length;
                    }
                    else
                    {
                        output += text.Substring(currentCurserLocation, text.Length - currentCurserLocation);
                        currentCurserLocation = text.Length;
                    }
                }
                else
                {
                    output += text.Substring(currentCurserLocation, text.Length - currentCurserLocation);
                    currentCurserLocation = text.Length;
                }
            }

            return output;
        }

        private string ProcessGrid(string gridText)
        {
            StringBuilder sb = new StringBuilder();
            string[] gridRows = gridText.Split(_rowSplitter, StringSplitOptions.RemoveEmptyEntries);

            using (HtmlTextWriter writer = new HtmlTextWriter(new StringWriter(sb)))
            {

                //for the first line, check for table options
                int startIndex = 0;
                bool optionsFound = false;

                foreach (string option in AllowedTableOptions)
                {
                    int optionStart = gridRows[0].ToLower().IndexOf(option.ToLower() + _optionAssignmentOperator);
                    if (optionStart > -1)
                    {
                        int optionEnd = gridRows[0].ToLower().IndexOfAny(_optionEndOperators, optionStart);
                        if (optionEnd == -1)
                            optionEnd = gridRows[0].Length;

                        writer.AddAttribute(option, gridRows[0].Substring(optionStart + option.Length + 1, optionEnd - optionStart - option.Length - 1));
                        optionsFound = true;
                    }
                }

                if (optionsFound)
                    startIndex += 1;

                writer.RenderBeginTag(HtmlTextWriterTag.Table);

                for (int i = startIndex; i < gridRows.Count(); i++)
                {
                    ProcessGridRow(gridRows[i], writer, (i == startIndex) ? HtmlTextWriterTag.Th : HtmlTextWriterTag.Td);
                }

                writer.RenderEndTag(); // end table

            }

            return sb.ToString();
        }

        private void ProcessGridRow(string rowText, HtmlTextWriter writer, HtmlTextWriterTag dataTag)
        {
            string[] rowColumns = rowText.Split(_columnSplitter);
            int rowColumnsCount = rowColumns.Count();
            int colSpan = 0;

            // this is our header, so it defines the number of columns
            if (dataTag == HtmlTextWriterTag.Th)
                _columnsInGrid = rowColumnsCount;

            writer.RenderBeginTag(HtmlTextWriterTag.Tr);
            for (int i = 0; i < rowColumnsCount; i++)
            {
                // if this is the last column, check to see if we need to set a colspan
                if (i == rowColumnsCount - 1)
                    colSpan = _columnsInGrid - rowColumnsCount + 1;

                ProcessGridColumn(rowColumns[i], writer, dataTag, colSpan);
            }
            writer.RenderEndTag(); // end tr
        }

        private void ProcessGridColumn(string columnText, HtmlTextWriter writer, HtmlTextWriterTag dataTag, int colSpan)
        {
            string innerColumnText = columnText;

            // check to see if there are any css class assignments
            int cssStart = columnText.ToLower().IndexOf(_cssAssignmentOperator.ToLower());
            if (cssStart > -1)
            {
                // we found the start of a css assignment, lets see if we can find the end
                int cssEnd = columnText.IndexOf(_cssEndOperator, cssStart + _cssAssignmentOperator.Length);
                if (cssEnd > -1)
                {
                    // we've got the end, now lets add the css attribute.
                    string cssClasses = columnText.Substring(cssStart + _cssAssignmentOperator.Length, cssEnd - cssStart - _cssAssignmentOperator.Length);
                    writer.AddAttribute(HtmlTextWriterAttribute.Class, cssClasses);
                    innerColumnText = columnText.Remove(cssStart, _cssAssignmentOperator.Length + _cssEndOperator.Length + cssClasses.Length).Trim();
                }
            }

            if (colSpan > 1)
                writer.AddAttribute(HtmlTextWriterAttribute.Colspan, colSpan.ToString());

            writer.RenderBeginTag(dataTag);
            writer.Write(innerColumnText);
            writer.RenderEndTag();
        }
    }
}
+2  A: 

You have quite peculiar kind of parser. As your definition grammar seems to me a LL(1), you could do the same with a simpler stream parser.


LL(1) grammars are simple to parse, starting from BNF for your grammar. You say:

definition :== griddef { griddef }
griddef :== gridstart gridbody gridend
gridstart :== "[grid]" gridattributes
gridattributes :== { gridattribute }
gridattribute :== borderdef | aligndef | widthdef
...

This gives you the following straightforward code:

bool ParseDefinition()
{
    if (!ParseGridDef())
        return false;
    while(ParseGridDef())
        ;
    return true;
}

bool ParseGridDef()
{
    return ParseGridStart() && ParseGridBody() && ParseGridEnd();
}

bool ParseGridStart()
{
    if (!AssureRead("[grid]"))
        return false;
    while (ReadGridAttribute())
        ;
    return true;
}

bool ReadGridAttribute()
{
    return ReadBorderDef() || ReadAlignDef() || ReadWidthDef();
}

etc.

Vlad
not so that I am a seasoned programmer though
Vlad
I don't follow what you are saying.
Jeremy H
2Jeremy: your parser first looks at the beginning, find a beginning of a grid definition, then finds the end of grid definition, then returns to the beginning and starts to _parse_ the found definition. And so on. This would make the code fail if you ever decide to have your definition language recursive.
Vlad
A: 

you might want to look into formal languages and parsers. "recursive decent" parsers for LL(1) grammars are very easy to code by hand. you might also want to look into parser generators like ANTLR. they allow you to describe your input with a grammar and you can use semantic actions to (for example) convert the input into something else (html tables in your case).

stmax
A: 

Your code is also following the design of a recursive descent parser, but not quite. I would suggest this pattern as it works well and is well understood.

In some ways your code is close, a rdp would end up with parseGrid(), parseRow(), and parseColumn() functions much like you have here.

Hogan
+1  A: 

I'm sure it's very hard to find two programmers who agree on what's the best method of parsing this sort of stuff, but as far as I can tell, this code is so well done that I honestly find it hard to believe that you're an amateur.

Furthermore, parsing is a science of its own, so I'm sure there are many better ways of doing this, but probably none without delving into the entire field of parsing and compilation.

The commenting is appropriate, indenting is perfect and the style is consistent throughout. Doesn't get much better than that when parsing goo like that.

One criticism I can provide is the concatenation of the string variable 'output'. It is a monstrous overhead to add a lot of text to a string variable. Rather, you should have a StringBuilder. It's very simple and much, much faster.

StringBuilder builder = new StringBuilder();
builder.Append("whatever");
return builder.ToString();

That's all I can see wrong with it, frankly. Very well done. :)

As for how much time it would have taken me... well, if there's one thing that plagues the software industry it's unrealistic deadlines, so while I'd be tempted to say 1.5 hours assuming everything goes perfect, my mind is on fire and there are no unexpected complications, but the reality is that it would probably be closer to 3-4. That's a guesstimate like all time estimates in this industry. The nobel prize will go to the one who figures out how to accurately estimate coding time. ;)

Helgi Hrafn Gunnarsson
Really? This code is so well done? Looks more like BS degree work to me not someone who has made production code or has any professional experience.
Hogan
@Helgi Thanks for the info on using the StringBuilder. I've already updated my code locally.
Jeremy H
Hogan: How about someone who has made production code for 10 years professionally in multiple languages on multiple platforms? Would that count as 'any professional experience'? Would that count as someone who has made production code?Spare me the wiseguy attitude, please. Don't fool yourself... there are a million different opinions on a million different methods to do any parsing of any kind.I think anyone "who has made production code or has any professional experience" will agree that this code is exceptionally well written from an amateur.
Helgi Hrafn Gunnarsson
+6  A: 

I can see a few things that could use improvement:

  • Non-specific class name; GridProcessor doesn't really tell me what the class does. What it actually does is transform one type of markup into another. Assuming the markup style has a name (let's call it MyMarkup), then a better class name would be MyMarkupTransformer or even MyMarkupHtmlTransformer.

  • Array fields are named after singular nouns; should be plural.

  • Private data that appears to never change, like _columnSplitter, is not declared const or readonly (it should be).

  • MarkupOpenTag and MarkupCloseTag probably have to match, in which case this is a (minor) DRY violation, the tag itself should be declared once and the tag formats declared separately (or not at all, if they can't really change).

  • String concatenation using +=. Use a StringBuilder instead.

  • Cursor misspelled (hey, I'm just being thorough).

  • ToLower().IndexOf() can fail to produce the desired result (i18n; n.b. Turkish i). Better to use the IndexOf overload that takes a StringComparison.

  • Comments are explaining what the code does, not why. Worse, they're written in a conversational style. They are noise. Remove them.

  • Some fairly deep if nesting could be refactored into subroutines.

  • I notice you use HtmlTextWriterTag for tags, but not HtmlTextWriterAttribute for attributes. I'm personally not a fan of those classes, but if you're going to use one, you should be consistent and use both instead of hard-coded strings.

  • Underscores in member fields - contentious issue for some folks here, but nevertheless they go against Microsoft's coding guidelines, and StyleCop will complain.

This is all mostly surface/style stuff, I haven't tried to do an in-depth analysis of the algorithm used, but aside from a few issues in the list above, it looks to be OK. Meaningful variable names, IDisposable instances wrapped in using, all good. No serious problems, it's better than the code I've seen some "professionals" write.

The type of parser may be overkill for what you're trying to do with it, but I see that someone else has already commented on that, so I won't go into more detail here.

Aaronaught
Thank you @Aaronaught!
Jeremy H
+7  A: 

Couple things.

  • Returning arrays is a bad programming practice. Remember, arrays are mutable. If you hand out an array then the caller is free to change the contents. Consider returning a ReadOnlyCollection wrapper around the array instead.

  • Others have pointed out that you've basically got a recursive descent parser here. I note that you are conflating the "understanding the structure" and the "output the results" code. I would be inclined to do this in two completely separate stages. First write a parser that takes text and turns it into a tree of objects that represent the analyzed text. Then write a method that takes a parse tree and produces a transformed tree. Then turn the transformed tree into text.

That is more flexible, and it has the nice property that you can change the language you parse and the language you produce independently.

Eric Lippert
What's returning an array? All I see is the `AllowedTableOptions` property which is (apparently) intended to be mutable. Maybe an `IList<string>` without a setter would be a safer choice?
Aaronaught
@Aaronaught: I haven't read the code very carefully; it's not immediately clear to me whether the options array is input, output or both. But regardless, it is a bad idea to be holding on to an array; anyone with access to that instance can change it at any time. It's a big lump of mutable state just asking for trouble.
Eric Lippert
Believe me, I agree that an array isn't something you should use to expose immutable data, and I would never return one from a method. But this data isn't semantically immutable; the only alternative to a mutable collection would seem to be a bunch of painful `AddOption`/`RemoveOption`/'IndexOfOption' methods, i.e. re-implementing everything that an array/`IList` already does.
Aaronaught