Disclaimer: This is a personal project that I'm doing for fun. I'm not looking to use existing libraries since it would take some of the joy out of learning more about wheels.
That being said, I'm working on a web spider and I've come to the problem of how to represent HTML form elements with a single object.
What I want to do is have an "HTML Document" object, which contains an array of all form elements as one of its properties. The problem is that I can't figure out a way to represent <input /> tags, as well as <select /> tags, since select tags can have multiple child <option /> tags.
Is there any good way to represent, in the same class, both <input /> tags, which basically store only a name/value pair, and <select /> tags, which have an array of name/value pairs?
The best idea I've come up with so far is to treat the <option /> tags of a <select /> tag as individual form fields, similar to how I would represent <input type="radio" /> or <input type="checkbox" />.
So I would have this:
class FormField {
    public string Name { get; set; }
    public string Value { get; set; }
    public string Type { get; set; }
}
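To illustrate, a <select> with two options would flatten into two FormField objects that share one Name. This is only a minimal sketch of the idea: the FromSelect helper, the FlattenExample class, and the "option" type string are names I made up for illustration, not part of any existing design.

```csharp
using System.Collections.Generic;

class FormField
{
    public string Name { get; set; }
    public string Value { get; set; }
    public string Type { get; set; }
}

static class FlattenExample
{
    // Hypothetical helper: produces one FormField per <option> value,
    // all sharing the name of the parent <select>.
    public static List<FormField> FromSelect(string selectName, IEnumerable<string> optionValues)
    {
        var fields = new List<FormField>();
        foreach (var value in optionValues)
        {
            // "option" is an assumed type label, chosen only for this sketch.
            fields.Add(new FormField { Name = selectName, Value = value, Type = "option" });
        }
        return fields;
    }
}
```

Calling FlattenExample.FromSelect("color", new[] { "red", "blue" }) would give two fields named "color", much like a pair of radio buttons with the same name.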
And then a collection class for iterating over the fields:
- The collection class would be an "array of arrays". The outer array would have a single inner array for each name in the HTML document.
- Its indexer could get fields by Name. This indexer would return an array of FormField objects.
- When enumerating over the entire document's form fields, each iteration would yield an array of FormField objects, since it would be an array of arrays.
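Roughly, the collection described in the list above might look like the sketch below. The class name FormFieldCollection, the Add method, and the choice to keep names in first-seen order are all assumptions on my part; I've also used a dictionary of lists internally rather than literal arrays, which is a swap I made to keep lookups by name simple.

```csharp
using System.Collections;
using System.Collections.Generic;

class FormField
{
    public string Name { get; set; }
    public string Value { get; set; }
    public string Type { get; set; }
}

// Hypothetical collection: groups fields by Name, one group per distinct name.
class FormFieldCollection : IEnumerable<FormField[]>
{
    // Distinct names in the order they were first added.
    private readonly List<string> _names = new List<string>();
    private readonly Dictionary<string, List<FormField>> _byName =
        new Dictionary<string, List<FormField>>();

    public void Add(FormField field)
    {
        if (!_byName.TryGetValue(field.Name, out var group))
        {
            group = new List<FormField>();
            _byName[field.Name] = group;
            _names.Add(field.Name);
        }
        group.Add(field);
    }

    // Indexer: every field with the given name, or an empty array if none.
    public FormField[] this[string name] =>
        _byName.TryGetValue(name, out var group) ? group.ToArray() : new FormField[0];

    // Enumerating yields one array of fields per distinct name.
    public IEnumerator<FormField[]> GetEnumerator()
    {
        foreach (var name in _names)
        {
            yield return _byName[name].ToArray();
        }
    }

    IEnumerator IEnumerable.GetEnumerator() => GetEnumerator();
}
```

With this shape, a plain text input simply ends up as a group containing one FormField, while a select or a set of radio buttons ends up as a group containing several.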
Is this the best solution, or is there a simpler way to represent this?