Let's face it: writing proper, standards compliant HTML is quite difficult to do. Writing semantic HTML is even more so, but I don't think it's possible for a computer to figure that out.
So my question to you is what would the "ideal" feedback for a user who entered HTML be? Would it be a W3C validator style list of errors and corresponding line numbers and columns? Would it be a annotated code display of highlighted lines, explanations of the errors, and possible fixes? A spell-check style mode where you handle each error separately? Would it be not giving them any error information at all? Also, what types of errors are a good idea to tell users? (Some broad classes of errors include parsing errors, nesting errors (i.e. putting a div in a b tag) and well-formedness errors.)
- Scottm: Good point; I've never liked the W3C way of listing all the errors either. However, there is still the question of then letting the user edit the offending HTML appropriately.
- onebyone: Ok, so looking at some screenshots it looks like HTML Validator has a W3C error list, but combined with the ability to go straight to the relevant source segment and expanded error information, as well as the fact that you don't have to go scrolly to jump from one section to another. Looks pretty good, but is it usable by the average Joe?
Edit 1: As a clarification, this is with regards to the interface, not necessarily the underlying implementation. However, interface needs to be feasible with plain HTML and JavaScript (double usability points if it just needs HTML, but I think you're going to get stuck with W3C in that case).