From reading the mailing lists and looking at the specification I cannot tell what the limits of HTML5 are as a software or programmatic technology. I have seen where they have attempted to standardize video and audio formats in HTML5 and it seems they may be writing the definitions for XHTML5 into the HTML5 specification. It also appears the specification is extremely lengthy and covers topics far outside the mere definitions and minimally required processing instructions of a markup language.

With version 5, is HTML now an application interface as opposed to just a markup language? If so, then what are the boundaries and defined limits of the technology? If not, then why are so many topics irrelevant to the processing of markup taking such a spotlight in the development process of the technology? Where do the boundaries of a markup language end and the application preferences of a user-agent application begin? With HTML5 that separation does not appear very clear, but as an industry standard it should be crystal clear, right?

A: 

You may find this article very interesting: X/HTML 5 Versus XHTML 2 http://xhtml.com/en/future/x-html-5-versus-xhtml-2/

The W3C has been slow to deliver an updated spec, the web is fragmenting further, and there are needs that cannot be met because the existing specs are so old. HTML5 is working toward fixing these with features such as the canvas tag and embedded video/audio, which replace the very overused <object> tag and the need to use Flash.

The web has gone beyond just serving up web pages. We now have JavaScript applications, so more interactivity is possible than before, thanks to changes not only in HTML5 but also in the movement toward a newer version of JavaScript.

So, HTML5 should be more than just a markup language, since web applications have gone beyond servers just serving static pages, which is what a markup language was good for.

James Black
I completely understand your point of view and you did provide a very good answer, but you did not answer the question. What are the scope boundaries of HTML5 and where does HTML5 end and the user-agent begin? Your answer really just reiterated what I believed about HTML5 in the first place.
I don't believe there is a scope boundary beyond what they feel they can add, as they are trying to predict what may be needed, since it won't be fully in browsers and common for about 10 years.
James Black
The browser will need to decide what takes precedence. If I want to use the QuickTime application for some movies while the rest use the embedded player, that is not something the spec can determine; the browsers will need to deal with that.
James Black
From your answer should I then assume the drafters of HTML5 have a complete inability to separate HTML, a markup language technology, from the web, a model of discourse at the application layer of the internet?
I would expect that if they try to do that, they will cause too many problems with existing pages. It is similar to the problem with updating JavaScript: they have to be cognizant of what is out there already. Also, what would be the benefit of separating HTML from the application part, such as <canvas> and audio/video?
James Black
I disagree. There is no reason to keep repeating mistakes when we are perfectly aware of their failures. A simple solution is to create a completely new and separate technology, rally around the technology, standardize it, and then deprecate the former.
In a perfect world that would be good, but the web is worse off than MS, in that some backward compatibility is extremely important. Otherwise the specification will create far more problems than it solves: it will take a long time for people to upgrade to a new browser, so everything would have to be written in two completely different formats.
James Black
@austin: "create a completely new and separate technology" - that no-one uses. See XHTML2.
Nickolay
@Nickolay - I had experimented with XHTML2, but I found the lack of libraries to be a pain, as I would have had to write my own WYSIWYG editor, for example.
James Black
+3  A: 

You're not the first person wondering about this. See the discussion between Rob Sayre and the HTML5 editor (hixie): http://blog.mozilla.com/rob-sayre/2008/02/19/bloaty-parts-of-the-whatwg-html5-specification-that-should-be-removed/#comment-7559

My understanding is this: there are two kinds of technologies involved:

  1. widely implemented, but underspecified or not specified old technologies (e.g. "DOM 0" features, tag-soup parsing)
  2. "important" new technologies, which the modern browser vendors would like to implement interoperably (e.g. video, canvas, offline).

If hixie is interested in them and no other editor steps up to maintain a separate specification, hixie prefers to keep them in HTML5, "[rephrased] paying the price of a bloated specification for not stalling the web progress".

BTW, if you want an authoritative answer, you should ask hixie himself or in the HTML5 discussion forums.

[edit] found an additional e-mail from hixie on splitting stuff from the HTML5 spec: http://lists.w3.org/Archives/Public/public-html/2008Oct/0127.html

Nickolay
If this is about a community standard then why is a single person expected to be the source of all answers for the definition of what the specification's technology is? Why is this answer not explicitly defined in the standard? It sounds as though the specification does not know what technology it is a specification for. Am I completely off the mark when I say that?
You're asking meta-questions about other people's work and I don't understand why. Yes, ideally there wouldn't be braindead behavior already implemented and relied upon, and content and presentation would be 100% separate. There would be a clear line between what's in HTML5 and what's out. There would be 100 experts on web technologies who have time to devote to writing high-quality specifications, and the specifications would be implemented without errors and deployed to the users in 3 months. However, we don't live in an ideal world and I don't see a problem with the current setup that is easily fixable.
Nickolay
The work is a standard and is intended to apply to many aspects of the web, which directly impacts everybody who develops for the web. It is not personal work, and that is why I am asking. Everything in your hypothetical, except for your timeline, is generally considered a minimal expectation in the drafting of a spec supported by standards bodies. There were 150 "experts" that came together to write XML. Consider how standards come together for ISO or IETF. In those groups it is like submitting a doctoral dissertation with extended defense by the author. Why must HTML have lower expectations?
So you'd rather have all work on the evolution of web technologies stopped until everything is split across multiple specifications? The W3C already tried that (along with trying the "replace HTML, it's broken" route), which resulted in browser makers collaborating on a spec in WHATWG instead. I don't know the motives, but after it became clear that what's currently in HTML5 would get specified and implemented even without the W3C's interest, a W3C working group was created to continue working on this specification [http://www.w3.org/TR/html5/introduction.html#history-1].
Nickolay
So basically: pragmatism beats idealism with its "high expectations". Sorry if it bothers you.
Nickolay
Web technology evolution is a result of innovation from the user agents. Netscape invented JavaScript. Microsoft invented XMLHttpRequest. Opera invented tabs. How is allowing user agents the freedom to innovate killing the evolution of the web? This is how the web has evolved so far. User agents are free to adopt or not adopt any technology they wish regardless of standardization, so I think your understanding of the WHATWG and W3C relationship is severely flawed. If HTML5 is developed without a specified set of requirements and its scope of coverage is unknown, what pragmatism are you speaking of?
Thing is, the browser vendors (except IE) preferred to collaborate on a spec in the WHATWG. If you're wondering why the specification was adopted by the W3C without "cleaning it up", I don't know. But the spec itself *is* the way the browser vendors wanted the evolution of the web to happen.
Nickolay
+1  A: 

With version 5, is HTML now an application interface as opposed to just a markup language?

Yes.

If so then what are the boundaries and defined limits of the technology?

Mostly a self-imposed rule of not taking any major new features anymore.

When do the boundaries of a markup language end and the application preferences of a user-agent application begin?

It's blurry. Is this Stack Overflow page a document or an application?

With HTML5 that separation does not appear very clear, but as an industry standard it should be crystal clear, right?

The spec is clear in its operational requirements. It doesn't need to be clear in defining a distinction between documents and applications.

hsivonen
The HTML and contained content are a document. The JavaScript and AJAX are an application interface. The actual application is the user-agent software that interprets the JS and parses the HTML. Distinction of roles makes the separation of technologies quite clear. The specification does need to be clear in defining differences between documents and applications in order to be clear in its intentions. The specification is obviously not so clear, or I would not have had to ask this question or receive such obfuscated answers.
A: 

HTML, from the very beginning, has had this tension between markup and behavior (cf. Why do we have an IMG element?). HTML and the web are inextricably linked already. The HTML specs waver between technical purity and paving the cowpaths.

HTML is a markup language, but the behavior of an application implementing the spec is constrained. For purer markup, XML or SGML would be more appropriate.

As I understand it, you are asking why the spec isn't limited to the markup portion (x/HTML 5) and instead also specifies user-agent behavior, is that correct? If so, I believe it is because the spec does cover user-agent behavior, intentionally so. It specifies how the implementing application should behave in order to be in adherence to the specification.

If you were starting from scratch today, you would not end up with HTML5. However, we're not starting from scratch and the HTML specs have always tried to balance the real world with the ideal.

Don
Markup is the vocabulary and structural constraints upon that vocabulary. Behavior has always been an issue of form controls and JavaScript. I do not see the confusion. The W3C has even been very clear about this on the XML side with separate standards for XML Events and XForms. I do not see the confusion that you claim has been so apparent for so long.
I see now what you are getting at. But as you point out, with XML there is separation. HTML has been less pure; hence the conflict between the W3C and WHATWG groups when spec'ing out XHTML2 and HTML5. I know you'd disagree, but HTML has been creeping toward markup/behavior for some time. Use XML for clearer separations.
Don
Are you saying that with regards to developing for the web I should be content with confusion because the specification is cluttered?
+1  A: 

At the risk of sounding like an oversimplification: if it's in the spec, it's part of the standard. In order to be compliant, an agent will have to implement the specified portions.

The fact that it's not "just a markup language" is not a new thing with HTML 5. HTML specifications were always a little bit more than simply document markup. From what I can tell, the efforts to refine HTML into a markup-only definition reached their pinnacle with XHTML.

HTML 5 seems to be an acknowledgement that pure markup alone doesn't really go far enough towards addressing certain real-world concerns, and an updated standard could help to resolve those issues: "But what should happen in this situation?" "Oh well, that's up to the user agent, we don't worry about that in our markup spec." ... Not a very satisfactory solution in a web where end-user experience suffers because of a lack of consensus on just such issues.

Is it an API? Perhaps, but as a language it will still work as mere markup when needed (think of non-graphical user agents). In some cases, it should work better than the available choices.

To answer your last question: no, in a standard, the separation between markup language and behaviour of the user agent does not need to be "crystal clear". What made you think it did? But I suspect it is clearer than you think: can you give an example of a part of the spec where you are not sure if it is referring to markup or user agent behaviour?

Zac Thompson
What real world concerns are not solved by constraining the HTML specification to the markup language? If there are additional concerns why are they not addressed in separate application related specifications? Even so, what are the definitions and scope of the HTML5 specification, because it looks like they threw a little bit of everything into the spec completely without regard for any sort of original or intended objective for the language as a whole. If the spec is not clear on a separation between markup and application how are the developers to determine clear best practice decisions?
Real-world-concern: how the agent should behave when confronted with a document that contains errors. Why isn't it in separate specifications? Good idea: you should do that. Oh, you don't have time? Yeah. That's why.
Zac Thompson
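To make the error-handling concern concrete, here is a minimal sketch, assuming Python's standard-library html.parser as a stand-in for a tokenizer that does no error recovery of its own. The EventLogger class and the sample markup are hypothetical, made up for illustration. The parser simply reports mis-nested tags in source order; how to turn them into a tree is left to whoever consumes the events, and that is the gap the HTML5 parsing rules aim to close so that user agents agree on the result.

```python
from html.parser import HTMLParser


class EventLogger(HTMLParser):
    """Hypothetical helper: records parser events in the order reported."""

    def __init__(self):
        super().__init__()
        self.events = []

    def handle_starttag(self, tag, attrs):
        self.events.append(("start", tag))

    def handle_endtag(self, tag):
        self.events.append(("end", tag))

    def handle_data(self, data):
        self.events.append(("data", data))


# Mis-nested markup: </b> arrives before </i>, and <p> is never closed.
soup = "<p><b>bold<i>both</b>italic</i>"

logger = EventLogger()
logger.feed(soup)
logger.close()

# The events mirror the source order; nothing is repaired or re-nested.
for event in logger.events:
    print(event)
```

Two consumers of these events could reasonably build two different trees from the same input, which is the interoperability problem being described.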
How would a clearer separation help make "best practice decisions"? definitions here: http://www.w3.org/TR/html5/infrastructure.html#terminology , scope here: http://www.w3.org/TR/html5/introduction.html#scope . What decisions are you trying to make in your development that you can't make because of an alleged lack of adequate separation? Your remarks increasingly sound like noise. A concrete example would go a long way towards making this discussion actually *about* something.
Zac Thompson
If the user-agent encounters errors why must there be a specification for that at all? Why can't the user-agent be allowed to make its own decisions, especially regarding violations to other specifications? Why must all user-agents be exactly the same for all possible aspects all the time? That sounds like death of innovation to conform to application concerns not related to the core language when there are already other specifications relevant to behavior such as DOM and ECMAScript.
I am currently writing a language in XML Schema that I would like to be down compatible to HTML. Instead of compatibility with the vocabulary and the structure of that vocabulary, I now have to worry about required default media types and session behaviors that should be independent of the user agent. I have to worry about differences to the DOM that don't exist in other W3C standards, despite DOM being a separate standard. I have to worry that there are unique interfaces to ECMAScript that do not exist in other W3C standards. In my office we have a name for such violations of reuse: a "one-off".
I'm not sure that I understand why things like required default media types are relevant to your language, unless there is some kind of conflict? Alternatively, you could try making your language "down compatible" with XHTML 1.1 instead of trying to target HTML 5.
Zac Thompson
"Why can't the user-agent be allowed to make its own decisions, especially regarding violations to other specifications? Why must all user-agents be exactly the same for all possible aspects all the time?" -- Because if they aren't, then life will return to the 1980-2000 era, where either life was hell for the developers making things (trying to make everything work in every browser under different formats... JavaScript still suffers from this in many ways), OR they simply supported only the browser they wanted to. No thanks.
Kevin Peno
A: 

Here is the link stating the differences between HTML5 and HTML4. A lot of attributes and tags were removed from HTML5 on the grounds that they are better handled by CSS. But how easy is that for a programmer who has not mastered CSS?

Sachin Chourasiya
A: 

Perhaps the best answer to your question is "What are you trying to do?".

I say that because if you're looking to build a web application now that works in modern web browsers (read: not IE) using HTML5, then the scope/boundaries that you're going to care about are what the main modern browser manufacturers are currently supporting and plan on supporting soon.

Google Wave did this and came up with a great product that works cross-browser (Firefox/Chrome/Safari/Opera). Some basic tenets of HTML5 that are widely supported already are video/audio/canvas/storage/geo.

[Chart: HTML5 Support]

http://radar.oreilly.com/2009/05/google-bets-big-on-html-5.html

philfreo
I don't understand what the chart demonstrates. Nobody supports the majority of HTML5 in any sort of uniform way, and nobody will for quite some time, because it's like 9000 pages long and includes all sorts of extraneous features. The SVG spec was nearly as long, and as a result almost nobody adopted it, and those that did adopt it have not adopted it fully.
The point of the chart isn't that ALL of HTML5 is supported uniformly, but that 6 of the big important features of HTML5 are indeed supported by 4 different browsers.
philfreo