views:

778

answers:

4

Why do people suggest minifying web assets such as CSS and JavaScript, but never suggest minifying the markup? CSS and JavaScript can be reused across many pages, while the markup gets loaded each and every time, which makes minification of markup far more important.

+2  A: 

Markup tends to be dynamically generated these days, and even when static there's usually a bunch of pages. JavaScript and CSS are usually minified in a one-file-per-site manner and thus much easier to minify manually (or to script).

ceejayoz
+3  A: 

I suppose it's hard because sometimes things like white space are used for formatting, maybe depending on the doctype.

Kieron
+11  A: 

One likely reason is that markup typically changes MUCH more often, and would have to be minified for every page load. For instance, on a given Stack Overflow page there are timestamps, usernames, and rep counts that could change with every page load, meaning you would have to minify for each page load as well. With "static" files like CSS and JavaScript, you can minify far less often, so in the minds of some, it is worth the work up front.

Consider also that every major web server and browser supports gzip, which compresses all of your markup (quickly) on the fly. Because minifying is slower and much less effective than gzipping, webmasters may decide that minifying for every page load isn't worth the processing cost.
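
To get a feel for that trade-off, here is a rough sketch in Python that compares raw, minified, gzipped, and minified-then-gzipped sizes. The sample page, the `naive_minify` helper, and the `report` function are all made up for illustration; the minifier is deliberately simplistic and is not safe for arbitrary HTML. The numbers it prints depend entirely on the sample, so treat them as indicative only.

```python
import gzip
import re

# Hypothetical sample page: 200 repeated, indented rows with a comment each.
ROW = (
    "    <div class=\"post\">\n"
    "        <!-- user info -->\n"
    "        <span class=\"user\">user{n}</span>\n"
    "        <span class=\"rep\">{r}</span>\n"
    "    </div>\n"
)
PAGE = (
    "<html>\n<body>\n"
    + "".join(ROW.format(n=i, r=i * 37) for i in range(200))
    + "</body>\n</html>\n"
)


def naive_minify(html: str) -> str:
    """Strip comments and whitespace between tags (naive, illustration only)."""
    html = re.sub(r"<!--.*?-->", "", html, flags=re.DOTALL)
    return re.sub(r">\s+<", "><", html).strip()


def gz_size(text: str) -> int:
    """Size in bytes of the gzip-compressed UTF-8 encoding of text."""
    return len(gzip.compress(text.encode("utf-8")))


def report(html: str) -> dict:
    """Return byte sizes for the four variants of the page."""
    minified = naive_minify(html)
    return {
        "raw": len(html.encode("utf-8")),
        "minified": len(minified.encode("utf-8")),
        "gzipped": gz_size(html),
        "minified+gzipped": gz_size(minified),
    }


sizes = report(PAGE)
for name, size in sizes.items():
    print(f"{name:>18}: {size} bytes")
```

On repetitive markup like this sample, gzip alone should shrink the page far more than minification alone. How much minifying adds on top of gzip varies by page.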

Triptych
CSS and JS are gzippable too, but minification still is seen as having significant benefits.
ceejayoz
Minimally significant. ~70% reduction by gzipping vs. ~5% reduction by minifying a gzipped file.
Triptych
Agreed, the gains of even basic compression far outweigh the gains of minifying. Minifying seems to be more about obscuring 'your' really 'cool' code.
Adrian
@Adrian I wouldn't go _quite_ that far. There are occasionally good reasons to save every byte you can. The reason that *I* hate minifying though is that it often makes in-browser debugging a pain, and there are usually much better ways to speed up a site.
Triptych
@Triptych That's why minifying is best handled via a framework. Drupal, for example, keeps the original files and lets you turn minifying off for development.
ceejayoz
For me these are separate domains. Minifying is about removing chaff, unnecessary material that doesn't affect the result. Compressing is about compressing the remainder. Gzip does great, but there's no point in gzipping <!-- end head div --> when we could reduce it to zero.
T.J. Crowder
@TJ It's always a trade off. My point is that webmasters might decide they don't want the performance hit of minifying a markup file for every page load BECAUSE gzip does a good enough job. Obviously minifying+gzipping will net you the most savings overall.
Triptych
If minification is a pain because it makes code hard to read or debug then why not have a beautifier ready? I have written one at http://mailmarkup.org/prettydiff/prettydiff.html
@Austin that won't help with messages like "error: line 1, character 13,045"
Triptych
That makes me think you have not ever used a beautifier for your markup. Try the one I linked to and tell if you could ever get to the 13000th character on a single line.
Minifying reduces the amount of work the browser has to do once it has downloaded the file: it has less data to parse. Gzipping increases the browser's work: it has to uncompress the file before it starts parsing it. So minifying is a good idea, whether or not you gzip as well.
rjmunro
@rjmunro - that was quite a leap of logic. You certainly lose more time minifying on the fly server side than you gain in parsing time on the client. Gzipping decreases the amount of data the browser has to download, which will generally vastly outweigh the time required to uncompress.
Triptych
@austin. The point is that when minifying, often all of your JavaScript code ends up on a single line. A tool like Firebug will therefore always report errors to be on "line 1" of the offending source file. Beautifying the code _afterwards_ will not help with tracking down where that error actually occurred in the original JavaScript file.
Triptych
Triptych, it actually does. Minified JavaScript behaves, provided it's written properly with curly braces and semicolons, exactly the same regardless of minification. So when troubleshooting, beautify it, correct the errors, and minify it again. If a person's JavaScript is not well written because semicolons and curly braces are not worth their time, then they have other, more serious problems to address.
@Austin. You are missing my point completely. I am not saying that minifying introduces errors. Please reread carefully.
Triptych
+5  A: 

Consider this:

HTML:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" >
<head>
<title>Demo</title>
<link rel="stylesheet" type="text/css" href="nonminify.css"/>
</head>
<body>
<div title="My   non   minifiable   page">
    <p class="http://www.example.com/classes/class/lorem-ipsum">

            Lorem ipsum dolor sit amet, consectetur adipisicing elit, 

            sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. 

            Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris 

            nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in 

            reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla 

            pariatur. Excepteur sint occaecat cupidatat non proident, sunt in 

            culpa qui officia deserunt mollit anim id est laborum.

    </p>
</div>
</body>
</html>

With this css file:

div[title="My   non   minifiable   page"] 
      p[class~="http://www.example.com/classes/class/lorem-ipsum"]
{
    white-space:pre;
}

Given that, it's effectively impossible for an HTML minifier that can only see the HTML file to find anything that it can safely minify.
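
As a small sketch of how this goes wrong, the Python snippet below runs a deliberately naive whitespace-collapsing "minifier" over a cut-down version of the markup and checks whether the `title` value that the CSS attribute selector depends on survives. `TitleCollector`, `div_titles`, and `collapse_whitespace` are hypothetical helpers written for this illustration, and Python's `html.parser` only stands in for a browser's parser here.

```python
import re
from html.parser import HTMLParser

# The exact value the CSS selector div[title="..."] is matching against.
SELECTOR_TITLE = "My   non   minifiable   page"

HTML = (
    '<div title="My   non   minifiable   page">\n'
    "    <p>\n"
    "        Lorem ipsum dolor sit amet\n"
    "    </p>\n"
    "</div>"
)


class TitleCollector(HTMLParser):
    """Collects the title attribute of every div start tag."""

    def __init__(self):
        super().__init__()
        self.titles = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if tag == "div" and name == "title":
                self.titles.append(value)


def div_titles(html: str) -> list:
    parser = TitleCollector()
    parser.feed(html)
    parser.close()
    return parser.titles


def collapse_whitespace(html: str) -> str:
    """Naive minifier: collapse every run of whitespace to a single space."""
    return re.sub(r"\s+", " ", html)


original_titles = div_titles(HTML)
minified_titles = div_titles(collapse_whitespace(HTML))

print("selector still matches original:", SELECTOR_TITLE in original_titles)
print("selector still matches minified:", SELECTOR_TITLE in minified_titles)
```

After collapsing, the attribute value no longer equals the string in the selector, so the `white-space:pre` rule would silently stop applying.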

Alohci
I suspect that the white-space:pre declaration is the exception and not the norm, as it is so very rarely used.
True, but it's not just white-space:pre of course. DOM-walking JavaScript can also make assumptions about the presence of white space that a minifier can change. Strange though it may seem, white space is significant in HTML, whereas in CSS and JavaScript it mostly isn't.
Alohci
I disagree. Whitespace in markup is tokenized during parsing, so it's only relevant down to single space characters. If minified only down to that level, there is no difference in the interpretation of the code. Try minifying your HTML using my tool to see for yourself: http://mailmarkup.org/prettydiff/prettydiff.html
White space is tokenized during parsing, sure, but every white space character is passed through into the DOM. See http://www.whatwg.org/specs/web-apps/current-work/multipage/syntax.html#data-state and http://www.whatwg.org/specs/web-apps/current-work/multipage/syntax.html#parsing-main-inbody. Collapsing of the white space happens in the render phase, typically by applying the white-space:normal CSS rule. If that weren't the case, how could browsers possibly implement white-space:pre?
Alohci
I don't deny that probably 99% of HTML pages as used on the web could have their white space reduced without being broken, but there will be 1% where that's not the case. I wish you luck with your HTML minifier, but if it is used a lot, expect to get a run of strange bug reports from web authors blaming the minifier for breaking their web pages.
Alohci
Small correction. Not EVERY white space character is passed through, but those going into text nodes in the body are.
Alohci
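
The distinction between what the parser hands over and what the renderer collapses can be sketched in Python. This is only an analogy: `html.parser` is not a browser DOM, and `TextCollector` and `text_chunks` are hypothetical names for this example. The last line approximates what white-space:normal does at render time.

```python
from html.parser import HTMLParser


class TextCollector(HTMLParser):
    """Collects character data exactly as the parser delivers it."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def text_chunks(html: str) -> list:
    parser = TextCollector()
    parser.feed(html)
    parser.close()
    return parser.chunks


HTML = "<p>\n    Lorem   ipsum\n</p>"

# What the parser passes through: every whitespace character is still there.
parsed_text = "".join(text_chunks(HTML))

# Rough approximation of white-space:normal, applied later at render time.
rendered_normal = " ".join(parsed_text.split())

print(repr(parsed_text))
print(repr(rendered_normal))
```

A minifier that collapses the text up front bakes in the second form, which is only equivalent if nothing (CSS or script) ever looks at the first.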
@Alohci, I just noticed your comments. I wrote a markup minifier that does not interfere with the parsed output of content. All whitespace in markup, unless there is a contrary presentation condition intentionally applied, is tokenized prior to being parsed, and whitespace between tags, except singletons, is entirely removed. Knowing the correct whitespace rules for markup allows the markup to be minified without harm, in an automated fashion, each time.
Good point, I had not considered this. This should really be the accepted answer, as it explains why minification is not technically possible for HTML the way it is for JS/CSS.
Kip
I guess the only thing that an HTML minifier could really safely do is to remove any spacing around attributes - e.g. "<div id='foo'    class='bar' >" should always be exactly the same as "<div id='foo' class='bar'>". That's unlikely to make much of a saving, though.
Bobby Jack
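
A quick way to check the claim about spacing around attributes, sketched in Python. It uses the standard `html.parser` module as a stand-in for a browser's parser; `StartTagCollector` and `start_tags` are hypothetical helpers for this example. It parses both forms of the tag and compares the result.

```python
from html.parser import HTMLParser


class StartTagCollector(HTMLParser):
    """Records each start tag as (tag name, list of (attr, value) pairs)."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append((tag, attrs))


def start_tags(html: str) -> list:
    parser = StartTagCollector()
    parser.feed(html)
    parser.close()
    return parser.tags


SPACED = "<div id='foo'    class='bar' >"
TIGHT = "<div id='foo' class='bar'>"

same_parse = start_tags(SPACED) == start_tags(TIGHT)
bytes_saved = len(SPACED) - len(TIGHT)

print("parsed identically:", same_parse)
print("bytes saved:", bytes_saved)
```

The two forms parse to the same tag and attributes, and the saving is only a handful of bytes per tag, which fits the point that this kind of minification is safe but small.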