This page documents our conversion from HTML 4.01 Transitional to HTML5 (started late 2013). It is more of a checklist for us than a deep or insightful HTML5 article. So far we use no genuine HTML5 features - way too premature in our view (there is not, and never will be, a stable standard for goodness' sake - just a flaky reference to Implemented and widely deployed). Our motivation is simply to get pages to validate as HTML5. At this stage (November 2013) this simply means removing all the negative stuff that HTML5 validators croak on. And the poor old W3C page validator, which used to be a thing of great precision way back when, now validates every HTML5 page with one warning which says (we paraphrase) "...this page validates at this precise second as HTML5 but may never, depending on the whim of some unaccountable group of individuals, validate again...". Just the sort of confidence-building statement you want in a validation tool. HTML has gone the way of Serbo-Croatian. They must be tearing their hair out in the W3C validation group. We feel for them. Horrible state of affairs.
W3C Version of the WHATWG Living Standard (With some subtle differences - what joy)
Our pages use the following capabilities that we care about:
CSS2 - we have no CSS3 and little desire to use it at this point. Very snazzy and potentially useful....but. Maybe we have, unconsciously, developed the site within the limitations of CSS2 over the years.
We use CSS class attributes almost exclusively in HTML. We have a number of style attributes lying around because we were too lazy to build a new class definition. In general, we regard the use of style attributes as poor policy for which there is really no justification and are slowly trying to eliminate them. Slob-like behaviour gets what it deserves. We eliminated almost all of the font tags in 1999 when we started the move to CSS. Horrible things.
We have a lot of HTML attributes controlling style still lying around on the pages (width=, cellspacing= and so on). Quick layout fixes, too lazy to update the CSS, too little time to think about the implications for other stuff that uses the same class definition, too stupid to understand the CSS specifications - lots of reasons - all bad. The HTML stuff works fine in HTML 4.01 (well it validates), but croaks in HTML5. While a pain to remove, we agree this is generally a Good Thing ™ (and probably about the only thing we 100% approve of in HTML5 - we're sure the designers are happy to have our endorsement). Complete separation of text and formatting and all that motherhood. However, it should also make writing good CSS based word processors viable and document conversion much more flexible. Finally, write-once-use-many-times may be a plausible goal.
Here is what we needed to do to our pages:
Remove all those carefully closed tags that we used in preparation for XHTML starting in 2001! They used to work fine with the HTML4.01 validator - they were even the recommended procedure by the great and good. So when the great HTML5 betrayal started, even the HTML 4.01 validator croaked on them. Global find and replace for ' />' with '>' (img, br, hr, input). But on every page?
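The find-and-replace above can be sketched as a small script for anyone facing the "every page?" problem. This is a minimal sketch, not the procedure we actually followed: the names `strip_self_closing` and `convert_tree`, the `*.html` file pattern and the UTF-8 encoding are all assumptions made for illustration. It only touches the four tags named above and will be fooled by a `>` inside an attribute value, so run it on a copy first.

```python
import re
from pathlib import Path

# Void elements named in the text; extend the tuple if your pages use others.
VOID_TAGS = ("img", "br", "hr", "input")

# Matches e.g. <br />, <br/>, <img src="x.png" alt="" /> and captures
# everything between '<' and the optional whitespace before '/>'.
SELF_CLOSING = re.compile(
    r"<(?P<body>(?:%s)\b[^<>]*?)\s*/>" % "|".join(VOID_TAGS),
    re.IGNORECASE,
)


def strip_self_closing(html: str) -> str:
    """Turn XHTML-style '<br />' into HTML-style '<br>' for the listed tags."""
    return SELF_CLOSING.sub(lambda m: "<" + m.group("body") + ">", html)


def convert_tree(root: str, pattern: str = "*.html") -> int:
    """Rewrite every matching file under root in place; return files changed.

    Assumes the pages are UTF-8 encoded (hypothetical; adjust as needed).
    """
    changed = 0
    for path in Path(root).rglob(pattern):
        original = path.read_text(encoding="utf-8")
        updated = strip_self_closing(original)
        if updated != original:
            path.write_text(updated, encoding="utf-8")
            changed += 1
    return changed
```

Calling `convert_tree("site-copy")` on a copy of the site (the directory name is a placeholder) returns the number of files it rewrote, which gives a quick sanity check before validating.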
Replace <a name=""> with an id="" in the nearest container. Lots of pure manual work. And our pages do a lot of internal linking as a reader service. This one is hard labor with no real upside. Ain't nothing changed at the end of it - 'cept the pages validate as HTML5. Yippee!
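The moving itself stays manual, since only a human can pick the right container, but finding the named anchors need not be. The sketch below uses Python's standard `html.parser` module to list every anchor that carries a name attribute, with its line number. The class and function names (`NamedAnchorFinder`, `find_named_anchors`) are made up for this example.

```python
from html.parser import HTMLParser


class NamedAnchorFinder(HTMLParser):
    """Collect (line, name) for every <a> start tag with a name attribute."""

    def __init__(self):
        super().__init__()
        self.found = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lower-cases tag and attribute names for us.
        if tag == "a":
            name = dict(attrs).get("name")
            if name:
                line, _column = self.getpos()  # position of the start tag
                self.found.append((line, name))


def find_named_anchors(html: str):
    """Return a list of (line_number, anchor_name) pairs for one page."""
    finder = NamedAnchorFinder()
    finder.feed(html)
    finder.close()
    return finder.found
```

Each reported name then becomes the id="" of the nearest container, and the existing href="#..." links keep working unchanged because fragment links resolve against id values as well.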
Remove all the HTML attributes that control layout. width=, align=, cellpadding=, cellspacing=, summary= (they used to force us to use this for accessibility), border=. We needed to upgrade a lot of our CSS definitions, write a bunch of new ones and start using multiple classes of the form class="class1 class2". Here the overall impact, once we found and made all the CSS changes, is positive. The HTML is much cleaner. Grudging acceptance of the bright shiny future.
With one exception. "How do I love thee? Let me count the ways." (Elizabeth Barrett Browning) "How do I center something in HTML? Let me count the ways" (W3C). A long time ago, when the world was young and we were foolish, we used align="center" on anything and it worked. Now it depends on this, and that, and whether it's Thursday. Perhaps this is called collateral damage.
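Before rewriting any CSS it helps to know how many of the layout attributes listed above (align= included) are actually out there, and on which tags. The following is a rough sketch of such an inventory, again using the standard `html.parser` module. The names `PRESENTATIONAL`, `AttributeInventory` and `inventory` are assumptions for illustration, and the attribute set is just the list from the text. Note that width= is still valid on img in HTML5, so the hits need a human look rather than blind deletion.

```python
from collections import Counter
from html.parser import HTMLParser

# The attributes named in the text; not a complete list of obsolete attributes.
PRESENTATIONAL = {"width", "align", "cellpadding", "cellspacing", "summary", "border"}


class AttributeInventory(HTMLParser):
    """Count (tag, attribute) pairs for the presentational attributes."""

    def __init__(self):
        super().__init__()
        self.counts = Counter()

    def handle_starttag(self, tag, attrs):
        # Also called for self-closed tags such as <img ... />.
        for name, _value in attrs:
            if name in PRESENTATIONAL:
                self.counts[(tag, name)] += 1


def inventory(html: str) -> Counter:
    """Return a Counter keyed by (tag, attribute) for one page."""
    parser = AttributeInventory()
    parser.feed(html)
    parser.close()
    return parser.counts
```

Summing the counters over all pages shows which combinations, for example ("td", "align"), are common enough to deserve their own class definition.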
Spaces in URLs. We understand why it is vital, but doing it at the HTML level? HTTP code needs to convert these strings - what happened to it? Did the programmer die? Code still has to run through the URL strings for safety reasons to make them HTTP friendly - now it finds nothing from static HTML5 pages. Wow, that really makes the Internet safe. We can see no justification for this one. Luckily for us it all seems related to social media - yet another reason to hate this stuff.
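For the record, the conversion the validator wants done by hand is percent-encoding: a space in a URL becomes %20. A minimal sketch using `urllib.parse.quote` from the Python standard library follows. The function name `encode_url` and the choice of safe characters are assumptions; the safe set keeps the URL delimiters and any existing % escapes untouched so that an already encoded URL is not encoded twice.

```python
from urllib.parse import quote

# Characters left alone: the URL delimiters plus '%' so that existing
# escapes such as %20 are not double-encoded. This set is an assumption.
SAFE = "/:?#[]@!$&'()*+,;=%"


def encode_url(url: str) -> str:
    """Percent-encode spaces (and other unsafe characters) in a URL string."""
    return quote(url, safe=SAFE)
```

The result is what would go into the href="" attribute in place of the original string containing spaces.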