demosthenes.info

I’m Dudley Storey, the author of Pro CSS3 Animation. This is my blog, where I talk about web design and development. To receive more information, including news, updates, and tips, you should follow me on Twitter or add me on Google+.


The Differences Between HTML5 And What You Know

While they are referred to in different ways by WHATWG, the development of HTML5 could be said to have three principles that distinguish the language from XHTML: pragmatism, simplification, and looseness.

Simplification

The XHTML 1.0 Strict template for a basic page can be intimidating to look at, even overwhelming. By comparison, the doctype declaration for an HTML5 page is so simple you can type it out by hand:

<!DOCTYPE html>

“That’s it?” you might be saying. Yes, that’s it.
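
For comparison, this is roughly what an XHTML 1.0 Strict page needs before any content appears. It is a sketch from the XHTML 1.0 specification's public identifiers; the title and paragraph text are placeholders of my own:

```html
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
<title>An XHTML 1.0 Strict template (placeholder title)</title>
</head>
<body>
<!-- XHTML 1.0 Strict expects at least one block-level element in the body -->
<p>Content</p>
</body>
</html>
```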

Common objections:

“This can’t possibly validate, can it?”

It really does. Go ahead and use the full HTML5 template below in a page if you don’t believe me: validate it as you would any other page.

“How does the validator know which version of HTML I’m using?”

HTML5 really isn’t about versions. The doctype is still required (without it, browsers fall back to quirks mode), but it carries no version information. What matters is that tags are used in an internally consistent manner throughout the document: if you mix HTML5 tags with obsolete elements, or nest them in the wrong order, the page is invalid. HTML5 extends the HTML language rather than replacing it, so nearly all the tags you have learned in XHTML remain valid in an HTML5 page, just as whatever comes down the pike after HTML5 will have to support the majority of HTML5 markup. Specifying which version of HTML you are using is therefore somewhat redundant.

The meta tag that sets UTF-8 encoding for the page has also been simplified; note that Dreamweaver CS5 gets this wrong by default:

<meta charset="UTF-8">

The code to link a style sheet has been simplified:

<link rel="stylesheet" href="styles.css">

As has drawing JavaScript into a page from an external file:

<script src="file.js"></script>

The simplified JavaScript link is probably the easiest example of this principle to explain. In the past, several scripting languages competed on the web: JavaScript, Microsoft’s JScript, and VBScript, to name a few. Today things have shaken out such that there is only one scripting language, JavaScript, just as there is only one way to modify the appearance of pages: CSS. With no alternatives, it makes no sense to specify what we are using to script or style a page: it will always be JavaScript and CSS.
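
To see what was dropped, compare the three simplified tags with the forms typically written in XHTML 1.0. This is a sketch for comparison only; the file names simply repeat the ones used above:

```html
<!-- XHTML-era equivalents: note the explicit type attributes and self-closing slashes -->
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
<link rel="stylesheet" type="text/css" href="styles.css" />
<script type="text/javascript" src="file.js"></script>
```

The type attributes are the part that HTML5 treats as implied: a style sheet is assumed to be CSS and a script is assumed to be JavaScript.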

The entire HTML5 page template is therefore:

<!DOCTYPE html>
<html lang="en">
<head>
<title>An HTML5 template</title>
<meta charset="UTF-8">
</head>
<body>
</body>
</html>

Technically, you could use HTML5 shortcuts to reduce this even further:

<!DOCTYPE html>
<html lang="en">
<title>An HTML5 template</title>
<meta charset="utf-8">
<p>Content

Looseness

XHTML – most especially the version of XHTML I have been teaching, XHTML 1.0 Strict – is very particular about the way in which it is written: all code is in lowercase, tags are always closed, etc. To me, this is a good thing: clear, concise rules are the hallmark of good governance and good code. However, learning rules is arduous, and making small mistakes in code can lead to big headaches in validation.

HTML5 frees up the rules: tags can be written in uppercase, lowercase, or mixed case. Most table elements, including <td>, <tr> and <th>, do not need to be closed. Neither do <body>, <dd>, <dt>, <head>, <li>, <p>, or even <html>. Putting attribute values in quotes is optional, as long as the value contains no spaces or certain special characters. id values can start with numerals.

Leaving tags open is anathema to many developers, and I would strongly recommend always closing your elements unless there is a very strong argument against doing so. The best such argument is that you are trying to reduce file size to the absolute minimum, in which case you should also be minifying your code, removing all carriage returns, and making certain that your images are optimized until they squeak. In addition, some of these exceptions are tricky: you can avoid closing a paragraph if it is followed by one of 24 other elements, but not if the following tag is any one of over 100 others.
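
To make the looser rules concrete, here is a small page that leans on several of them at once. It is an illustration of what the validator will accept, not a recommendation; the id values, class name, and text are invented for the example:

```html
<!DOCTYPE html>
<html lang=en>
<title>Loose but valid (example title)</title>
<meta charset=utf-8>
<!-- Mixed-case tag names, unquoted attribute values, ids that start with a numeral -->
<UL>
  <li id=1st>First item
  <LI id=2nd>Second item
</ul>
<!-- This paragraph is closed implicitly because a table follows it -->
<p class=note>A paragraph with no closing tag
<table>
  <tr><td>One<td>Two
  <tr><td>Three<td>Four
</table>
```

One practical catch with numeric ids: CSS identifiers still cannot begin with a digit, so a selector for the first item has to be escaped (for example `#\31 st`) or written as an attribute selector such as `[id="1st"]`.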

Form elements no longer need to be wrapped in a <fieldset>, or even a <form> tag, partly as a nod to the common use of JavaScript to read form elements directly. In other words, this is perfectly valid on an HTML5 page:

<body>
<input type="text">
</body>

It is not a good idea – the example above lacks an id attribute value for JavaScript to hook into, and accessibility for the element is nil – but it is possible.
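
For contrast, a minimal sketch of the same field written the way I would still recommend. The action URL, id, name, and label text are all hypothetical:

```html
<body>
<form action="/subscribe" method="post">
  <!-- The for attribute ties the label to the input's id, which helps screen readers -->
  <label for="email">Email address</label>
  <input type="text" id="email" name="email">
</form>
</body>
```

The id gives JavaScript something to hook into, and the label gives assistive technology something to announce.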

I still strongly recommend that you do almost everything I have taught as good coding practice thus far – everything in lowercase, all tags closed, all attribute values inside double quotes, use of the <form> tag where appropriate – as it makes it far easier to move to languages that are less strict, and makes your code easier to both read and debug. But I can no longer say it is wrong to do otherwise.

Pragmatism

A few elements were just a bad idea to begin with: frames and everything to do with them are obsolete in the HTML5 spec. Good riddance.

<iframe>, however, is supported in HTML5. I will talk more about iframes later.

Somewhat controversially, the <acronym> tag has been dropped. Now that every commercial browser supports the abbreviation tag (<abbr>), the <acronym> tag is somewhat redundant. The distinction between acronyms (abbreviations that form a pronounceable word in themselves, such as SAIT and laser) and abbreviations in general (any shortened form of a word or phrase) is lost on most people, and every acronym is ultimately an abbreviation anyway.
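
In practice this means marking up acronyms and other abbreviations the same way. A short sketch, with sentence text made up for the example:

```html
<!-- Both an acronym (laser) and an initialism (HTML) use the same element -->
<p>A <abbr title="light amplification by stimulated emission of radiation">laser</abbr>
pointer is handy when teaching <abbr title="HyperText Markup Language">HTML</abbr>.</p>
```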

The <object> tag is now largely redundant (in most cases it has been superseded by the <audio> and <video> tags), but is still supported.
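
As a rough sketch of what replaces it, here is media embedded with the newer elements. The file names and dimensions are placeholders, and a real page would usually offer more than one format:

```html
<video src="intro.mp4" controls width="640" height="360">
  <!-- Content inside the element is shown only by browsers without <video> support -->
  <p>Your browser does not support HTML5 video. <a href="intro.mp4">Download the clip</a> instead.</p>
</video>

<audio src="interview.mp3" controls>
  <p>Your browser does not support HTML5 audio. <a href="interview.mp3">Download the recording</a> instead.</p>
</audio>
```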

<strong> and <em> remain, but <b> and <i> make a comeback in HTML5. To me, the distinction WHATWG makes between the elements is so fine as to be hair-thin: I would recommend that you continue to use <strong> and <em> as I have taught you.
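
As I read the spec, the split is roughly this: <strong> and <em> add meaning (importance and stress), while <b> and <i> mark text that is conventionally set apart without being more important. A sketch with invented sentences:

```html
<!-- Meaning: importance and stress emphasis -->
<p><strong>Warning:</strong> do <em>not</em> unplug the drive while it is writing.</p>

<!-- Convention only: keywords and a taxonomic name -->
<p>The recipe calls for <b>flour</b>, <b>eggs</b>, and <b>butter</b>.</p>
<p>The domestic cat, <i>Felis catus</i>, sleeps for most of the day.</p>
```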

<big> is out, but <small> remains: it is now relegated to the markup for small print, as used on a legal document or warranty.
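
A short sketch of <small> in its new role, with placeholder legal text:

```html
<!-- Small print: disclaimers, caveats, legal restrictions -->
<p><small>Offer valid while supplies last. Warranty void if the seal is broken.</small></p>
```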

In HTML5 you can now enclose multiple elements with a link:

<div>
<a href="dudley.html">
<img src="dudley-storey.jpg" alt="">
<p>Learn more about Dudley Storey</p>
</a>
</div>

Technically, this markup would be invalid under XHTML: the paragraph and the image would each have needed its own link. In the real world most browsers supported the code anyway, so HTML5 supports it too, and makes it official.
