The doctype goes first in the code of every web page you create. The code that follows it will usually be HTML, so we'll discuss that next.
HTML is specified through the use of tags, also known as elements or markup.
Tags specify what content is. The thing defined by the tag – the content – is placed between opening and closing tags. With very few exceptions, the closing version of a tag is always the first word used in the opening tag with a slash (/) in front of it.
For example, a paragraph would be specified as follows:
<p>This is a paragraph</p>
Note that the closing tag shows where the thing being talked about ends. Without the closing tag, everything that occurred after the opening <p> would be assumed to be a continuation of the paragraph, unless that new content had specific markup of its own.
All tags can use attributes. Attributes are always written inside the opening tag, and almost always take the form x="y". Attributes add more information to the tag as to what it means. In the example in the header illustration for this article, we have defined the word SAIT Polytechnic to be an abbreviation by using the appropriate tag. The title attribute for the tag defines what SAIT stands for.
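As a rough sketch of how such an attribute might look in code (the surrounding sentence is invented for illustration and is not taken from the header illustration):

```html
<!-- The title attribute supplies the expanded form of the abbreviation -->
<p>Classes are held at <abbr title="Southern Alberta Institute of Technology">SAIT</abbr> Polytechnic.</p>
```

The attribute sits inside the opening tag only; the closing </abbr> carries no attributes.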
A few common rules for tags:
A tag is always specified between < (less than) and > (greater than) symbols.
After the less than symbol, a tag will always consist of at least one letter (the link tag <a> for example), although it may also include numerals (such as a heading element).
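A minimal illustration of both naming rules; the link target and the text content are placeholders, not part of any real site:

```html
<!-- A tag name made of a single letter: the link (anchor) element -->
<a href="index.html">Home</a>

<!-- A tag name that combines a letter and a numeral: a second-level heading -->
<h2>Course outline</h2>
```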
HTML is HyperText Markup Language, created by Tim Berners-Lee as an application of the far larger and more complex SGML. In turn, XHTML and HTML5 are successors to this standard. To break down the HTML acronym:
HyperText is simply the ability to link one document to another through hyperlinked text.
Markup means that HTML specifies (marks up) content – it provides meaning. This is a very important point. HTML (and its relative, XML) is not intended to specify how something looks; instead, it specifies what something is: a paragraph, a heading, a quotation. The appearance of content is provided by CSS (or, for XML, XSL).
Language is a misnomer. HTML is not a language in the classical sense. It is not a programming language, because it cannot create executable programs on its own. Do not put HTML on your resume under “Programming Languages” – any employer who knows his code will be able to spot you as a fraud, or at best uninformed.
HTML was made for scientists, not artists.
HTML is the lingua franca of the web, the means by which most documents on the web are encoded. It has become so popular that most programs now output HTML, even for documents that were never intended to be viewed on the web: Microsoft Word (which encodes HTML terribly), for example, or the help files for programs.
However, there is a problem with HTML. That problem lies at the heart of its creation. Tim Berners-Lee, the developer of HTML, was working at CERN when he developed HTML as a method to present simple scientific documents on a heterogeneous network. (You can still find the original release announcement for the Web on Usenet). The first version of HTML didn’t even have the ability to display graphics. HTML was developed for scientists, not for artists. The original focus of HTML was on features that could be used in scientific papers: tables, lists, headings, and the like, and HTML never entirely left this inheritance behind.
As the web grew, artists wanted to jump on the bandwagon. Lacking any other means to do so, they took the functionality developed by Tim Berners-Lee and tried to push it towards design and appearance. HTML was never intended for this, and it coped badly. Designers jumped through all kinds of hoops – cutting up graphics and nesting tables inside tables inside tables, for example – to try to get the visual effects they were after.
The browser wars of the mid-to-late 1990s caused HTML to suffer further. Dissatisfied with the slow, academic advancement of HTML and driven to dominate the market, Netscape and Microsoft began to introduce their own proprietary code. This code looked like HTML, but it was only recognized in the particular browser made by that company (Netscape Navigator and Internet Explorer, respectively). Both Netscape and Microsoft pushed these “advancements” to web developers, who were forced to code for one particular browser if they wanted to use the new features. At its worst point this war threatened to Balkanize the web, producing web pages that could be seen in only one browser and not in the rest.
Thankfully, sanity slowly prevailed. The development of HTML was passed to an organization known as the W3C. New features such as CSS were advanced to support designers, returning the separation of appearance and content. While there are still some inconsistencies in how different browsers support these open, standardized features, the situation is continuing to improve.
XHTML was the immediate successor to HTML, and shares many qualities with it – you might want to think of XHTML as “strict” HTML, cleared of the accumulated cruft of <font> tags and the other presentation elements that crept in over 10 years. For many, this new standard was overly draconian and too limited in scope; HTML5 was the result.
People who have developed web pages in the past usually have a poor understanding of HTML: this is a guide for correction.
Both my class and the entries on this blog are based upon W3C standards. Those of you who have developed web pages in the past, or have been taught to do so, may be in for a rude awakening. There are likely to be rules that you were unaware of, practices that you have been following that "always worked in the past", and a new means of approaching code. For that reason it is especially important that you pay close attention in every class, no matter what your level of experience.
Code must be to W3C Standard
Meaning: all tags closed and correctly nested, code in lowercase, and quotes used around attribute values. (HTML5 loosens these requirements substantially.)
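To make those requirements concrete, here is a small before-and-after sketch. The class name and the text are hypothetical; only the contrast between the two versions matters:

```html
<!-- Not to standard: uppercase tags, unquoted attribute, overlapping tags, unclosed paragraph -->
<P CLASS=intro>Welcome to <EM><STRONG>my site</EM></STRONG>

<!-- To standard: lowercase, quoted attribute, tags closed in the reverse order of opening -->
<p class="intro">Welcome to <em><strong>my site</strong></em></p>
```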
Tables are for tabular data
If you have information that would be displayed in a table format on paper – a class schedule, for example – then a table is acceptable. They are not acceptable under any other circumstance. Tables should be used to display exactly that – information that makes sense to display in table format. Tables should not be used for layout. Use CSS and <div> instead.
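A class schedule marked up as a table might look something like the following sketch. The days and times are invented placeholders:

```html
<!-- A class schedule is tabular data, so a table is appropriate here -->
<table>
  <tr>
    <th>Day</th>
    <th>Time</th>
    <th>Class</th>
  </tr>
  <tr>
    <td>Monday</td>
    <td>9:00</td>
    <td>HTML and Web Design</td>
  </tr>
  <tr>
    <td>Wednesday</td>
    <td>13:00</td>
    <td>3D Studio Max</td>
  </tr>
</table>
```

Each row of the schedule is a <tr>, header cells are <th>, and data cells are <td>. Nothing here describes position or layout on the page.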
Frames were an intermediate solution to displaying boxed “floating” content on your site. Their only remaining purpose is to show the content of other websites within your own (which brings its own copyright issues). Frames make websites difficult to navigate, bookmark, and index in search engines. Avoid them wherever possible. Use <div> instead.
No <font> tags
Font tags have been superseded by CSS.
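As a hedged illustration of the difference, the sketch below shows the old approach and one possible CSS equivalent using an inline style. The text and the chosen font are placeholders, and in practice the rules would more likely live in a separate stylesheet:

```html
<!-- Old approach: appearance mixed into the markup with a font tag -->
<p><font face="Arial" color="red">Sale ends Friday</font></p>

<!-- CSS approach: the markup says "paragraph", the style says how it looks -->
<p style="font-family: Arial, sans-serif; color: red;">Sale ends Friday</p>
```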
NotePad only for coding
DreamWeaver and other programs won’t teach you how to code. They are useful, and we will be using them later in the course to speed up development, but for the first class we will solely be using NotePad to code. (You are of course free to use any program you wish in your own time, but your instructor will not provide support for them, or for the code generated by such programs).
If you are unfamiliar with these terms, do not be concerned – they are only emphasized for those who will need to unlearn bad HTML habits.
Rationales for learning HTML before turning to an application like DreamWeaver
“Web design programs are the Japanese infantrymen left behind on the Pacific Islands at the end of World War II.”
Because WYSIWYG programs (DreamWeaver, FrontPage, etc) will only ever be able to create the version of HTML they were programmed to code. Web design programs can be compared to the Japanese infantrymen left behind on the Pacific Islands long after World War II had ended – the ones who still believed that the war was ongoing because no-one had ever told them otherwise. DreamWeaver 4 will only ever be able to create the version of HTML that was current during its year of release, at least as far as its visual design tools are used.
Knowing only the graphic design end of a program – be it DreamWeaver or FrontPage – can be compared to only knowing how to drive an automatic car. If something goes wrong with the car, you are forced to take it to a mechanic. In web design, you are the mechanic.
Web design programs can introduce bugs into your code, or be incapable of coding a feature you need. Knowing HTML allows you to “get under the hood” of any web design program.
Web programs also tend to introduce more code than is needed for your page. (Microsoft programs, in particular, are notorious for this, but they are not alone). By “rolling your own” code you have a greater chance of creating leaner and lighter code, often 10–15% smaller than the equivalent WYSIWYG-created page. This translates directly to decreased download times and reduced site overhead costs for bandwidth and storage.
Coding sites with standards compliance allows you to “future-proof” your work.
A typical web page will have a structure similar to most real-world documents: a title, heading, body copy (in the form of paragraphs) and several sub-headings dividing the body copy into different sections.
HTML has six heading levels, from <h1> to <h6>. It is very important not to think of these headings in terms of “size” or “bigness”, but as levels of importance. <h1> is the most important heading in the web page, and is commonly the first element after the opening <body> tag. In an XHTML document <h1> is typically only used once per page, and may contain content that is related to the page title, the company name, or the name of the website. (It is possible for HTML5 pages to use the h1 element multiple times in different contexts, but as of this writing that is not generally recommended, due to issues with accessibility software.) <h2>, on the other hand, is a sub-heading, of which there may be several in a document. <h3> is a sub-sub-heading, and so on.
<p> tags demark paragraphs. Keep in mind that the content of every paragraph must be contained within an opening and closing <p> tag. A paragraph normally consists of several sentences; if <p> content consists of a single word or image, you should probably reconsider and use more appropriate markup.
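Putting headings and paragraphs together, the body of a simple page might be sketched as follows. The club name and all of the text are invented purely to show the structure:

```html
<body>
  <!-- One h1: the most important heading on the page -->
  <h1>Prairie Cycling Club</h1>
  <p>We are a group of riders who meet every weekend. New members are always welcome.</p>

  <!-- h2 sub-headings divide the body copy into sections -->
  <h2>Upcoming rides</h2>
  <p>Our next ride leaves from the river pathway. Bring water and a spare tube.</p>

  <!-- h3 is a sub-sub-heading within the section above it -->
  <h3>What to bring</h3>
  <p>A helmet is required on every ride. Lights are recommended for evening rides.</p>
</body>
```

Note that the heading level follows the outline of the document, not how large the text should appear.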
Naming and locating files is vital to making a site
HTML pages should be saved with the .html extension. While this isn’t vital on every page, 90% of the time the home page of your site will have to be named index.html in order to be picked up as the default home page, so you may as well stick to this naming convention for all pages.
Use alphanumerics only in page names. That is, a-z, 0-9. The only exceptions are: - (dash), _ (underscore) and ~ (tilde).
Never use spaces in the file name of anything destined for the web. Replace spaces with hyphens.
Use lowercase exclusively when naming files (some web servers are sensitive to case).
When planning a site, create a naming convention for pages and stick to it without exception.
Always title a page the moment you create it, using a titling convention you have created. Too often this task falls by the wayside and is neglected, resulting in pages that have irrelevant or confusing titles, or no title at all.
Remain aware where you are saving files, and be consistent about doing so. A great deal of frustration and confusion occurs because students are looking at the wrong version of the file they think they are working on. For that reason, I strongly suggest saving only one copy of any file: do not try to retain multiple versions of the same page. If you start a new page, give it a completely different (and appropriate) file name.
Be aware of changes made to files that you bring back and forth to class. To avoid confusion, I recommend creating files on your desktop and then dragging them to your USB memory stick / portable hard drive to take them home, trashing the files that remain on the desktop. If you return to class and want to resume work on the files, reattach the drive and drag the appropriate files back onto the desktop. (Alternatively, you may wish to work directly on the files on the external device, although this is not recommended).
Ordered lists are used when you want to make the order or importance of list items clear. For example, when writing a manual on the steps taken to defuse an atomic bomb, you wouldn’t want a simple list of bullet points. Much more appropriate would be a series of enumerated steps. (“1. Open the hatch. 2. Cut the blue wire” etc).
Ordered lists are enclosed by the <ol> tag. Each item in the list is marked with a surrounding <li> tag:
<h2>My top three favourite movies are:</h2>
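The list markup itself could be sketched along these lines, borrowing the defusal steps mentioned earlier as stand-in content. The heading text is an assumption:

```html
<h2>Defusing the bomb</h2>
<!-- ol: the browser numbers the items, because their order matters -->
<ol>
  <li>Open the hatch</li>
  <li>Cut the blue wire</li>
</ol>
```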
“Unordered” in the context of XHTML doesn’t mean that the list items are randomly sorted on the web page: it simply implies that it doesn’t matter which order the viewer reads them in. An unordered list begins with the <ul> tag. Items nested inside this tag are still marked as <li> (list items).
<h2>My hobbies are:</h2>
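A complete unordered list under such a heading might look like this sketch. The three hobbies are placeholder items chosen only for illustration:

```html
<h2>My hobbies are:</h2>
<!-- ul: the items are bulleted, and could be read in any order -->
<ul>
  <li>Cycling</li>
  <li>Photography</li>
  <li>Cooking</li>
</ul>
```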
Definition lists are typically used to define a series of terms. Under-utilised in web design, they are appropriate whenever you are seeking to make terms very, very clear, such as a legal document or a handbook. You can see a use of a definition list at the very start of this XHTML course, when I define what XHTML stands for.
A definition list consists of three tags. <dl> starts the definition list itself. The defined term is enclosed inside a <dt>. Finally, the definition itself (the definition declaration) is enclosed inside a <dd> tag. For example:
<dd>A small flightless bird, native to New Zealand.</dd>
The browser will present the definition list appropriately: by default, with the term in bold text and the definition indented.
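Putting the three tags together, a complete definition list might look like the sketch below. The term "Kiwi" is an assumption inferred from the definition:

```html
<dl>
  <!-- dt: the term being defined -->
  <dt>Kiwi</dt>
  <!-- dd: the definition of that term -->
  <dd>A small flightless bird, native to New Zealand.</dd>
</dl>
```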
Note that the definition list can be used with far greater flexibility than the encyclopedic purpose for which we've used it here. A list of products may be a definition list; I have used a definition list (with nested forms of other lists) in the course outline pages in the recommended books section. In that case, the title and a picture of the book are the definition term, while further details and an explanation of the book's purpose form the definition declaration.
Also note that a definition term may have multiple definition declarations beneath it. (Consider, for example, the multiple possible meanings of the term "haunt" in a dictionary).
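A sketch of one term with two declarations, using "haunt" as suggested. The wording of the two definitions is my own paraphrase rather than a quotation from any dictionary:

```html
<dl>
  <dt>haunt</dt>
  <!-- Two dd elements follow a single dt: one per meaning -->
  <dd>To appear in a place repeatedly as a ghost.</dd>
  <dd>A place that a person visits often.</dd>
</dl>
```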
Lists can be nested inside each other. Note that doing so will indent the inner nested list(s) inside the outer list. For example:
<h2>Things to do Today</h2>
<li>HTML and Web Design</li>
<li>3D Studio Max</li>
<li>Take citizenship test</li>
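In full, a nested list using the items above could be sketched as follows. The parent item "Teach classes" and the choice of unordered lists at both levels are assumptions made for illustration:

```html
<h2>Things to do Today</h2>
<ul>
  <!-- This li stays open until after the inner list closes -->
  <li>Teach classes
    <ul>
      <li>HTML and Web Design</li>
      <li>3D Studio Max</li>
    </ul>
  </li>
  <li>Take citizenship test</li>
</ul>
```

The inner <ul> sits inside the first <li>, before its closing </li>, rather than between two list items.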
Note the nesting of the first <li> tag around the nested list. Another example:
<li>A small flightless bird, native to New Zealand.</li>
<li>A person from New Zealand</li>
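One way these two items might fit into a complete structure is an ordered list nested inside a definition declaration. This is a sketch only: the term "Kiwi" and the surrounding <dl> are assumptions inferred from the definitions:

```html
<dl>
  <dt>Kiwi</dt>
  <dd>
    <!-- The numbered meanings are nested inside a single dd -->
    <ol>
      <li>A small flightless bird, native to New Zealand.</li>
      <li>A person from New Zealand.</li>
    </ol>
  </dd>
</dl>
```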
There are many ways of customising the appearance of lists, all of them the role of CSS, which we shall get to shortly.