Apple's decision to promote KDE's KHTML rendering engine and, by extension, the KHTML-based Konqueror to Major Browser status makes web standards important again. Webmasters who test only in Microsoft Internet Explorer are going to have Macintosh users, not just Linux users, complaining about broken HTML.
Fortunately, the new Apple browser generates a confusing User-Agent header, which helps discourage "browser sniffing". Better to make your site correct, anyway.
If your site obeys the standards and a browser messes it up, you can count on the now-competitive browser developers to fix it. If your site is incorrect, expect complaints.
So, how do you make sure your site is valid HTML and not simply cut-and-pasted, "looks fine to me" HTML?
Liam Quinn's WDG HTML Validator is written in the nearly ubiquitous Perl and runs as a CGI script, so you can install it on one system and use it from anywhere. You don't need to install it on your production web server; any system on the Net will do.
You can try out the Validator on the WDG site, but if you have a lot of pages to fix, it's faster and more polite to install it at your own site. WDG also has a nifty set of HTML tag reference pages, linked to and from the Validator results, that help you understand and fix your mistakes. I installed it in minutes from the Debian packages; RPMs also are available.
I dropped in the URL of my fresh, clean personal home page, created with stylesheets in my best attempt at HTML 4.01 Strict, and foolishly expected it to validate cleanly. No way. The Validator started complaining beginning at the <body> tag.
Error: there is no attribute BGCOLOR for this element.
What? I've been putting bgcolor in body tags almost as long as I've been writing HTML! Time to hit the book, Dynamic HTML: The Definitive Reference, 2nd Edition, and see what's up. Aha! This attribute is deprecated in HTML 4.01, and I'm using the "strict" DTD, which leaves deprecated attributes out, so it's time to move bgcolor to the stylesheet where it belongs. It's not a big thing, but it makes the actual page a little smaller and lets me change all the colors in one place.
body {
background-color: #aaaaaa;
}
Now, next time Talk Like a Pirate Day rolls around, I can change everything to white text on a black background with a single edit to the stylesheet, then concentrate on me prose, mateys.
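For illustration, here is roughly what the HTML side of that change might look like. This is a minimal sketch, not the actual page: the filename style.css, the title and the paragraph text are all made up, and the stylesheet is assumed to hold the body rule above.

```html
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
    "http://www.w3.org/TR/html4/strict.dtd">
<html>
<head>
<title>Home page</title>
<!-- style.css is a hypothetical filename; it holds the body rule -->
<link rel="stylesheet" type="text/css" href="style.css">
</head>
<!-- Was: <body bgcolor="#aaaaaa">, which the Strict DTD rejects -->
<body>
<p>Page content goes here.</p>
</body>
</html>
```

With the color out of the markup, every page that links the same stylesheet picks up a color change without being edited itself.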
But what's this? My page has a link to http://news.google.com/news?q=linux&scoring=d to easily catch up on the Linux-related news. But the Validator says:
Error: general entity scoring not defined and no default entity
Fortunately, that's in the Common Problems section. Time to replace that ampersand in the link with an &amp; entity. Here's another one:
Error: element NOBR undefined
I hit the book, and it turns out the <nobr> tag was never standardized at all; it's "folk HTML" that browsers happen to recognize. In this case, I'll delete the tags, chill and let the browser flow the text the way it wants.
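Put together, the last two fixes might look something like this in the markup. It's only a sketch: the link text, and the idea that the <nobr> was wrapped around this particular link, are assumptions for illustration, since the original markup isn't shown.

```html
<!-- Before: bare ampersand in the URL, plus a nonstandard <nobr> -->
<p><nobr><a href="http://news.google.com/news?q=linux&scoring=d">Linux news</a></nobr></p>

<!-- After: ampersand written as an entity, <nobr> removed -->
<p><a href="http://news.google.com/news?q=linux&amp;scoring=d">Linux news</a></p>
```

The browser turns &amp; back into a plain ampersand before it requests the URL, so the link still goes to the same place.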
The next item on the list of HTML mistakes occurred on line 129:
Error: end tag for TT omitted, but its declaration does not permit this
followed by this on line 131:
Error: end tag for element TT which is not open
Aha! This plainly is sloppy HTML. I had a <tt> started inside a <p>, but the </tt> was after the </p>. It looks fine in the browser I use, but this is exactly the kind of mistake that makes different browsers react differently. Remember, most of the significant differences among browsers are in how they react to mistakes and not how they deal with correct HTML. Before you start sniffing User-Agent and such ugliness, make sure your pages are standard.
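A minimal sketch of that mistake and its fix follows. The paragraph text is invented; only the tag nesting matters. In the first version the </p> arrives while <tt> is still open, and the stray </tt> turns up two lines later, which would account for the pair of errors.

```html
<!-- Before: <tt> opens inside the paragraph but closes after it -->
<p>To install, run <tt>make install
</p>

</tt>

<!-- After: <tt> opens and closes inside the same paragraph -->
<p>To install, run <tt>make install</tt>
</p>
```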
Next, in a line with a <blockquote> tag, there's
Error: character data is not allowed here
Checking the reference page linked to from the Validator results, here's the problem: "The content of the BLOCKQUOTE element should be contained within other block-level elements, typically P." Time to make sure that instead of using <blockquote>, I'm using <blockquote><p>.
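As a sketch of that change, with placeholder text standing in for the real quotation:

```html
<!-- Before: bare text directly inside <blockquote> -->
<blockquote>
Quoted text goes here.
</blockquote>

<!-- After: the text is wrapped in a block-level <p> -->
<blockquote>
<p>Quoted text goes here.</p>
</blockquote>
```

A longer quotation would simply get one <p> per paragraph inside the same <blockquote>.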
After a few more errors, the process gets tedious. Why didn't I validate this thing to start with and fix errors one at a time? Why did I write a quick-and-dirty conversion script that wasn't careful about matching <p> and </p>? A quick detour to a friend's page shows his first error on line 1. Ha! It's not only me.
All along, though, the Validator output makes it easy to track down the problems. Using Mozilla tabs, I can pop between the page in the browser and the Validator results.
Finally, the "Congratulations, no errors!" message appeared. Fixing a personal home page is a tiny amount of work compared to repairing damage in a deep, automatically generated site. When writing software to crank out HTML automatically, it's worth the extra time to feed the output through the Validator to make sure it's right from the ground up, instead of having to troubleshoot when a new browser, or a new version of an old one, comes along.
Don Marti is editor in chief of Linux Journal.
Re: Fixing HTML with the WDG HTML Validator
On January 23rd, 2003 Anonymous says:
Then you check your stylesheet and find you should not have :
body {
background-color: #aaaaaa;
}
but something like :
body {
background: #aaaaaa; color: white ;
}
just never ends....
Re: Fixing HTML with the WDG HTML Validator
On January 19th, 2003 Anonymous says:
What about Mozilla? Isn't Mozilla Standardized enough for you? Let's all dump our GNOME installs, and only use KDE and Konqueror because DonMarti says we need to "All Hail KHTML".
What's that? Mozilla is too big? Use Phoenix then. Or Galeon.
DonMarti, stop polluting the world with KDE Propaganda, and let us have our freedoms of choice. If I want to use KHTML based browsers, then I have the choice to, except I don't choose to use it.
Re: Fixing HTML with the WDG HTML Validator
On January 23rd, 2003 Anonymous says:
Dude... Chill.
Re: Fixing HTML with the WDG HTML Validator
On January 21st, 2003 Anonymous says:
>DonMarti, stop polluting the world with KDE Propaganda,
>and let us have our freedoms of choice. If I want to use KHTML
>based browsers, then I have the choice to, except
>I don't choose to use it.
Okay, then don't use it. You still have the choice, for goodness' sake.
completely missing the point.
On January 20th, 2003 xtifr (not verified) says:
I don't use KDE (or GNOME), I have little or no direct interest in KHTML, but nevertheless, I think it was a good headline, and I think you're missing the point when you complain about it. Don isn't saying that everyone should switch to KDE. He's saying that because Apple is now using KHTML, instead of IE, the decision to support only IE on your website is even more of a bad idea than it once was. The fact that KHTML is about to become very popular is good for all of us -- I'm a Mozilla user myself, and as a Mozilla user, I'd like to join Don in saying "all hail KHTML!" :)
Re: Fixing HTML with the WDG HTML Validator
On January 19th, 2003 Anonymous says:
> I hit the book, and it turns out the <nobr> tag was never
> standardized at all; it's "folk HTML" that browsers happen to
> recognize. In this case, I'll delete the tags, chill and let the
> browser flow the text the way it wants.
in fact you can use '&nbsp;' to emulate <nobr>. see also
http://www.w3.org/TR/html401/struct/text.html#h-9.3.2.2
cheers....
Re: I love
On January 20th, 2003 Anonymous says:
The problem with using &nbsp; is that it has to be used between all words
grouped, etc. Also, depending on the browser, pushing 2 image tags side-by-side
won't necessarily prevent a line break. <nobr> is very effective here as well.
For a tag that hasn't been official in ages, <nobr> is widely supported and very
handy. I use it but never count on it.