
Why standards-compliant HTML matters


Raphaël Mazoyer


Web technologies have always been misused to achieve effects they were not (yet) designed for. Creative use of computer tools is a very common and natural human drive, and sometimes a very fruitful approach, but it has had serious negative consequences for the quality of web products.

When the first graphical browser, Mosaic, was released, nobody talked yet about the layout of pages. It was a graphical user interface for an essentially text-based browser, and the web was still all about content. (Side note: content, and ASCII art--another creative use of technology that we humans can perceive and appreciate, while it makes no sense at all to the computer.)

The first additions of features like the ability to add inline graphics did serve information purposes, but they also triggered creative, unintended uses. Netscape 2.0 introduced tables, and things continued evolving dramatically in that direction. Proprietary extensions that were soon turned into the official HTML standard allowed webmasters to display tabular data, but their use was quickly repurposed. Tables, images, frames and font colors, designed as containers for a certain type of information, were (and still are) being used for visual purposes.

Bad examples

The frame was supposed to separate independent elements of content and to allow the rational display of content and navigation, or more generally of pieces of content that are related but of a different nature. The actual use was pretty close to the original idea: here the content, there the logo and main nav, here the footer and copyright notice. However, empty frames were quickly added to emulate various behaviors deemed desirable by someone on the project team: to display only one URL for the entire site (at some point, some considered it clean!), or to center the site's content in the browser window.

Similarly, tables were meant to organize information in rows and columns. However, it became clear pretty quickly that with invisible borders, tables could be used to cut pages into regions of a controlled size, to hold various pieces of text or images together in a visually satisfying manner, to selectively apply a background image or color, etc. More than abundant literature was written on the topic, and some graphic design packages even included the ability to produce table-based web layouts on the fly.

Again, inline images (i.e. embedded in the text) were intended to display, well, visual information: graphics, photos, logos, and the like. But those were soon repurposed to participate in layout and branding. Overall page visuals, as created by graphic designers, were being built out of bits of images put together in a totally meaningless manner--which looked good in a graphical browser. Large images containing everything (from the site's identity to its textual content) were a way to fully control the layout. But the crowning glory of this practice was the spacer gif, a 1x1 pixel image which conveyed strictly no information at all, but enabled HTML coders to control the behavior of tables quite finely.

Layers and scripts, <font> tags and other remnants of the first browser wars show the same repurposing pattern.

Creative (but unorthodox) use of technology

Creative minds were applied to the problem of creating visually appealing pages with the limitations of the very inappropriate tools available. And creative solutions were found. And the web started to look good. Designers became more and more astute at using those tricks, and attained a certain balance in the efficient but totally unorthodox usage of the technologies.

The problem here is that each site was creating its own standard of information encoding. This essentially defeated the original purpose of HTML and neutralized one of its most powerful uses: packaging information in a way that the computer can process, even if the other major purpose, getting information across to users, was reached in most cases. Basing their work on the reality that most users would access the content via a certain browser, web builders started tweaking their input until they were happy with the output.

Ideally, all web sites should feature structurally correct markup: <h1> describing a top-level header, <blockquote> wrapping a quotation block, headers organized in a logical way, <em> or <strong> applied to mark emphasis, tables to display tabular data, etc. When you have this, you reach two goals:

  • get your information across to your users: because all browsers will display that structurally correct code in a good (but probably boring or ugly) manner
  • make that information usable by a computer: any program capable of parsing HTML can be made aware of much of the content's context, structure, and highlights
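
The second goal can be illustrated with a short sketch. It assumes Python and its standard html.parser module, and the page fragment is invented for the illustration; it is not taken from any real site. Because the headings are marked up as headings, a few lines of code can recover the document outline. The same text styled with <font> tags would give the program nothing to work with.

```python
from html.parser import HTMLParser


class OutlineParser(HTMLParser):
    """Collect (level, text) pairs for h1-h6 headings."""

    HEADINGS = {"h1", "h2", "h3", "h4", "h5", "h6"}

    def __init__(self):
        super().__init__()
        self.outline = []    # finished headings, in document order
        self._level = None   # level of the heading being read, if any
        self._text = []      # text fragments of that heading

    def handle_starttag(self, tag, attrs):
        if tag in self.HEADINGS:
            self._level = int(tag[1])
            self._text = []

    def handle_data(self, data):
        # Only keep text that sits inside a heading element.
        if self._level is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag in self.HEADINGS and self._level is not None:
            self.outline.append((self._level, "".join(self._text).strip()))
            self._level = None


def extract_outline(html):
    """Return the list of (level, title) headings found in html."""
    parser = OutlineParser()
    parser.feed(html)
    parser.close()
    return parser.outline


# Hypothetical page fragment, invented for this illustration.
SAMPLE = """
<h1>Why standards matter</h1>
<p>Intro text with <strong>emphasis</strong>.</p>
<h2>Bad examples</h2>
<blockquote>A quotation.</blockquote>
<h2>Consequences</h2>
"""

if __name__ == "__main__":
    # Print the outline, indented by heading level.
    for level, title in extract_outline(SAMPLE):
        print("  " * (level - 1) + title)
```

Feeding the same function a visually similar fragment such as <font size="6"><b>Title</b></font> yields an empty outline, which is the point: the structure is only there if the markup states it.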

But the actual practice was to use those structural elements for the visual effect they produced. And web designers would mix and match happily until the desired effect was reached, when viewed in a particular browser. Besides the enormously costly cross-platform discrepancies that arose from that practice, the consequence was HTML code that did not separate the content and its structure from the presentation.

Terrible consequences

All sorts of bad things happen when you do this, most of them related to the limits of using a technology in ways it was not intended for:

  • lessened impact on search engines, as related information may be technically unavailable for the search spiders despite being visible to most of your visitors
  • access difficulties for people using a browser other than the one(s) you used as your reference target, such as people with a disability
  • bandwidth waste, as the ratio between content and markup in your HTML code plunges to abysmal levels
  • terrible difficulties in maintaining the content, changing the layout, and dealing with new access technologies (mobile phones, PDAs, RSS feed readers).
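
The bandwidth point can be made concrete with a rough measurement. The sketch below assumes Python's standard html.parser module. Both fragments, the spacer.gif file name and the content_ratio helper are hypothetical, and the ratio of visible text to total characters is only a crude proxy for real page weight.

```python
from html.parser import HTMLParser


class TextCollector(HTMLParser):
    """Accumulate the text a visitor would actually read."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def content_ratio(html):
    """Return visible-text characters divided by total characters."""
    if not html:
        return 0.0
    collector = TextCollector()
    collector.feed(html)
    collector.close()
    # Normalize runs of whitespace so indentation does not count as content.
    text = " ".join("".join(collector.parts).split())
    return len(text) / len(html)


# Hypothetical table-based fragment with a spacer image and <font> tags.
TABLE_LAYOUT = (
    '<table border="0" cellpadding="0" cellspacing="0" width="600"><tr>'
    '<td width="20"><img src="spacer.gif" width="20" height="1" alt=""></td>'
    '<td><font face="Arial" size="5"><b>Opening hours</b></font><br>'
    '<font face="Arial" size="2">Monday to Friday, 9 to 5.</font></td>'
    "</tr></table>"
)

# The same content with structural markup only.
SEMANTIC = "<h1>Opening hours</h1><p>Monday to Friday, 9 to 5.</p>"

if __name__ == "__main__":
    for name, fragment in (("table layout", TABLE_LAYOUT), ("semantic", SEMANTIC)):
        print(f"{name}: {content_ratio(fragment):.0%} of the bytes are content")
```

Both fragments carry exactly the same words; only the share of the transfer spent on markup differs. A real audit would also count stylesheets, images and scripts, which this sketch ignores.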

Nowadays, content is often pulled from the database of some Content Management System, and the ugly structural-markup-for-visual-purposes could be part of the display template. In that case, the last consequence obviously no longer applies. But then you get something else:

  • content delivery must be handled on a per-method basis, with a different template handling the display for each different access method

And that assumes that you used only clean markup when maintaining content in your CMS, which is not always possible (for example, browser-based rich-text editors such as the very clever and powerful HTMLArea deliver terrible code).
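
As a minimal sketch of what per-method templates can look like when the stored content is clean, the example below assumes Python's standard string.Template and html.escape. The article record, the two templates and the render functions are all hypothetical and do not describe any particular CMS.

```python
from html import escape
from string import Template

# Hypothetical article record, as a CMS might store it: structure only,
# no presentational markup.
ARTICLE = {
    "title": "Opening hours",
    "paragraphs": ["Monday to Friday, 9 to 5.", "Closed on public holidays."],
}

# One template per access method (both invented for this illustration).
HTML_PAGE = Template("<h1>$title</h1>\n$body")
PLAIN_TEXT = Template("$title\n$rule\n\n$body")


def render_html(article):
    """Render the record as structural HTML for a desktop browser."""
    body = "\n".join(f"<p>{escape(p)}</p>" for p in article["paragraphs"])
    return HTML_PAGE.substitute(title=escape(article["title"]), body=body)


def render_text(article):
    """Render the same record as plain text, e.g. for a small device."""
    title = article["title"]
    body = "\n\n".join(article["paragraphs"])
    return PLAIN_TEXT.substitute(title=title, rule="=" * len(title), body=body)


if __name__ == "__main__":
    print(render_html(ARTICLE))
    print()
    print(render_text(ARTICLE))
```

Adding a new access method then means writing one more template. If the stored paragraphs already contained <font> tags or layout tables, every template would first have to undo them.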

So the case for standards-compliant, structurally correct HTML markup is the following: because it makes use of the features of HTML as they were designed, it is efficient in what it does (describing the nature of various pieces of content that compose a document), and it is future-proof (if need be, good but deprecated markup can be easily converted to a newer version).

This does matter

In practice, this must serve as a warning against creative and entirely unintended uses of a technology in the domain of content management, storage and maintenance. Of course, we're talking here about taking this element into account in your decision process: in some cases, the short-term benefits may outweigh the long-term drawbacks. Honestly going through the list of negative aspects above with the next few years in mind may help you find a site's sweet spot in this compromise.

Take some time to carefully review the costs of the drawbacks above:

  • "searchability" (increasing your site's affinity with search engine spiders, ensuring that the site's information registers in search engines) greatly increases the impact of a site, and is a very affordable improvement (a "low hanging fruit" waiting to be plucked). Many other factors are important in this regard, such as the number and wording of links leading to your site, but while not difficult at all to achieve, structurally correct markup can help very much, particularly for specialized content (business-specific vocabulary, original material, etc.).
  • while a limited population (say, people with a major visual handicap) might be so small a part of your intended audience that it does not seem to be worth the effort, bad press, customer complaints and legal pressure may force you in the end to cater for their specific needs. And then, patching things up once the site is up is often much more painful and expensive than doing them right in the first place.
  • caring about bandwidth waste may sound ludicrous now, in our days of cheap and plentiful broadband access. But bandwidth usage is not always free, and even if it is, the hosting provider may still choose to shut down your site for abuse. In all cases, having a wasteful site puts a larger price tag on success, thereby lessening one of the positive aspects of the online medium.
  • at this point, many sites are generated from the database of a CMS, and therefore the impact of content repackaging is usually limited to creating a new template. However, the repackaging trajectory for a given target platform may be more or less costly depending on the current state of the information.

Assessment in your context

In the context of a particular web site, the web builder will have to assess the importance and relevance of the various factors influencing the technical realization. The checklist above can help, but many other factors come into play, such as the intended audience and its social and technical characteristics, the expected lifetime of the site, as well as its lifecycle (the way it will be built, used, reused, and archived or destroyed). And the cost of standards-compliant building must be taken into consideration.

Regarding costs and efforts, here are a few hints:

  • old habits die hard, and several activities, such as visual design and coding, need to be tackled in a radically different manner. Switching to standards-compliant coding is not an implementation detail, and it may cost a team quite a bit of effort. Don't decide this at the last minute without adapting your project timeline.
  • tables-based layouts are easy to build and rather reliable (within their limits), while CSS-based layouts on top of structurally correct HTML are still quirky and tend to irritate graphic designers, as they are quite a departure from the safe tables- and paper-based design. Ensure that all parties involved in designing and producing the visual aspect of the site are aware of the change.
  • it took time to build up experience in tables-based layouts, and it will take time to come up with solutions to all problems for CSS-based layouts. It is normal that something really easy to do with tables should take more effort with CSS, and it will also take a while for everyone on the team to know how to relate to other people's tasks in the new context.

Not for you?

In some cases, you might decide that the standards-compliant route is too expensive and too troublesome for your project.

Building for an intranet with a tightly controlled fleet of machines and browsers is not really an argument against standards-compliant building, and should only be considered in combination with other, more compelling factors. Indeed, while designing for one browser may save a bit of effort, or provide extra possibilities, it also increases the reliance of the site on external factors you have no control over: not only could the browsers change, following a company-wide policy reversal, but the company making the browser could choose to change the software's technical characteristics in future versions. With standards-compliant code, you're safe in both cases.

The single most relevant reason for building a site that is not standards-compliant is time: for example, if you are creating an event site which will be taken offline within a few weeks or months of its launch. In that case, layout and content will probably be extremely tightly integrated (as in "rich media" sites...), and you won't care much about the drawbacks mentioned above.

However, in most other cases, there is simply no reasonable business case to be made for non-standards-compliant web building, as it is very costly in the long term while not necessarily being cheaper in the short term.

What you can do

Switching over to such building methods isn't simply a case of reading a few good books and "doing it". You have to want it, and to rehearse a lot. In practice, watch out for the following:

  • the graphic designers should be aware of and familiar with the characteristics and peculiarities of good HTML+CSS web sites, to use them at their best
  • someone has to be responsible for content markup and structure, which is a different job than HTML coding (but requires some understanding of the same technology), and a different job than copywriting (but also requires some understanding of the content and the communication objectives)
  • the technology must be tamed for that specific use: some CMSs simply can't handle this in a proper way
  • all stakeholders, including the client, should be (made) aware of the benefits as well as the requirements

Suggested further reading

For project managers: W3C tips to set up a standards-compliant web site and to choose a web agency.

Much is already available on what is actually needed for standards-compliant web building, particularly from A List Apart and in Zeldman's book (primarily targeted at graphic designers making the switch from a paper-based approach).

For the more technical people: The web standards project's blog about standards-compliance, and the W3C validator.

An EMMA/Digital Media Design graduate from the HKU in Holland, Raphaël co-founded the now-defunct web consulting outfit Splandigo. My personal site: Petit Bourgeois: commentary about web building. I left Splandigo in August 2005 and joined Weathernews in Makuhari, Chiba prefecture, where I coordinate the work on our 30-odd European mobile and web services. You can reach me at evolt.contact [at] direct.phase4.net.

Bandwidth

Submitted by Xanadu on January 28, 2005 - 06:37.

Designers should always strive to make their site as lean as possible. I still see BMP files used on the web! A compressed image format should always be used instead as the files will be much smaller. I also see large images scaled down by the code to fit a smaller size. These should be resized in a graphics program first. (Some designers like to use actual full-size photographs as thumbnail images. A very bad habit - I always press the STOP button on my browser when I see these loading. Perhaps the designers think it's cool how the photographs appear instantly when you click on the thumbnail (as the image has already loaded!) or they just don't know how to resize them.)

Code such as PHP should also be made compact. I often update my code when I find a way to make it smaller or faster.

Bandwidth should be considered an issue because not everyone has broadband! A lot of people seem to ignore this fact, thinking you must be daft not to upgrade from a 56K modem, but not everyone can. (Broadband may not be available where the user lives, or they cannot afford the monthly fee.) So if you use a lot of large files, your site will be slower for non-broadband users. And what about mobile (cellphone) speeds? Again they are likely to be quite slow.


Re: Bandwidth

Submitted by Alfatrion on February 8, 2005 - 05:35.

Offtopic: I agree with Xanadu about the bandwidth. If the loading time is too long, people will find your site slow and won't wait for it to load. People find a website fast if it downloads within the human response time (2 seconds) and slow if it takes more than 4 seconds. This means striving to keep each page below 30KB. But I disagree about the PHP. This code is executed on the server and therefore it doesn't have to be small, but it does have to be fast.


bandwidth does matter, and revisiting your code

Submitted by kasimirk on February 14, 2005 - 02:35.

While 56k modem users' numbers may be decreasing, mobile users are becoming more common, and for them bandwidth does matter.

Standards-compliant code (be it HTML or some programming language) has one important advantage not mentioned here: when you or somebody else has to revisit the code five years later, it is easier to figure out what the code is intended to achieve.


None

Submitted by biolight on February 16, 2005 - 12:36.

Compacting your PHP code is totally off. It doesn't matter how many lines it is... it matters how efficient your operations are, and the structure of your application.

There are a few unmeasured costs of XHTML that articles like this one often fail to mention:

1) Training: Many companies cannot afford to change gears midstream, and many employees are not willing to stick their necks out to implement new technology. It takes a really strong position to advocate web standards at a bank, for example. Getting folks fluent (not just enthusiastic) has a bottom-line dollar cost.

2) A clean break from existing technology is necessary if you're using any kind of application methodology. This can be insanely difficult on legacy systems, even those of moderate complexity.

3) Future resistance? Despite the recent success of FireFox (I'm a user), the vast majority (and I mean VAST) are still using a Blue E for browsing the Internet. M$ has put forth the most meaningless press release I have ever seen about IE7. They included NO details about what IE7 was going to be. It may break standards, in which case we'll find ourselves in quite the dilemma.

4) Beyond bandwidth and Gov. compliance (both good things), there are only a few compelling reasons to convert existing commercial sites (WAP browser compatibility? Is that a big money maker for most business websites?)

While I agree that we should be moving towards web standards, they are not the panacea that many developers make them out to be. I would never architect a new project without using web standards myself, but I think it is sometimes worth looking at the business impact of new technologies. I support web standards, but I think it's also important to look at some of the challenges that are making implementation so slow. Are there any I missed?

Jonathan


The price of new technology vs. known stuff

Submitted by raphael on February 16, 2005 - 23:57.

Clearly, an application that can't generate standards-compliant code, or for that matter that has difficulty changing the type of code it produces is a hazard, and a major hurdle. Internal applications in general do obey different rules, because of this control that the IT department feels it has on the whole environment--control that is often abused (thereby creating significant interdependencies between different parts of the infrastructure, where none need to exist).

However, while I quoted the searchability with Google and the public internet in mind, it seems obvious that information pages on an intranet that aren't searchable because of the way they're organized and coded, or a web application that causes its users to waste significant amounts of their client's time (following a crash or a major error in the interaction process) -- are both examples of costs that can be brought into the picture (this is not fiction, I personally witnessed both situations).

The current hot topic of folksonomies might ultimately render searchability moot, while good interaction design and quality coding might also be achieved with non-standards-compliant technologies. However, my point was: using technologies in the way they were intended to be used adds their power to your own goals, instead of merely allowing you to reach those.

IE-type HTML used in a 300-screens-long page containing all products offered by a bank is both bad design and bad thinking, and it's a major business hazard, leading to a significant waste of time and opportunities. Standards-compliance can be seen as the least important factor in fixing the situation, while being costly and difficult to implement, for the reasons you rightly mentioned, Jonathan. However, I see it as part of a package that ensures that technologies will actually support your goals. Yes, tables-based HTML gets information on a screen. But well-formed XHTML means you've thought about your information structure, can be parsed by the internal search engine spider, can be output to other formats than screen reading, can be maintained more easily, etc.

WAP browser compatibility sounds ludicrous, I must admit. But how ludicrous is it, to be stuck in a non-business setting with no internet access handy, with a potential big client nearby who's expecting to get actual information from you about your company's products or services, and to be unable to pull that information from the intranet because it's not possible to navigate it on your mobile phone (while it very well could be)? How ludicrous is it to have to equip your sales force with a certain type of PDA, regardless of the rest of its merits, because the company's intranet simply can't be accessed with non-IE browsers? How ludicrous is it to be unable to deal with a certain disability within your workforce, when it poses no other problems than the non-accessibility of the information infrastructure?

Yes, standards-compliant code has a significant cost, particularly in complex environments with important dependencies. But I believe that using technology in the way it was intended to be used ensures major benefits. And while getting the IT people at a bank to change their minds about anything at all is difficult, it seems to me that they have no valid point in sticking to a non-modular approach of the display layer.


brilliant article

Submitted by jessicalo on February 25, 2005 - 08:40.

Damn that article is brilliant. I found it very interesting. -jessica


maintenance

Submitted by trellix78 on March 1, 2005 - 20:05.

This article is very timely. I'm stuck maintaining a private intranet site that's being converted to public use with over 3000 documents scattered over 5 gigabytes and it's driving me nuts. The original author(s) are long gone and the original software used to create the HTML pages is no longer available. It's frustrating telling my client that it'll take me 5 hours to make a simple change because either I have to buy software to make sense of a horrible table-based layout or scroll through a thousand lines of HTML code myself to find 200 lines of content. Maintenance of any kind (of software) becomes a very big and expensive issue when standards aren't followed.


IE holding back the web for years

Submitted by hesido on March 3, 2005 - 06:21.

We can talk about standards all day long, but IE doesn't care about them. I will go as far as saying that this is done on purpose.

  • PNG, accepted in 1996, with its multi-bit alpha could have saved hundreds of thousands of work hours, simply because you would never have to worry about aliasing on the edges of your buttons, no matter what background they were on. Yet, it is not implemented in IE6.
  • Min/max height and width, a css standard for years, which would enable you to do layouts that scaled perfectly for big monitors yet wouldn't be rendered silly when scaled down, is not supported. How hard is this to code, I wonder.
  • The box model still doesn't behave as the W3C standards specify: a div enlarges with floating content.
  • The list goes on...

To remind the reader: almost all of these except proper PNG support can be achieved using some hacks. The PNG hack is there but it is unacceptable to me. The thing is, MS could have activated that hack by default for all PNG images, yet they don't. Even if they added PNG support to IE6 this very day (which they won't, be sure of that), it would take 3-4 years before people would be able to rely on it, because of all the old IEs flooding the world. I personally adhere to standards, but I don't have the face to tell others to use them, all because of the MS monopoly and its stance against standards.


Re: IE holding back

Submitted by raphael on March 4, 2005 - 01:19.

You definitely have got a point, here. However, I contend that designing for web standards also requires changes in the visual approach, and particularly a departure from paper-based, full-control layouts. IE bugs and other unexpected behaviors are never as annoying as when a designer is trying to achieve a specific effect at all costs. Not that we should simply give up on controlling our designs, but I do believe that there is untapped power in using the technologies as they were meant to be used, in a somewhat more flexible approach to results. Not only does it ensure that products fare better in the long term, but it also generates less frustration when dealing with browser issues. The need for hacks is a symptom that a technology is being abused, which is not morally wrong, but often proves to be much more hassle than the designer or web builder thought she was bargaining for.


two years later...

Submitted by AxelF on March 14, 2007 - 09:23.

OK, this article is a little bit old, but I have only just discovered it. At the moment I'm trying to implement a PHP solution for mobile devices (cell phones) and the LACK of any standards there is very annoying (even with Opera Mini). So I have to agree absolutely with your article! Even now - two years later - it's not easy to code for multiple platforms. Regards Axel


Heartfelt plea

Submitted by t4tw on June 28, 2007 - 01:32.

Optimized code is indeed essential in a country like Poland, where many people still have relatively low bandwidth and face long loading times. In the area where I live I myself ‘suffer’ from the lack of access to fast Internet, not because I cannot afford it, but because of certain physical restrictions. So, at this point I would like to make a plea to web page designers. If you happen to be one, please do us a favour and create pages that are accessible to anyone! My own site www.t4tw.info is built according to standards and is maximally optimized to meet the demands of the poorest bandwidths. If you are a Polish user I can also recommend checking the translated documents concerning the subject.


I thought you should know about this.

Submitted by nicksoper on November 13, 2007 - 13:09.

Sorry for my first post being so lame, but I thought you should know: The W3C validator shows some errors - kinda contradictory of the title of the post



Evolt.org is an all-volunteer resource for web developers made up of a discussion list, a browser archive, and member-submitted articles. This article is the property of its author; please do not redistribute or use elsewhere without checking with the author.