HTML (Hypertext Markup Language)
Expanded version: contains additional text not in the book


HTML is the document formatting language used to create hyperlinked Web pages. An HTML file is a plain text file that employs "markup tags" to indicate where text formatting such as bold and italic should be applied. A typical HTML file is stored on a Web server, where it is accessed by a client using a Web browser. When the client requests a page, the server transfers the corresponding HTML file to the client's browser using HTTP (Hypertext Transfer Protocol). The browser then translates the HTML code into a formatted and hyperlinked document.

HTML documents have the filename extension "HTML" or "HTM." The Web browser in which they are displayed acts as a container for text and "objects" such as images and sounds. The objects are not actually stored in the document. Instead, tags containing external references to the objects are inserted at appropriate places in the HTML text. Thus, a complete HTML page consists of the HTML file itself along with any graphics, multimedia files, and other files that it references. All of these objects must be online and available when the user opens the page.
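
As a rough illustration of how a page refers to objects stored outside the HTML file, the following Python sketch uses the standard library's html.parser module to list the files a document points to. The sample markup, the file names in it, and the helper name ObjectReferenceCollector are all hypothetical, and the list of attributes checked is illustrative rather than complete.

```python
from html.parser import HTMLParser


class ObjectReferenceCollector(HTMLParser):
    """Collects the external files that a page refers to (hypothetical helper)."""

    # Tag/attribute pairs that commonly point at external objects.
    # This is an illustrative list, not an exhaustive one.
    REFERENCE_ATTRIBUTES = {"img": "src", "embed": "src", "script": "src", "link": "href"}

    def __init__(self):
        super().__init__()
        self.references = []

    def handle_starttag(self, tag, attrs):
        # html.parser lowercases tag and attribute names before calling this method.
        wanted = self.REFERENCE_ATTRIBUTES.get(tag)
        for name, value in attrs:
            if name == wanted and value:
                self.references.append(value)


# Made-up page: the text lives in the file, the image and sound live elsewhere.
SAMPLE_PAGE = """
<HTML>
<BODY>
<H1>Product Tour</H1>
<IMG SRC="images/logo.gif" ALT="Company logo">
<P>Listen to our welcome message.</P>
<EMBED SRC="sounds/welcome.wav">
</BODY>
</HTML>
"""

collector = ObjectReferenceCollector()
collector.feed(SAMPLE_PAGE)
collector.close()
print(collector.references)
```

Each entry in the resulting list is a separate file that the browser would have to fetch in addition to the HTML file before it could display the complete page.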

Hyperlinks are the most important feature of the HTML scheme. A hyperlink takes a user directly to another location in the same document, to another document at the same site, or to a document at another site. Hypertext is nonlinear text: it allows you to switch quickly to a reference or another source of information with the click of a mouse button, then jump back and continue reading where you left off. A historical perspective on hypertext is given under "Hypermedia and Hypertext."

A Very Short Introduction to HTML

As mentioned, HTML documents are plain text documents that you can create with any text editor, even Windows Notepad, or with a word processor that can save plain text. However, for large projects, it is best to use Web development applications such as Microsoft FrontPage or Macromedia Dreamweaver. These applications let developers construct Web pages in a page layout format while creating HTML code in the background. Text is formatted as with any word processor, and objects such as pictures and sounds are inserted on the page by simply specifying the external file to insert.

You can view the HTML code for any Web page. Open any Web page, then choose Source (or Document Source) from the View menu of your browser.

The HTML tags are all the items enclosed in angle brackets (< and >). Most tags come in pairs: an opening tag and a closing tag. For example, to boldface some text, you insert the tag <B> at the beginning of the text and the tag </B> at the end of the text. A very simple HTML document with these tags is shown below. If you type this text in a text editor and save it as a plain text file with the extension HTML, the file will open as a formatted HTML page in any Web browser.

<HTML>
<HEAD>
<TITLE>Your Company Name Home Page</TITLE>
</HEAD>
<BODY>
<H1>Welcome to Company Name</H1>
<B>Text between these tags is boldfaced </B>
<I>Text between these tags is italics</I>
<UL>
<LI>This is the first bulleted item
<LI>This is the second bulleted item
<LI> This is the third bulleted item
</UL>
<A HREF="http://www.linktionary.com/">Linktionary</A>
</BODY>
</HTML>

The section between <HEAD> and </HEAD> contains information about the document that does not display in the body of the page. The title appears in the Web browser's title bar and on search results pages. Other information, such as document descriptions and search keywords, is also placed between these tags.

The text that falls between <BODY> and </BODY> is the content of the page that the end user sees in a Web browser. Of course, the most important elements of any HTML file are its hyperlinks. The third-to-last line in the above code illustrates a hyperlink to the Linktionary Web site. The address of the link target follows A HREF=. The text between the opening <A> tag and the closing </A> tag (Linktionary) is displayed as an underlined hyperlink when the document is displayed in a Web browser.
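
To make the roles of the title and the hyperlink more concrete, here is a minimal Python sketch that reads the sample document above with the standard library's html.parser module and pulls out the title and the link. The class name TitleAndLinkExtractor is a hypothetical helper written for this illustration. A real Web browser does far more than this.

```python
from html.parser import HTMLParser


class TitleAndLinkExtractor(HTMLParser):
    """Records the document title and every (URL, link text) pair."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.links = []
        self._in_title = False
        self._current_href = None
        self._current_text = []

    def handle_starttag(self, tag, attrs):
        # Tag and attribute names arrive lowercased, so <A HREF=...> is seen as "a"/"href".
        if tag == "title":
            self._in_title = True
        elif tag == "a":
            self._current_href = dict(attrs).get("href")
            self._current_text = []

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False
        elif tag == "a" and self._current_href is not None:
            self.links.append((self._current_href, "".join(self._current_text)))
            self._current_href = None

    def handle_data(self, data):
        if self._in_title:
            self.title += data
        elif self._current_href is not None:
            self._current_text.append(data)


SAMPLE_DOCUMENT = """<HTML>
<HEAD>
<TITLE>Your Company Name Home Page</TITLE>
</HEAD>
<BODY>
<H1>Welcome to Company Name</H1>
<A HREF="http://www.linktionary.com/">Linktionary</A>
</BODY>
</HTML>"""

extractor = TitleAndLinkExtractor()
extractor.feed(SAMPLE_DOCUMENT)
extractor.close()
print(extractor.title)
print(extractor.links)
```

The title comes from the head of the document, while the link is a pair: the address given after HREF= and the visible text between the opening and closing A tags.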

HTML Developments and Extensions

The World Wide Web Consortium, or W3C, is the primary organization responsible for advancing HTML. Although Microsoft, Netscape, and others have improved the standard on their own, the W3C has attempted to corral these improvements and describe them as standards. Here is a brief history of HTML versions:

  • HTML 2.0    Defined by the IETF in 1995, based on core work done in 1994. RFC 1866 (Hypertext Markup Language - 2.0, November 1995) describes this version.
  • HTML 3.2    The W3C released this in 1996. It includes tables, applets, text flow, superscripts, and subscripts.
  • HTML 4.0    The W3C released this in 1997 with a revision in 1998.
  • HTML 4.01    The W3C released this in late 1999 to fix bugs in HTML 4.0.

The W3C is now involved in recasting HTML in XML and developing a new standard called XHTML, which is described later.

Internet RFC 2854 (The 'text/html' Media Type, June 2000) summarizes the history of HTML development. It declares earlier IETF documents defining HTML obsolete, including RFC 1866, RFC 1867, and RFC 1980, and it officially removes HTML from the IETF Standards Track, turning all responsibility for HTML development over to the W3C HTML working group.

Many companies and organizations are advancing the HTML standard. Microsoft and Netscape keep improving HTML to provide more features in their Web browsers. The World Wide Web Consortium (W3C) tracks these improvements and has developed many of its own.

The following is a list of important improvements that have been made to HTML in the last few years:

  • Style sheets add more presentational control for HTML. Both authors and readers can change a style sheet to alter the way information is displayed, without affecting device independence.
  • Dynamic HTML combines HTML, style sheets, and scripts, allowing developers to create dynamic Web documents with moving objects, animation, and so on.
  • A Document Object Model (DOM) has been developed that provides a way for scripts to manipulate HTML via a set of methods and data types defined independently of programming languages or platforms. DOM is now the basis of dynamic HTML. See "DOM (Document Object Model)."
  • Scripts can call up additional pages associated with a Web page, such as a table of contents. The Web server does not need to get involved in this page change.
  • Users can cut and paste on Web pages. An edited page can be saved on the original Web server with appropriate permissions.
  • Absolute positioning is a feature that locks an object to a specific position on a page (instead of just left or right).
  • Internationalization supports other writing systems and mixed-language documents.
  • Access for people with disabilities. For example, HTML can be rendered into Braille or speech.
  • HTML Math will display intricate mathematical expressions and technical notations.
  • Tables with row and column information.
  • Frames provide a separately scrollable window within a Web browser.
  • Support for portable devices through technologies such as HDML (Handheld Device Markup Language), WAP (Wireless Application Protocol), and WML (Wireless Markup Language). See the separate topics in this book for more information.

Microsoft and Netscape have both developed so-called "dynamic HTML" models that combine style sheets, scripts, and document animations. The W3C is working to ensure that these models are interoperable and scripting-language neutral. It has developed a Document Object Model platform that allows programs and scripts to dynamically access and update the content, structure, and style of documents, and further process that document with the results incorporated back into the presented page.
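
The idea of a script accessing and updating a document through DOM methods can be sketched as follows. In a browser the script would typically be written in a language such as JavaScript. This sketch instead uses Python's standard xml.dom.minidom module as a stand-in, because it offers DOM-style methods (getElementsByTagName, createElement, setAttribute, and so on). The sample markup is made up, and it is written as well-formed XML because minidom cannot read loosely written HTML.

```python
from xml.dom import minidom

# Hypothetical page, written as well-formed markup so that minidom can parse it.
SAMPLE = '<html><body><h1 id="greeting">Welcome</h1><p>First paragraph.</p></body></html>'

document = minidom.parseString(SAMPLE)

# Access: locate an element through the document tree.
heading = document.getElementsByTagName("h1")[0]

# Update content: change the text node inside the heading.
heading.firstChild.data = "Welcome back"

# Update style: attach a style attribute to the element.
heading.setAttribute("style", "color: red")

# Update structure: create a new paragraph and append it to the body.
body = document.getElementsByTagName("body")[0]
new_paragraph = document.createElement("p")
new_paragraph.appendChild(document.createTextNode("Added by a script."))
body.appendChild(new_paragraph)

print(document.documentElement.toxml())
```

The same sequence of steps (find a node, change it, add a node) is what a dynamic HTML script performs on the page held in the browser, with the results appearing in the displayed document.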

Note that many of these developments were originally implemented as extensions to HTML. Most developers are now moving to XML (Extensible Markup Language), which is better suited to encoding information such as text-to-speech data and the contents of tables. HTML has been great for presentation, but the Web is becoming more than just a presentation medium. It is becoming an electronic commerce medium, and the pages that people exchange must become more than just large blocks of unlabeled information. XML can explicitly describe a piece of information such as an address or an invoice total so other programs can extract that information automatically. See "XML (Extensible Markup Language)" for more information.
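
The following Python sketch suggests what "explicitly describing" an address or an invoice total might look like. The element names (invoice, customer, address, total) and all of the values are invented for this example and do not come from any published vocabulary. The standard xml.etree.ElementTree module is used to extract the data.

```python
import xml.etree.ElementTree as ET

# Hypothetical invoice: every piece of information is labeled by its tag.
INVOICE_XML = """
<invoice number="1001">
  <customer>
    <name>Example Customer</name>
    <address>
      <street>123 Main Street</street>
      <city>Anytown</city>
    </address>
  </customer>
  <total currency="USD">149.95</total>
</invoice>
""".strip()

invoice = ET.fromstring(INVOICE_XML)

# A program can ask for exactly the item it needs by name.
city = invoice.findtext("customer/address/city")
total = float(invoice.findtext("total"))
currency = invoice.find("total").get("currency")

print(city, total, currency)
```

In an HTML page the same total might appear only as boldfaced text, leaving a program to guess which number on the page is the amount due. Here the tag itself says what the number means.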

The W3C's Metadata Activity Group is developing ways to model and encode metadata. The group has developed RDF (Resource Description Framework) and PICS (Platform for Internet Content Selection). A broad goal of RDF is to define a way to describe resources without being tied to any platform or application. To do this, RDF relies on XML, a universal format for structuring and formatting documents. See "Metadata" for more information.

XHTML and XML

In 1999, the W3C recast HTML in XML, resulting in XHTML 1.0. XHTML is similar enough to HTML that developers will be able to make an easy transition to the world of XML. But while HTML is a specific markup language for displaying information in Web browsers (e.g., the tags are predefined for displaying information), XHTML follows a modular approach that supports the creation of documents that can be used in a variety of environments. For example, a document may have several associated templates, and each template describes how the document is displayed (or what it does) when opened by a particular device (PC with Web browser, PDA, or cell phone).
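
One practical consequence of recasting HTML in XML is that an XHTML document must be well-formed XML: elements are properly nested and every opened element is closed. The Python sketch below illustrates the difference by handing two versions of the same list to the standard xml.etree.ElementTree parser. The function name is_well_formed_xml is a hypothetical helper, and the check covers well-formedness only, not validity against the XHTML DTD.

```python
import xml.etree.ElementTree as ET

# Older HTML style: the LI elements are never closed.
OLD_STYLE_HTML = "<UL><LI>First item<LI>Second item</UL>"

# XHTML style: lowercase element names, every element explicitly closed.
XHTML_STYLE = "<ul><li>First item</li><li>Second item</li></ul>"


def is_well_formed_xml(markup):
    """Return True if an XML parser accepts the markup (hypothetical helper)."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


print(is_well_formed_xml(OLD_STYLE_HTML))
print(is_well_formed_xml(XHTML_STYLE))
```

A Web browser tolerates the first form, but a generic XML tool rejects it. The second form can be processed by browsers and XML software alike, which is what makes the transition path to XML possible.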

To understand XHTML, one must understand XML. First, keep in mind that HTML defines how to display information in essentially one environment: a Web browser. Now suppose you want to display the same HTML document on the small screen of a PDA. It's possible, but the display will not do justice to most Web pages. What is needed is another set of tags that describes how to display the same Web page on a PDA.

The beauty of XML is that it does away with the fixed set of elements found in HTML. New tags and document attributes can be defined by anyone at any time to fit a wide variety of devices or applications. These elements are defined in external documents called DTDs (Document Type Definitions). DTDs define the rules, syntax, and grammar for marking up documents. For example, the medical and legal professions have defined their own document descriptions. Anyone working with the medical or legal industry can use the predefined DTDs to display and work with medical or legal documents.

XHTML can be used with tags from other XML applications such as the following languages, which are being developed by the W3C:

  • SMIL (Synchronized Multimedia Integration Language) Enables simple authoring of TV-like streaming multimedia presentations such as training courses on the Web. Refer to http://www.w3.org/AudioVideo/.
  • MathML (Mathematical Markup Language) Encodes both the presentation of mathematical notation for high-quality visual display, and mathematical content. Refer to http://www.w3.org/Math/.
  • SVG (Scalable Vector Graphics) A language for describing two-dimensional graphics in XML. Refer to http://www.w3.org/Graphics/SVG/.

XHTML modules are separate documents that extend XHTML into a variety of existing and new platforms. The modules provide a framework for defining markup languages that fit particular applications and devices. Content providers will find it easier to produce content for a wide range of platforms, with better assurances as to how the content is rendered. The concept takes into account that many different devices will be connected to the Web, including cell phones, PDAs and other devices with varying display characteristics.

There is a set of core modules that are used to mark up headings, paragraphs, lists, hypertext links, images, and other document components. Current modules include a text module (defines the tags for HTML text and paragraphs), a list module (defines tags for creating bulleted lists, numbered lists, etc.), and a hypertext module (defines tags for creating hypertext). Other modules are the structure module, client-side image map module, server-side image map module, frames module, forms module, tables module, and scripting module.

The Web sites listed below provide more information about HTML and XHTML. Also see "XML (eXtensible Markup Language)" for more information.




Copyright (c) 2001 Tom Sheldon and Big Sur Multimedia.
All rights reserved under Pan American and International copyright conventions.