Fault Tolerance and High Availability
Note: Many topics at this site are reduced versions of the text in "The Encyclopedia of Networking and Telecommunications." Search results will not be as extensive as a search of the book's CD-ROM.
Fault tolerance and high availability are about keeping systems up and running 24 hours a day, 7 days a week, or at least keeping them running with a reasonable level of performance. Downed systems can cost an organization thousands of dollars per hour, as outlined in the following table:
*Lost revenue assumes a U.S. $1-million-per-day site where 20 percent of transactions are lost during downtime.
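The footnote's assumption can be turned into a rough hourly figure. The sketch below is only an illustration: the function name is hypothetical, and it assumes revenue is spread evenly across all 24 hours of the day, which the original text does not state.

```python
def lost_revenue_per_hour(daily_revenue: float, lost_fraction: float) -> float:
    """Estimate revenue lost per hour of downtime.

    Assumption (not from the source): revenue arrives evenly over 24 hours.
    """
    hourly_revenue = daily_revenue / 24
    return hourly_revenue * lost_fraction


# A U.S. $1-million-per-day site losing 20 percent of transactions while down.
loss = lost_revenue_per_hour(1_000_000, 0.20)
print(f"${loss:,.2f} lost per hour of downtime")  # $8,333.33 lost per hour of downtime
```

Under these assumptions, each hour of downtime costs a little over $8,000 in lost revenue alone, which is consistent with the "thousands of dollars per hour" claim. A site with pronounced peak hours would lose more during an outage at those times.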
A fault-tolerant system is designed to keep running even after a fault has occurred. Fault-tolerant features in early network operating systems included mirrored disks, with both disks reading and writing the same information. If one disk failed, the other kept running in what is called "failover" mode. This fault tolerance was expanded to disk duplexing, in which the disks and disk controllers were duplicated. These redundant components not only provided fault tolerance, but also improved performance since disk reads could come from either disk (writes still had to be performed by both disks). Of course, fault-tolerant systems must provide more than just disk failover. Some other examples of redundant systems include the following:
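Returning to the disk mirroring described earlier in this section, the behavior can be sketched in a few lines. This is a simplified in-memory model, not how any particular network operating system implemented it: the `Disk` and `MirroredPair` classes are hypothetical, reads alternate between healthy disks as one possible way to share the load, and resynchronizing a replaced disk is not modeled.

```python
class Disk:
    """A toy disk that stores blocks in a dictionary (illustrative only)."""

    def __init__(self, name):
        self.name = name
        self.blocks = {}
        self.failed = False

    def write(self, block_no, data):
        if self.failed:
            raise IOError(f"{self.name} has failed")
        self.blocks[block_no] = data

    def read(self, block_no):
        if self.failed:
            raise IOError(f"{self.name} has failed")
        return self.blocks[block_no]


class MirroredPair:
    """Two disks holding the same data; survives the loss of either one."""

    def __init__(self, first, second):
        self.disks = [first, second]
        self._next = 0  # used to alternate reads between healthy disks

    def _healthy(self):
        healthy = [d for d in self.disks if not d.failed]
        if not healthy:
            raise IOError("both disks have failed")
        return healthy

    def write(self, block_no, data):
        # Every write must go to all healthy disks to keep them identical.
        for disk in self._healthy():
            disk.write(block_no, data)

    def read(self, block_no):
        # A read can be served by either healthy disk, which spreads the load.
        healthy = self._healthy()
        disk = healthy[self._next % len(healthy)]
        self._next += 1
        return disk.read(block_no)


pair = MirroredPair(Disk("disk0"), Disk("disk1"))
pair.write(0, b"payroll")

pair.disks[0].failed = True   # simulate a disk failure
print(pair.read(0))           # b'payroll', served by the surviving disk
```

After the simulated failure, the pair keeps serving reads and accepting writes from the surviving disk, which is the failover behavior the text describes. Disk duplexing would add a separate controller per disk, removing the controller as a single point of failure.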
This topic continues in "The Encyclopedia of Networking and Telecommunications" with a discussion of the following:
Copyright (c) 2001 Tom Sheldon and Big Sur Multimedia.