File Systems


Note: Many topics at this site are reduced versions of the text in "The Encyclopedia of Networking and Telecommunications." Search results will not be as extensive as a search of the book's CD-ROM.

A file system provides persistent storage of information. It is the part of an operating system that interfaces with storage systems and provides a way to organize how information is stored. Users access files through command-line or graphical user interfaces.

Local file systems allow users to access storage on their own computers. However, most operating systems include peer-to-peer file-sharing functions that let users access files on other network computers or share (publish) files on their own computers. See "File Sharing" for a description and list of Web services that offer Web-based collaborative information exchange, file storage, and distributed file-sharing services.

File systems are organized into tree-structured directories. The metaphor is usually file cabinets (drives) and folders (directories). Folders are like containers that can hold other folders or files. Directories have rights or permissions such as read-only, read-write, and so on. These are assigned by the owner or network administrator. The rights and permissions of top-level folders are passed down to sub-folders. This is called inheritance. Files also have their own set of attributes, depending on the operating system.
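As a rough illustration of inheritance, the sketch below models folders as a tree in which a folder without an explicit permission takes the permission of its nearest ancestor that has one. The `Folder` class, its method names, and the "none" fallback are hypothetical and do not correspond to any particular operating system's API.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Folder:
    """A directory node; `permission` is None when nothing is set explicitly."""
    name: str
    parent: Optional["Folder"] = field(default=None, repr=False)
    permission: Optional[str] = None
    children: Dict[str, "Folder"] = field(default_factory=dict, repr=False)

    def add(self, name: str, permission: Optional[str] = None) -> "Folder":
        """Create a sub-folder, optionally with its own explicit permission."""
        child = Folder(name, parent=self, permission=permission)
        self.children[name] = child
        return child

    def effective_permission(self) -> str:
        """Walk up toward the root until an explicit permission is found."""
        node: Optional[Folder] = self
        while node is not None:
            if node.permission is not None:
                return node.permission
            node = node.parent
        return "none"  # assumed fallback when no ancestor sets a permission


root = Folder("C:", permission="read-only")
docs = root.add("docs")                     # nothing set: inherits from C:
drafts = docs.add("drafts", "read-write")   # explicit setting overrides
old = drafts.add("old")                     # inherits from drafts

print(docs.effective_permission())
print(old.effective_permission())
```

Here `docs` resolves to the top-level setting, while `old` picks up the override made on `drafts`.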

Common file systems are briefly described here.

  • FAT (file allocation table)    The IBM/Microsoft DOS-based file system that is also used by Windows 9x versions. Windows NT also supports FAT, as well as NTFS (New Technology File System). FAT divides hard disks into one or more partitions that become drive letters, such as C:, D:, and so on. Disks are formatted into sectors, and sectors are grouped into clusters of 4 to 32 sectors at the user's discretion. A FAT entry describes the location of a file, or of part of a file, on the disk.

  • FAT32    Windows 95 (release 2) and later versions of Windows 9x provide this update to FAT that allows for a default cluster size as small as 4K, as well as support for hard disk sizes in excess of 2GB.

  • HPFS (High-Performance File System)    This file system was first introduced with OS/2 when Microsoft was working on the project with IBM. It supports large hard drives, supports extended names, and has more file security features. HPFS organizes directories like FAT, but adds features that improve performance. A design goal was to have HPFS allocate as much of a file in contiguous sectors as possible to increase speed.

  • NTFS (New Technology File System)    NTFS is the file system for Windows NT. It builds on the features of FAT and HPFS, and adds new features or changes. NTFS provides advanced security features and better performance, especially for server operations. NTFS is a recoverable file system, meaning that it keeps track of transactions against the file system. See "Microsoft Windows File Systems" for more information.

  • NTFS 5    NTFS 5 is the file system for Windows 2000. It has most of the features of NTFS, including support for FAT and FAT32 file systems. It also provides complete content indexing and built-in hierarchical storage management. A major new feature is dynamic volume management, which allows for live configuration changes without rebooting. There are also advanced backup, restore, and disaster recovery tools. NTFS 5 also supports I2O, IEEE 1394, and Fibre Channel. NTFS 5 may also participate in Windows 2000 DFS (Distributed File System), which provides load balancing, fault tolerance, and replication services. See "Microsoft Windows File Systems" for more information.

  • NetWare UFS (Universal File System)    UFS is the file system for NetWare 3.x, a server-based operating system. All of its features are enhanced to provide high performance to multiple users. It includes such features as elevator seeking, background writes, overlapped seeks, Turbo FAT, file compression, and block suballocation. See "Novell NetWare File System" for details.

  • NWFS (NetWare File System)    This Novell file system appeared in NetWare 4.1 and NetWare 4.11. It provides backward compatibility with previous file systems and supports loadable modules that support other file systems such as Windows 9x, OS/2, Windows NT, UNIX NFS (Network File System), and Apple Macintosh file systems. The maximum number of files supported is 16 million, with a maximum file size of 4 gigabytes. It also supports over 100 levels of directories. Other features include file compression, block suballocation, file salvage features, hot fix (data is redirected out of corrupted sectors), the ability to span volumes across 32 disks, a transaction tracking system to recover from failed transactions, mirroring, duplexing, and data migration (hierarchical storage management). File system security and quota information is stored in NDS (Novell Directory Services).

  • NSS (Novell Storage Services)    This is the follow-up to NWFS that appeared with NetWare 5.0. It stores billions of files, and the maximum file size is 8 terabytes. Name space support for other operating systems is built in. Some features in NWFS were removed to improve performance, or because they were deemed unnecessary or considered add-ons. These include compression, block suballocation, transaction tracking, disk mirroring, and data migration.

  • UNIX file system    The UNIX file system is based on a hierarchical directory tree structure, like the file systems previously mentioned. The original file system was not specifically designed for remote file sharing, but these features were added later with NFS (Network File System), RFS (Remote File System), and AFS (Andrew File System). These network file systems are covered under their own headers. The UNIX file system maintains a set of attributes for each of its files. The attributes are stored in a structure called an inode (index node), which is stored on disk. The attributes include information about the type of file, its size, the device where it is located, and an inode number that uniquely identifies the file on disk. Other information included is the ID of the owner, timestamps, and permissions (read, write, and execute). See "UNIX File System."

  • WAFL (Write Anywhere File Layout)    Network Appliance Corp. designs NAS (network-attached storage) devices. It created WAFL to provide a way to store files in a multiprotocol format. Files in this format can be shared via NFS, CIFS, or HTTP. WAFL essentially frees files from the restrictions imposed by specific operating systems and the file systems they use. With WAFL, users can access files no matter which operating system they use or how the files are stored. WAFL implements Snapshots, which are read-only clones of the active file system. WAFL uses a copy-on-write technique to minimize the disk space that Snapshots consume. WAFL also uses Snapshots to eliminate the need for file system consistency checking after an unclean shutdown. WAFL provides very high performance, and supports RAID and quick restart, even after an unclean shutdown. See "NAS (Network Attached Storage)" and "Network Appliances."
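The FAT entry above says that table entries describe where the parts of a file are located. A minimal sketch of that idea follows, assuming the table is a simple mapping from each cluster number to the next cluster of the same file. The `END_OF_CHAIN` value and the function name are illustrative only; real FAT variants use reserved entry values and an on-disk layout that this sketch ignores.

```python
from typing import Dict, List

END_OF_CHAIN = -1  # hypothetical marker; real FAT variants use reserved values


def cluster_chain(fat: Dict[int, int], start: int) -> List[int]:
    """Return the clusters that make up a file, in order, from its first cluster."""
    chain: List[int] = []
    seen = set()
    cluster = start
    while cluster != END_OF_CHAIN:
        if cluster in seen:
            raise ValueError("allocation table contains a loop")
        seen.add(cluster)
        chain.append(cluster)
        cluster = fat[cluster]  # each entry points to the file's next cluster
    return chain


# Two files: one fragmented across clusters 2, 3, 7 and one in clusters 4, 5.
fat = {2: 3, 3: 7, 7: END_OF_CHAIN, 4: 5, 5: END_OF_CHAIN}
print(cluster_chain(fat, 2))  # [2, 3, 7]
```

The first file is not contiguous on disk, which is the kind of fragmentation HPFS tries to avoid by allocating contiguous sectors.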

The Network Appliance Web site has an extensive set of documents about all types of file systems. The Web site is given on the related entries page.
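The copy-on-write technique mentioned in the WAFL entry can be sketched in general terms as follows. This is not Network Appliance's implementation; the `CowVolume` class and its methods are hypothetical and only show why a snapshot that copies block pointers, rather than data, consumes little space.

```python
from typing import Dict, Optional


class CowVolume:
    """Toy volume in which data blocks are never overwritten in place."""

    def __init__(self) -> None:
        self.blocks: Dict[int, bytes] = {}               # block id -> contents
        self.active: Dict[str, int] = {}                 # file name -> block id
        self.snapshots: Dict[str, Dict[str, int]] = {}   # label -> frozen pointers
        self._next_id = 0

    def write(self, name: str, data: bytes) -> None:
        """Store data in a new block and repoint the active file system at it."""
        block_id = self._next_id
        self._next_id += 1
        self.blocks[block_id] = data
        self.active[name] = block_id  # any older block stays put for snapshots

    def snapshot(self, label: str) -> None:
        """Record the current pointers; no file data is copied."""
        self.snapshots[label] = dict(self.active)

    def read(self, name: str, snapshot: Optional[str] = None) -> bytes:
        table = self.active if snapshot is None else self.snapshots[snapshot]
        return self.blocks[table[name]]


vol = CowVolume()
vol.write("report.txt", b"first version")
vol.snapshot("monday")
vol.write("report.txt", b"second version")
print(vol.read("report.txt"))
print(vol.read("report.txt", snapshot="monday"))
```

Only files changed after the snapshot need additional blocks; unchanged files share the same block between the snapshot and the active file system.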

As already mentioned, file sharing is an important aspect of network-connected systems. File-sharing systems take advantage of the underlying file systems just mentioned. For example, NFS is a file-sharing system that runs with existing UNIX file systems. Likewise, Microsoft's SMB (Server Message Blocks) takes advantage of FAT and NTFS, as does the newer CIFS (Common Internet File System). There are peer-to-peer and dedicated file-sharing systems:

  • Peer-to-peer sharing    Users share files on their own workstations with other peer network users. File and directory management is usually left to the users themselves, although administrators may control it instead.

  • Dedicated server sharing    In this scheme, a dedicated and secure server running a network operating system such as Novell NetWare or Windows NT/Windows 2000 provides file services that are controlled by a network administrator. This scheme provides a high level of granular control over file access.

Distributed File Systems

Distributed file systems store files on multiple servers, replicate files among those servers, and present users with a single view of all the servers. Files are accessible to users by filename without regard to the physical location of the file. As an analogy, think of a city library system in which the book catalog at each library lists all the books available at libraries throughout the city. You can order any book and it will be delivered from its current location. There is one library catalog system that provides a list of all the books available, no matter what their physical location. A distributed file system provides a single "catalog" view of files on your network, no matter where those files are located. Distributed file systems automatically replicate files to mirror servers so users can access files on servers that are close to them.

Some of the most common distributed file systems are described here.

  • AFS (Andrew File System)    AFS was developed by the Information Technology Center at Carnegie Mellon University, but is currently managed by Transarc Corporation. AFS has some enhancements that NFS does not. See "AFS (Andrew File System)."

  • DFS (Distributed File System), DCE    DFS is a version of AFS. It serves as the file system component in the Open Software Foundation's DCE (Distributed Computing Environment). See "DCE (Distributed Computing Environment)" and "DFS (Distributed File System)."

  • Microsoft DFS (Distributed File System)    Windows NT/Windows 2000 includes Microsoft's new hierarchical distributed file system. DFS is a true distributed file system that lets administrators create custom hierarchical trees that group file resources from anywhere in the organization. See "DFS (Distributed File System)."

  • NCP (NetWare Core Protocol)    NCP is NetWare's proprietary set of service protocols that the operating system follows to accept and respond to service requests from clients and other servers. It includes services for file access, file locking, security, resource tracking, and other network-related features. See "NCP (NetWare Core Protocol)."

  • NFS (Network File System)    NFS was originally created by Sun Microsystems, Inc., as a file-sharing system for TCP/IP networks. NFS is running on millions of systems, ranging from mainframes to personal computers. See "NFS (Network File System)."

  • SMB (Server Message Blocks)    SMB is Microsoft's traditional shared-file system that runs on Windows 3.x, Windows 95, and Windows NT platforms. An independently developed version of SMB called Samba is also available for non-Windows systems. See "SMB (Server Message Blocks)" and "Samba."

  • DAFS (Direct Access File System)    DAFS is a shared-file access protocol designed to work in SAN (Storage Area Network) environments, in which VI architecture is the underlying transport mechanism. DAFS is primarily designed for clustered, shared-file network environments, in which a limited number of server-class clients connect to a set of file servers via a dedicated high-speed network. With DAFS and VI, data consumers have direct access to disks across the network and can transfer data from remote disks directly into their own memory. There is no need to copy data to or from intermediate buffers, or to interrupt an operating system during file transfers.

Several additional distributed file-sharing protocols have been developed for the Web. Traditionally, when a Web client connects with a Web server, a Web page is downloaded to the user's computer. This may require a series of connections and reconnections until the document is completely downloaded. The first connection downloads text, and subsequent connections download graphics and other page elements. New Web-based distributed file systems are designed to download all the related files with a single connection, thus improving performance. The two competing Web file systems are briefly described here:

  • SunSoft's WebNFS    Implements all the features of NFS and is optimized to run over the Internet or intranets. Also provides a way to implement file security mechanisms over the Web. See "NFS (Network File System)."

  • Microsoft's CIFS (Common Internet File System)    CIFS is an extension of Microsoft's SMB (Server Message Blocks) file protocol. Like WebNFS, it is optimized to run over the Internet or intranets and implements file-level security mechanisms. See "CIFS (Common Internet File System)."

Two interesting schemes for finding and locating files on the Internet are discussed under "Handle System" and "URN (Uniform Resource Name)." Also see "Search and Discovery Services."

Distributed File System Features

A distributed file system should provide clients with access to files no matter where they are located. Traditionally, file servers have been located throughout an organization (in departments and workgroups), and users have had to locate servers that held files of interest by searching, by referrals in applications or files, or by word of mouth.

Directory services change that. Administrators can use directory services to group files and file storage systems in a hierarchical tree under branches that make sense to people. For example, an administrator could create a branch of the tree called "White Papers," and then create links to all the directories on all the servers in the organization that contain white papers. Users only need to open the White Papers section in the directory tree to access files, rather than accessing a particular server. Novell's NDS (Novell Directory Services) and Microsoft's DFS provide these features.
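The "White Papers" example can be pictured as a small lookup table that maps a logical branch name to directories on several servers. The sketch below is only an illustration: the server names, paths, and file names are invented, and the `servers` dictionary stands in for real network access. It is not the NDS or DFS interface.

```python
from typing import Dict, List, Tuple

Link = Tuple[str, str]  # (server name, directory path)

# Logical branches created by an administrator (all names hypothetical).
namespace: Dict[str, List[Link]] = {
    "White Papers": [
        ("srv-eng", "/docs/whitepapers"),
        ("srv-mktg", "/pub/papers"),
    ],
}

# Stand-in for the contents of each linked directory on each server.
servers: Dict[Link, List[str]] = {
    ("srv-eng", "/docs/whitepapers"): ["raid.pdf", "nfs.pdf"],
    ("srv-mktg", "/pub/papers"): ["roi.pdf"],
}


def list_branch(branch: str) -> List[Tuple[str, Link]]:
    """Return every file under a logical branch with the link that holds it."""
    found = []
    for link in namespace[branch]:
        for filename in servers[link]:
            found.append((filename, link))
    return sorted(found)


for filename, (server, path) in list_branch("White Papers"):
    print(f"{filename}  ->  {server}:{path}")
```

The user asks for the branch by name and never has to know which server holds a given paper.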

A distributed file system should also provide replication. If users throughout the organization require access to files on a particular server, it makes sense to replicate those files to a server that is closer to the user, especially if they are at remote offices. Replication can also minimize traffic on servers by distributing the load to other servers. A distributed file system should be able to retrieve a file from a server in a replicated set that is most available to handle the request or closest to the user making the request.
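One simple way to express "most available or closest" is to filter out unavailable replicas and then pick the one with the lowest measured latency. The sketch below assumes that policy; the `Replica` fields and server names are hypothetical, and a real system might weigh load, site topology, or cost instead.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Replica:
    server: str
    latency_ms: float   # assumed measure of closeness to the requesting user
    available: bool     # assumed health flag reported by the server


def choose_replica(replicas: Iterable[Replica]) -> Replica:
    """Pick the reachable replica with the lowest latency."""
    candidates = [r for r in replicas if r.available]
    if not candidates:
        raise LookupError("no replica is currently available")
    return min(candidates, key=lambda r: r.latency_ms)


replica_set = [
    Replica("hq-server", latency_ms=85.0, available=True),
    Replica("branch-server", latency_ms=4.0, available=False),  # down
    Replica("regional-server", latency_ms=20.0, available=True),
]
print(choose_replica(replica_set).server)  # regional-server
```

The nearest server is skipped because it is down, so the request falls through to the next closest replica.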

A distributed file system should also implement single sign-on so users do not need to enter a password every time they access a file on a connected or replicated system. One additional feature is encryption. If a user is going to access a sensitive file from a secure server, the transmission of the file should be encrypted, especially if the file is being transmitted over the Internet.

A common problem with any shared file system is that multiple users will need to access the same file at the same time. Concurrency controls are required to arbitrate multiuser access to files. These controls take the following forms:

  • Read-only sharing    Any client can access a file, but not change it. This is simple to implement. Web servers do this.

  • Controlled writes    In this method, multiple users can open a file, but only one user can write changes. The changes written by that user may not appear on the screens of other users who had the file open.

  • Concurrent writes    This method allows multiple users to both read and write a file simultaneously. The operating system must continuously monitor file access to prevent overwrites and ensure that users receive the latest updates.
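The controlled-writes case in the list above can be sketched as a file object that hands a copy of its content to anyone who opens it but grants write access to only one user at a time. The `SharedFile` class and user names are hypothetical; the example also shows why a reader's copy can go stale.

```python
from typing import Optional


class SharedFile:
    """Hypothetical server-side file: many users may open it, one may write."""

    def __init__(self, content: str = "") -> None:
        self.content = content
        self.writer: Optional[str] = None  # user holding write access, if any

    def open(self, user: str, write: bool = False) -> str:
        """Return the content as of now; claim write access when requested."""
        if write:
            if self.writer not in (None, user):
                raise PermissionError(f"{self.writer} already has write access")
            self.writer = user
        return self.content

    def save(self, user: str, content: str) -> None:
        if self.writer != user:
            raise PermissionError(f"{user} does not hold write access")
        self.content = content

    def close(self, user: str) -> None:
        if self.writer == user:
            self.writer = None  # write access becomes free for someone else


report = SharedFile("draft 1")
report.open("alice", write=True)
bob_view = report.open("bob")      # read-only copy taken before the change
report.save("alice", "draft 2")
print(bob_view)                    # still the old text
print(report.open("bob"))          # reopening picks up the change
```

Concurrent writes need more machinery than this, because the system must also push updates to every open copy.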

Shared file systems differ in the way they handle concurrent writes. When a client requests a file (or database records) from a server, the file is placed in a cache at the client's workstation. If another client requests the same file, it is also placed in a cache at that client's workstation. As both clients make changes to the file, technically, three versions of the file exist (one at each client and one at the server). There are two methods for maintaining synchronization among the versions:

  • Stateless systems    In stateless systems, the server does not keep information about what files its clients are caching. Therefore, clients must periodically check with the server to see if other clients have changed the file they are caching. NFS is a stateless system.

  • Call-back systems    In this method, the server retains information about what its clients are doing and the files they are caching. The server uses a call-back promise technique to inform clients when another client has changed a file. This method produces less network traffic than the stateless approach. AFS is a call-back system. As clients change files, other clients holding copies of the files are called back and notified of changes.
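The stateless approach can be sketched as a client that asks the server for a file's current version before trusting its cache. This is a simplified model, not the NFS protocol: the version counter, class names, and method names are assumptions made for illustration, and the check here happens on every read rather than periodically.

```python
from typing import Dict, Tuple


class StatelessServer:
    """Keeps files and versions, but no record of which clients cache what."""

    def __init__(self) -> None:
        self.files: Dict[str, Tuple[int, str]] = {}  # name -> (version, data)
        self.version_checks = 0

    def write(self, name: str, data: str) -> None:
        version = self.files.get(name, (0, ""))[0] + 1
        self.files[name] = (version, data)

    def version(self, name: str) -> int:
        self.version_checks += 1  # every check costs a network round trip
        return self.files[name][0]

    def read(self, name: str) -> Tuple[int, str]:
        return self.files[name]


class PollingClient:
    def __init__(self, server: StatelessServer) -> None:
        self.server = server
        self.cache: Dict[str, Tuple[int, str]] = {}

    def read(self, name: str) -> str:
        """Use the cached copy only if the server's version still matches."""
        cached = self.cache.get(name)
        if cached is None or cached[0] != self.server.version(name):
            cached = self.server.read(name)
            self.cache[name] = cached
        return cached[1]


server = StatelessServer()
server.write("notes.txt", "v1")
client = PollingClient(server)
print(client.read("notes.txt"))
server.write("notes.txt", "v2")    # changed by some other client
print(client.read("notes.txt"))    # version mismatch forces a re-fetch
```

The `version_checks` counter makes the cost visible: the client must contact the server even when nothing has changed, which is the extra traffic the call-back approach avoids.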

There are performance advantages to stateless operations, but AFS retains some of these advantages by making sure that it does not become flooded with call-back promises. It does this by discarding callbacks after a certain amount of time. Clients check the expiration time in call-back promises to ensure that they are current. Another interesting feature of the call-back promise is that it provides a guarantee to a client that a file is current. In other words, if a cached file has a call-back promise, the client knows the file must be current unless the server has called to indicate the file changed at the server.
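A call-back promise with an expiration time might be modeled as shown below. This is a sketch under stated assumptions, not the AFS implementation: the class names, the 60-second lifetime, and the use of an explicit `now` argument in place of a real clock are all choices made for illustration.

```python
from typing import Dict, Tuple


class CallbackServer:
    """Remembers which clients cache each file and until when."""

    def __init__(self, lifetime: float = 60.0) -> None:
        self.lifetime = lifetime  # assumed promise lifetime in seconds
        self.data: Dict[str, str] = {}
        self.promises: Dict[str, Dict["CallbackClient", float]] = {}
        self.fetches = 0

    def fetch(self, client: "CallbackClient", name: str, now: float) -> Tuple[str, float]:
        """Hand out the file together with a promise that expires later."""
        self.fetches += 1
        expires = now + self.lifetime
        self.promises.setdefault(name, {})[client] = expires
        return self.data[name], expires

    def store(self, writer: "CallbackClient", name: str, data: str, now: float) -> None:
        """Save a change, then call back every other client with a live promise."""
        self.data[name] = data
        for client, expires in self.promises.pop(name, {}).items():
            if client is not writer and expires > now:
                client.break_callback(name)


class CallbackClient:
    def __init__(self, server: CallbackServer) -> None:
        self.server = server
        self.cache: Dict[str, Tuple[str, float]] = {}  # name -> (data, expires)

    def break_callback(self, name: str) -> None:
        self.cache.pop(name, None)  # the cached copy is no longer guaranteed

    def read(self, name: str, now: float) -> str:
        entry = self.cache.get(name)
        if entry is None or entry[1] <= now:   # no promise, or it has expired
            entry = self.server.fetch(self, name, now)
            self.cache[name] = entry
        return entry[0]

    def write(self, name: str, data: str, now: float) -> None:
        self.server.store(self, name, data, now)
        self.cache.pop(name, None)


server = CallbackServer()
server.data["plan.txt"] = "v1"
a, b = CallbackClient(server), CallbackClient(server)
print(a.read("plan.txt", now=0))     # fetched, promise valid until t=60
b.write("plan.txt", "v2", now=10)    # server calls a back
print(a.read("plan.txt", now=11))    # cache was invalidated, so re-fetched
```

While a promise is live and unbroken, the client reads from its cache without contacting the server. Once the promise expires, the server no longer tracks that client, and the client fetches again.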




Copyright (c) 2001 Tom Sheldon and Big Sur Multimedia.
All rights reserved under Pan American and International copyright conventions.