
Data storage demands are growing faster than the Internet itself. Factors
driving storage growth include large-scale Internet services and content
distribution, digital audio and video, 3D graphics and visualization, and
new sources of high-volume data in scientific and commercial domains.
The Slice project explores techniques for building unified massive-data
storage systems from inexpensive components connected by a high-speed
network. Here are some of the project's goals and premises:
- Storage at network speed.
With the emergence of Gigabit Ethernet as a commodity, the network will
become the fastest path to I/O for most systems attached to Ethernet LANs.
A key technical challenge is to manage disks and memories across
the network as a unified resource whose performance tracks rapid advances
in network technology rather than the slower improvements in disk
speeds.
- Internet/LANs as a scalable storage backplane.
Commercial systems increasingly provide shared storage by interconnecting
storage devices and servers with dedicated Storage Area Networks (SANs),
e.g., Fibre Channel. Yet recent order-of-magnitude improvements in LAN
performance have narrowed the bandwidth gap between SANs and LANs,
undermining the need for a separate SAN. LAN-based network storage systems
can approach SAN systems in terms of incremental scalability, reliability,
ease of administration, and performance, while offering high-speed storage
access to standard LAN-attached clients at lower cost.
- Decentralized file service structure.
Client/server LAN file services have dominated from the late 1980s through
today. Most file system functions are handled by central file servers that
export one or more storage volumes through an Internet-based network file
system protocol (e.g., NFS). The latest generation of network file servers
is often referred to, somewhat confusingly, as network-attached storage
(NAS) appliances, to contrast with the SAN approach. To deliver on the
potential of high-speed LANs, NAS systems must evolve away from the
client/server model toward the more decentralized service structure of SAN
systems, distributing storage functions across a fluid collection of
cooperating servers and storage nodes.
- Intelligent block placement and movement.
Overall storage system performance and reliability are determined primarily
by the policies for distributing data blocks across storage nodes, placing
blocks on disks, and timing data movement between slow disks and fast
memory. We are experimenting with policies and mechanisms for informed
prefetching, network memory caching, mirrored striping,
application-directed block placement, and storage management that adapts to
available resources (a simple placement sketch follows this list).
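
As one concrete illustration of a placement policy, the C sketch below shows
how a logical block number might be mapped to a primary and a mirror storage
node under mirrored striping. This is a minimal sketch for illustration only,
not Slice source code; the node count and the names place_block and
NSTORAGE_NODES are hypothetical.

/*
 * Illustrative sketch (not Slice code): mirrored striping maps each
 * logical block to a primary storage node round-robin, and writes the
 * mirror copy to the next node in the rotation, so any single node
 * failure leaves every block available on a neighbor.
 */
#include <stdio.h>

#define NSTORAGE_NODES 4            /* hypothetical node count */

struct placement {
    int primary_node;               /* node holding the primary copy */
    int mirror_node;                /* node holding the mirror copy */
};

/* Map a logical block number to its primary and mirror nodes. */
static struct placement
place_block(unsigned long logical_block)
{
    struct placement p;

    p.primary_node = (int)(logical_block % NSTORAGE_NODES);
    p.mirror_node  = (p.primary_node + 1) % NSTORAGE_NODES;
    return p;
}

int
main(void)
{
    unsigned long blk;

    for (blk = 0; blk < 8; blk++) {
        struct placement p = place_block(blk);
        printf("block %lu -> primary node %d, mirror node %d\n",
               blk, p.primary_node, p.mirror_node);
    }
    return 0;
}

A real policy would also account for node capacity, load, and failure state;
this sketch only captures the basic striping-with-mirroring layout.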
The Slice file service is implemented as a set of loadable kernel
modules in the FreeBSD operating system. A Slice system is configured as a
combination of server modules that handle specific file system functions:
directory management, raw block storage, efficient storage of small files,
and network caching. Servers may be added as needed to scale each component
of the request stream independently, handling a range of data-intensive and
metadata-intensive workloads. Slice is designed to be
compatible with standard NFS clients, using an interposed network-level
packet translator to mediate between each client and an array of servers
presenting a unified file system view.
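
To make the interposed translator concrete, the following C sketch shows one
plausible way such a component could route incoming NFS requests by operation
type and file size, so that directory servers, small-file servers, and block
storage nodes each scale independently. The server classes, the 64 KB
small-file threshold, and the function name route_request are assumptions for
illustration; they are not drawn from the Slice implementation.

/*
 * Illustrative sketch (not the Slice translator): route each NFS
 * request to a server class based on its operation type, and split
 * file I/O between small-file servers and bulk block storage by size.
 */
#include <stdio.h>

#define SMALL_FILE_THRESHOLD (64 * 1024)    /* hypothetical cutoff in bytes */

enum server_class {
    DIRECTORY_SERVER,     /* name space operations: lookup, create, rename */
    SMALL_FILE_SERVER,    /* reads and writes of small files */
    BLOCK_STORE_SERVER    /* bulk I/O striped across raw block storage nodes */
};

enum nfs_op { OP_LOOKUP, OP_CREATE, OP_RENAME, OP_READ, OP_WRITE };

/* Choose the destination server class for one request. */
static enum server_class
route_request(enum nfs_op op, unsigned long file_size)
{
    switch (op) {
    case OP_LOOKUP:
    case OP_CREATE:
    case OP_RENAME:
        return DIRECTORY_SERVER;
    case OP_READ:
    case OP_WRITE:
        return (file_size <= SMALL_FILE_THRESHOLD)
            ? SMALL_FILE_SERVER : BLOCK_STORE_SERVER;
    }
    return DIRECTORY_SERVER;            /* defensive default */
}

int
main(void)
{
    printf("lookup        -> class %d\n", route_request(OP_LOOKUP, 0));
    printf("read  (4 KB)  -> class %d\n", route_request(OP_READ, 4096));
    printf("write (1 MB)  -> class %d\n", route_request(OP_WRITE, 1UL << 20));
    return 0;
}

Because the translator operates at the network packet level, clients continue
to speak ordinary NFS to what appears to be a single server while their
requests fan out across the server array.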
Personnel
- Faculty
- Students
- Research Staff
Funding
Slice is supported by the
National Science Foundation (through awards CCR-96-24857,
EIA-9972879, and EIA-9870724), and by Intel Corporation
through the Education 2000 initiative.
Chase and Vahdat are supported by NSF CAREER Young Investigator awards.