Abstract :
The RAIN project is a research collaboration between Caltech and NASA-JPL on distributed computing
and data storage systems for future space-borne missions. The goal of the project
is to identify and develop key building blocks for reliable distributed systems
built with inexpensive off-the-shelf components.
The RAIN platform consists of a heterogeneous cluster of computing and/or storage nodes connected
via multiple interfaces to networks configured in fault-tolerant topologies. The
RAIN software components run in conjunction with operating system services and standard
network protocols. Through software-implemented fault tolerance, the system tolerates
multiple node, link, and switch failures, with no single point of failure.
The RAIN technology has been transferred to Rainfinity, a start-up company focusing on creating
clustered solutions for improving the performance and availability of Internet data centers.
To tie these components together, the researchers created RAIN software, which
has three components:
1. A component that stores data across distributed processors and retrieves it
even if some of the processors fail.
2. A communications component that creates a redundant network between multiple
processors and supports a single, uniform way of connecting to any of the processors.
3. A computing component that automatically recovers and restarts applications
if a processor fails.
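The first component, storing data so that it survives processor failures, can be illustrated with a minimal sketch. The text above does not say which coding scheme RAIN uses, so a single XOR parity block is swapped in here as the simplest possible stand-in; the function names `encode` and `decode` and the idea of one block per node are assumptions made for illustration only.

```python
from __future__ import annotations

from functools import reduce


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def encode(data: bytes, k: int) -> list[bytes]:
    """Split data into k equal-size blocks and append one XOR parity block.

    Returns k + 1 blocks, one per (hypothetical) storage node.
    """
    block_len = -(-len(data) // k)  # ceiling division
    padded = data.ljust(block_len * k, b"\0")
    blocks = [padded[i * block_len:(i + 1) * block_len] for i in range(k)]
    parity = reduce(xor_bytes, blocks)
    return blocks + [parity]


def decode(blocks: list[bytes | None], length: int) -> bytes:
    """Rebuild the original data when at most one block is missing (None)."""
    missing = [i for i, b in enumerate(blocks) if b is None]
    if len(missing) > 1:
        raise ValueError("single parity can recover only one lost block")
    if missing:
        # XOR of all surviving blocks equals the lost block.
        survivors = [b for b in blocks if b is not None]
        blocks = list(blocks)
        blocks[missing[0]] = reduce(xor_bytes, survivors)
    # Drop the parity block and strip the padding.
    return b"".join(blocks[:-1])[:length]


# Usage: lose one node's block and still read the data back.
message = b"telemetry frame 0042"
stored = encode(message, k=4)
stored[2] = None  # simulate a failed node
assert decode(stored, len(message)) == message
```

This sketch tolerates only one lost block. Tolerating multiple node failures, as the abstract describes, would need a stronger erasure code than single parity.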
Myrinet switches provide the high-speed cluster network used for message passing between
compute nodes and for I/O. The switches have a few counters that can be read over an
Ethernet connection to the switch. These counters can be used to monitor the health of
the connections, cables, and so on. The following information refers to the 16-port
switches, the Clos-64 switches, and the Myrinet-2000 switches.
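One way to use such counters for health monitoring is to compare two snapshots taken some time apart and flag ports whose error counters grew. The sketch below assumes the snapshots have already been fetched and parsed into dictionaries. The counter names `bad_crc` and `dropped_packets` and the function `unhealthy_ports` are hypothetical and do not come from any Myrinet documentation.

```python
from __future__ import annotations

# Hypothetical counter names; a real switch may expose different ones.
ERROR_COUNTERS = ("bad_crc", "dropped_packets")


def unhealthy_ports(previous: dict, current: dict, threshold: int = 0) -> dict:
    """Return {port: {counter: increase}} for error counters that grew
    by more than `threshold` between two snapshots.

    Each snapshot maps a port number to a {counter_name: value} dict.
    """
    report = {}
    for port, counters in current.items():
        before = previous.get(port, {})
        deltas = {}
        for name in ERROR_COUNTERS:
            delta = counters.get(name, 0) - before.get(name, 0)
            if delta > threshold:
                deltas[name] = delta
        if deltas:
            report[port] = deltas
    return report


# Usage: port 0 saw three new CRC errors, port 1 is clean.
earlier = {0: {"bad_crc": 1, "dropped_packets": 0}, 1: {"bad_crc": 0, "dropped_packets": 0}}
later = {0: {"bad_crc": 4, "dropped_packets": 0}, 1: {"bad_crc": 0, "dropped_packets": 0}}
print(unhealthy_ports(earlier, later))
```

A counter that resets between snapshots yields a negative difference and is ignored by this sketch; a production monitor would need to handle resets explicitly.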
ServerNet is a switched-fabric communications link used primarily in proprietary computers
made by Tandem Computers, Compaq, and HP. Its features include good scalability,
clean fault containment, error detection, and failover.
The ServerNet architecture specification defines a connection between nodes, either processor
nodes or high-performance I/O nodes such as storage devices. Tandem Computers developed
the original ServerNet architecture and protocols for use in its own proprietary
computer systems starting in 1992, and released the first ServerNet systems in 1995.
Early attempts to license the technology and interface chips to other companies failed, due in
part to a disconnect between the culture of selling complete hardware/software/middleware
computer systems and the culture needed for selling and supporting chips
and licensing technology.
A follow-on development effort ported the Virtual Interface Architecture to ServerNet with
PCI interface boards connecting personal computers. InfiniBand directly inherited
many ServerNet features. After 25 years, systems based on the ServerNet architecture
still ship.
ORIGIN
1. RAIN technology was developed by the California Institute
of Technology, in collaboration with NASA’s
Jet Propulsion Laboratory and DARPA.
2. The name of the original research project was
RAIN, which stands for Reliable Array of Independent
Nodes.
3. In 1998, the RAIN research team formed a company
called Rainfinity.