Real-Time Operating Systems.
Together with a team of researchers, we worked for over three years on developing a high-performance computation platform for data-intensive applications. The platform consists of a middleware layer connected to a cluster of reconfigurable active solid-state drive (RASSD) nodes, each consisting of a tightly coupled FPGA/SSD pair. The FPGA implements a storage-compute node that uses partial dynamic reconfiguration to implement hardware accelerators that process data streaming from the SSD. We showed that a RASSD node is 1.3 to 15.2 times faster than a commodity multi-core computer and consumes 9.4 to 201.9 times less energy.
We were mainly responsible for the design and implementation of the RASSD storage-compute node and its firmware (RASSD OS).
A RASSD node consists of an FPGA that is tightly coupled to a solid-state drive (SSD). A block diagram of a RASSD node is shown below. We implemented the node in a Xilinx XC6VLX240T Virtex-6 FPGA on an ML605 development board, and used a 60 GB OCZ Vertex Plus R2 SSD connected to the ML605 board through a SATA II port on a Xilinx FMC XM104 connectivity card.
The FPGA includes a MicroBlaze soft processor core, a number of peripheral controllers, and a partially reconfigurable region (PRR). The PRR is a user-defined region of the FPGA logic fabric used to implement hardware accelerators. The MicroBlaze processor executes a lightweight firmware (RASSD OS) that performs supervisory functions such as communicating with middleware servers, initiating data transfers between the Ethernet MAC or SSD and DDR3 memory, and loading hardware configuration bitstreams into the PRR. The MicroBlaze processor also executes accelerator driver code, called drivelets, which enables it to communicate with a hardware accelerator in the PRR through a pair of fast simplex links (FSLs). The FSLs are mainly used for initializing a hardware accelerator or reading status information from it.
The MicroBlaze processor operates at 150 MHz and connects to the external DDR3 memory through its instruction and data cache links (IXCL and DXCL). A block RAM is attached to the processor over a local memory bus (LMB) to provide local instruction and data caching. A timer is also attached to the MicroBlaze processor over an LMB to measure the execution times of specified code blocks.
The RASSD node also includes a multi-ported memory controller (MPMC) that provides the MicroBlaze processor, the PRR, the Ethernet MAC, and the SATA II controller with access to the 512 MB external DDR3 memory. The PRR is connected to the MPMC through a 64-bit native port interface (NPI) personality interface module (PIM) operating at 200 MHz. Other peripheral controllers in the RASSD node include a PLB timer that supports the multi-tasking RASSD OS, an interrupt controller, and a HWICAP controller for configuring the PRR with hardware accelerator partial bitstreams.
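As a rough illustration of how readings from the timer used for measuring code-block execution times might be turned into elapsed time, the following C sketch converts two counter readings into microseconds. It assumes a free-running 32-bit up-counter clocked at the 150 MHz MicroBlaze frequency. The counter width, the timer clock, and the names RASSD_TIMER_HZ, elapsed_ticks, and ticks_to_us are illustrative assumptions and are not taken from the RASSD OS itself.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: the timer ticks at the 150 MHz processor clock.
 * The actual timer clock is not stated in the description. */
#define RASSD_TIMER_HZ 150000000u

/* Ticks elapsed between two readings of a free-running 32-bit up-counter.
 * Unsigned subtraction stays correct across a single wrap-around. */
static uint32_t elapsed_ticks(uint32_t start, uint32_t end)
{
    return end - start;
}

/* Convert ticks to whole microseconds (rounded down).
 * The 64-bit intermediate avoids overflow in the multiplication. */
static uint64_t ticks_to_us(uint32_t ticks, uint32_t timer_hz)
{
    assert(timer_hz != 0);
    return ((uint64_t)ticks * 1000000u) / timer_hz;
}
```

Under these assumptions, a code block bracketed by two counter reads that differ by 150 ticks would be reported as taking 1 microsecond.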
RASSD OS is a three-layer software platform (illustrated in the figure below). The first layer provides the RASSD OS services, through which the middleware communicates with the hardware. These services rely on the second layer, Xilkernel, a small, robust, and modular kernel that provides OS services such as a file system, thread management, and scheduling. The third layer, consisting of the LibXil drivers and the Standalone layer, forms the lowermost hardware abstraction layer. The second and third layers together are called the board support package (BSP).
RASSD OS provides a set of services for the middleware server that hide the low-level details of the node’s hardware architecture. The services provide routines to access the data stored on the node; to receive, store, and load hardware accelerators or drivelets; and to retrieve the results obtained by running the accelerators and/or drivelets on the node.
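One way to picture the three service groups listed above is as a small dispatch table that maps a middleware request code to a handler routine. The C sketch below is purely illustrative: the request codes, the handler names, the placeholder return values, and the rassd_dispatch function are all hypothetical and do not reflect the actual RASSD OS interface or its wire protocol.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical request codes, one per service group described in the text. */
typedef enum {
    REQ_ACCESS_DATA = 0,   /* access data stored on the node */
    REQ_LOAD_ACCELERATOR,  /* receive, store, or load an accelerator/drivelet */
    REQ_GET_RESULTS,       /* retrieve results of a finished run */
    REQ_COUNT
} rassd_request_t;

typedef int (*rassd_handler_t)(const void *payload, size_t len);

/* Stub handlers: each returns a distinct placeholder value instead of
 * touching real hardware. */
static int handle_access_data(const void *payload, size_t len)
{
    (void)payload; (void)len;
    return 1;
}

static int handle_load_accelerator(const void *payload, size_t len)
{
    (void)payload; (void)len;
    return 2;
}

static int handle_get_results(const void *payload, size_t len)
{
    (void)payload; (void)len;
    return 3;
}

static const rassd_handler_t handlers[REQ_COUNT] = {
    handle_access_data,
    handle_load_accelerator,
    handle_get_results
};

/* Route a request to its handler; returns -1 for an unknown request code. */
static int rassd_dispatch(int request, const void *payload, size_t len)
{
    if (request < 0 || request >= REQ_COUNT) {
        return -1;
    }
    assert(handlers[request] != NULL);  /* every code must have a handler */
    return handlers[request](payload, len);
}
```

Keeping the mapping in a table means that adding a service only requires a new request code and a new table entry, while the middleware-facing entry point stays unchanged.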