CHECS members use a wide variety of computing resources to carry out their research. These resources range from a rapidly growing collection of the latest processors, network cards, switches and storage devices, to high-end platforms used for production scientific computing, large-scale scalability analyses, and collaboration with VT and external researchers. Below we give further details about some of our largest or more interesting computing facilities. We group these facilities into "experimental" and "production", corresponding roughly to whether we have (and exploit) root access to the resource.
Experimental Facilities
Imola cluster. This cluster, built by the PEARL Lab, features four 8-way nodes with dual-core AMD Opteron Socket-F processors running at 2.4 GHz. Each node is organized in a NUMA topology with 8 dual-core processor sub-nodes, 2 GB of memory per processor sub-node and a HyperTransport interconnect. The nodes are connected with GigE. The cluster features customized OS modules for power management and memory management to achieve maximum efficiency in scientific HPC workloads.
PlayStation3 cluster. Students and faculty from the SCAPE, PEARL and SyNeRG labs have built a 24-node cluster out of PS3s.
ICE cluster. The SyNeRG lab has a 9-node (36-core) ICE cluster, in which each node has two dual-core AMD Opteron 2218 processors. It is used primarily for research in power-aware computing and high-performance networking.
Production Computing Facilities
System X. CHECS works closely with Virginia Tech's Advanced Research Computing facility (VT-ARC). The most powerful system available through VT-ARC is System X, an 1100-node (2200-processor) cluster, which was designed and built under the leadership of CHECS faculty members. Each System X node is a dual-processor (64-bit, 2.3 GHz IBM PPC970) Apple G5 Xserve with 4 GB of memory and an 80 GB disk, for an aggregate 4.4 terabytes of main memory and 88 terabytes of temporary storage. In addition, a 53-terabyte network-attached storage facility is available to System X users. The nodes of System X are interconnected over two communication fabrics: an InfiniBand switching fabric and a Gigabit Ethernet fabric. The 2304-port InfiniBand fabric provides 20 Gbps of bandwidth per node with less than 8 microseconds of latency, and is the primary fabric for parallel communication. The 1200-port switched Gigabit Ethernet fabric is used for system management and job startup. System X has a peak performance of 20.24 TeraFlops and a sustained performance of 12.25 TeraFlops. System X made its debut as the #3 most powerful supercomputer in the world in the November 2003 Top500 rankings. It is currently ranked 108th, and still ranks among the top 10 supercomputers at U.S. academic institutions.
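The aggregate figures quoted for System X follow from the per-node numbers, and the short Python sketch below reproduces that arithmetic. The node count, clock rate, memory, disk and sustained performance come from the text above. The value of 4 floating-point operations per cycle per processor is an assumption (two floating-point units, each performing a fused multiply-add) that is not stated here, and the function and constant names are purely illustrative.

```python
# Back-of-the-envelope check of the System X aggregate figures.
NODES = 1100              # from the text
PROCS_PER_NODE = 2        # dual-processor nodes, from the text
CLOCK_GHZ = 2.3           # IBM PPC970 clock rate, from the text
MEM_PER_NODE_GB = 4       # from the text
DISK_PER_NODE_GB = 80     # from the text
SUSTAINED_TFLOPS = 12.25  # from the text

# ASSUMPTION (not stated in the text): each processor can retire
# 4 floating-point operations per clock cycle.
FLOPS_PER_CYCLE = 4


def aggregate_figures():
    """Return the system-wide totals derived from the per-node numbers."""
    processors = NODES * PROCS_PER_NODE
    # GHz * flops/cycle gives GFlops per processor; divide by 1000 for TFlops.
    peak_tflops = processors * CLOCK_GHZ * FLOPS_PER_CYCLE / 1000
    return {
        "processors": processors,
        "memory_tb": NODES * MEM_PER_NODE_GB / 1000,
        "disk_tb": NODES * DISK_PER_NODE_GB / 1000,
        "peak_tflops": peak_tflops,
        "sustained_fraction": SUSTAINED_TFLOPS / peak_tflops,
    }


if __name__ == "__main__":
    for name, value in aggregate_figures().items():
        print(f"{name}: {value:.4g}")
```

Under the stated assumption, the computed peak agrees with the quoted 20.24 TeraFlops, and the sustained figure works out to roughly 61% of peak.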
VT-ARC shared-memory systems. VT-ARC currently has three SGI Altix systems which support shared-memory parallel applications, with 20, 64 and 128 processors respectively.
Anantham. A 200-node Linux cluster is available to CHECS members for parallel code development and debugging, and to collaborators from the College of Engineering for production computational science and engineering applications. Associated most closely with the Laboratory for Advanced Scientific Computing and Applications (LASCA), the Anantham cluster includes 400 2.0 GHz AMD Opteron processors, with 200 GB of memory and 2.0 terabytes of disk space. The nodes of the cluster are interconnected by Fast Ethernet and a 2.56 Gb/s Myrinet network.
Ojibwa. LASCA also houses a shared-memory SGI Altix 3300 with 12 nodes, 24 GB of memory, and 292 GB of disk space.
© 2006 Virginia Tech


