The Mutant/Avengers

  Cluster Statistics (Ganglia) and Reservation System

   Cluster Configuration Information

The Mutant and Avengers clusters are two separate clusters under the same authentication netgroup.

The Mutant cluster currently consists of 32 Dell PowerEdge servers, along with 15TB of network-mounted storage on a Dell PowerEdge 2850 and a Dell R500 (cerebro and avalon).  Most of the compute servers (24 of them) have the following configuration:
Quad-core Xeon L5410, 2x6M cache, 2.33 GHz
1333 MHz Front Side Bus
1TB SATA drive
Dual on-board 10/100/1000 NICs
64-bit FC8 kernels
Storage: The Mutant cluster machines can access shared storage on CEREBRO, a Dell PowerEdge 2850 with 2GB RAM running Fedora 2. 1.3TB of storage is available in a RAID-5 configuration. (CEREBRO is for file services only, not for login use.)  Additional storage is on AVALON: roughly 3.6TB, mirrored.

Note that the Linux machines mount your home directories over NFS.  For per-node local storage, use /local/home/yourlogin.  For shared storage, use /cs/mutant/yourlogin and /cs/avalon/yourlogin.
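For scripts, it can help to keep the three storage locations in named variables. A minimal sketch, using the paths above; "yourlogin" is a placeholder for your actual cluster login:

```shell
# Storage locations on the Mutant cluster ("yourlogin" is a placeholder).
login=yourlogin
local_home=/local/home/$login      # per-node local disk (not shared)
mutant_share=/cs/mutant/$login     # RAID-5 shared storage on CEREBRO
avalon_share=/cs/avalon/$login     # mirrored shared storage on AVALON
```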

We have another 12TB storage server on the way; it will take over main storage duties and assume the name AVALON. The current AVALON will then be renamed blackbird and become a 32GB, memory-heavy compute server.

The Avengers cluster is a new cluster of 10 high-memory, compute-heavy servers from HP.  All of them mount a 10TB (post-mirroring) NFS share from the file server (16GB RAM).  The cluster is designed for memory- and processing-heavy jobs.  Each server has the following configuration:

HP DL360P Gen8 Server
8-core Intel Xeon E5-2650 (20M cache, 2GHz)
192 GB DDR3
4 GigE ports, 2 10GigE ports
4TB local HDD (2TB after mirroring)

  DNS names and Node Configurations

Avengers
hulk (file server, 16GB RAM)
antman (8 core, 192GB RAM, 2x2TB mirrored)
blackwidow (8 core, 192GB RAM, 2x2TB mirrored)
captainamerica (8 core, 192GB RAM, 2x2TB mirrored)
drstrange (8 core, 192GB RAM, 2x2TB mirrored)
hawkeye (8 core, 192GB RAM, 2x2TB mirrored)
ironman (8 core, 192GB RAM, 2x2TB mirrored)
namor (8 core, 192GB RAM, 2x2TB mirrored)
quicksilver (8 core, 192GB RAM, 2x2TB mirrored)
scarletwitch (8 core, 192GB RAM, 2x2TB mirrored)
wasp (8 core, 192GB RAM, 2x2TB mirrored)
ultron (8 core, 192GB RAM, 2x2TB mirrored)

Mutants
avalon (file server, 8GB RAM, 64bit)
cerebro (file server, 8GB RAM)
blackbird (32GB RAM, 64bit)
sentinel (32GB RAM, 64bit)
colossus (4 core, 24GB RAM, 64bit)
jeangrey (4 core, 24GB RAM, 64bit)
beast (4 core, 24GB RAM, 64bit)
gambit (4 core, 24GB RAM, 64bit), twitter
psylocke (4 core, 24GB RAM, 64bit)
nightcrawler (4 core, 24GB RAM, 64bit)
jubilee (4 core, 24GB RAM, 64bit)
shadowcat (4 core, 24GB RAM, 64bit)
bishop (4 core, 24GB RAM, 64bit), twitter
havok (4 core, 24GB RAM, 64bit), twitter
marrow (4 core, 24GB RAM, 64bit)
mystique (4 core, 24GB RAM, 64bit)
whitequeen (4 core, 24GB RAM, 64bit)
sinister (4 core, 24GB RAM, 64bit)
blob (4 core, 24GB RAM, 64bit)
toad (4 core, 24GB RAM, 64bit)
omegared (4 core, 24GB RAM, 64bit)
exodus (4 core, 24GB RAM, 64bit)
mastermind (4 core, 24GB RAM, 64bit)
stryfe (4 core, 24GB RAM, 64bit)
robin (8 core, 8GB RAM, 64bit), cisco
professorx (8 core, 8GB RAM, 64bit, DOWN), cisco
magneto (8 core, 8GB RAM, 64bit, DOWN), cisco
phoenix (2GB), twitter
iceman (2GB, ganglia)
cyclops (2GB, root)
storm (2GB, root)
archangel (2GB), RETIRED
wolverine (2GB), RETIRED
sabertooth (2GB), RETIRED
cannonball (2GB), RETIRED
polaris (2GB), RETIRED
banshee (2GB), RETIRED

   Cluster Tools

Here are a couple of basic tools for working with the cluster.  Each Perl script runs the same command over ssh across all machines in the cluster.  To use them, you need the node name of each machine in the cluster.  The list of nodes is in this file.
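The idea behind those scripts can be sketched in a few lines of shell (an illustrative stand-in, not the actual Perl scripts): read one node name per line and run the same command on each over ssh.

```shell
# Fan-out runner sketch: run the same command on every node in a file.
# SSH_CMD is a hypothetical override hook, handy for dry runs.
run_on_all() {
    nodefile=$1; shift
    while read -r node; do
        [ -z "$node" ] && continue
        # BatchMode fails fast instead of prompting; requires key-based auth.
        ${SSH_CMD:-ssh -o BatchMode=yes} "$node" "$@"
    done < "$nodefile"
}
```

For example, `run_on_all nodes.txt uptime` would print each node's load in turn; setting `SSH_CMD=echo` lets you preview the loop without touching the network.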

Note: you should first set up remote ssh access to these machines by storing your public key and authorization in your NFS home directory. To automate commands, use public keys with null passphrases, or set up ssh-agent on your local machine. To avoid having to "accept" ssh host keys from each machine, download this hostkey file and append it to your ~/.ssh/known_hosts file.
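The one-time setup above can be sketched as follows, using standard OpenSSH file locations (the known_hosts step is shown as a comment because the hostkey file name here is only a placeholder for the file linked above):

```shell
# One-time ssh setup sketch (standard OpenSSH locations; idempotent).
mkdir -p ~/.ssh && chmod 700 ~/.ssh
# Null passphrase so scripted commands need no prompt (skip if a key exists).
[ -f ~/.ssh/id_rsa ] || ssh-keygen -q -t rsa -N "" -f ~/.ssh/id_rsa
# Home directories are on NFS, so appending here authorizes you on every node.
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
chmod 600 ~/.ssh/authorized_keys
# Then append the downloaded hostkey file (name is a placeholder):
#   cat hostkeys >> ~/.ssh/known_hosts
```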