The Mutant/Avengers Clusters


  Cluster Statistics (Ganglia) and Reservation System

   Cluster Configuration Information

The Mutant and Avengers clusters are two separate clusters that share the same authentication netgroup.

The Mutant cluster currently consists of 32 Dell PowerEdge servers, along with 15TB of network-mounted storage on a Dell PowerEdge 2850 and a Dell R500 (cerebro and avalon).  Most of the compute servers (24 of them) have the following configuration:
Quad Core Xeon L5410, 2x6M cache, 2.33GHz
1333 MHz Front Side Bus
24GB DDR 266MHz RAM
1TB SATA drive
Dual On-Board 10/100/1000 NICs
64 bit FC8 kernels
Storage: The Mutant cluster machines can access shared storage on CEREBRO, a Dell PowerEdge 2850 running Fedora 2 with 2GB RAM, which provides 1.3 terabytes of storage in a RAID-5 configuration. (CEREBRO is for file services only, not for login use.)  More storage is on AVALON, roughly 3.6TB mirrored. Note that the Linux machines mount your home directory over NFS.  For per-node local storage, use /local/home/yourlogin. For shared storage, use /cs/mutant/yourlogin and /cs/avalon/yourlogin.
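
As a small illustration of the storage layout described above, the following Python sketch builds the three per-user paths for a given login. The function name storage_paths and the dictionary keys are made up for this example; only the three path prefixes come from the text.

```python
from pathlib import PurePosixPath


def storage_paths(login):
    """Return the per-user storage locations named in the text for `login`.

    The keys ("local", "mutant", "avalon") are labels chosen for this
    sketch, not names used by the cluster itself.
    """
    return {
        "local": PurePosixPath("/local/home") / login,   # per-node local storage
        "mutant": PurePosixPath("/cs/mutant") / login,   # shared storage
        "avalon": PurePosixPath("/cs/avalon") / login,   # shared storage
    }
```

Calling storage_paths("yourlogin") and printing the values gives the same three paths listed in the paragraph above.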

We have another 12TB storage server coming. The new server will take over main storage duties and be named AVALON. At that point, the current AVALON will be renamed blackbird and become a 32GB memory-heavy compute server.

The Avengers cluster is a new cluster of 10 high-memory, compute-heavy servers from HP.  All of them mount a 10TB (post-mirroring) NFS share from the file server hulk.cs.ucsb.edu (16GB RAM).  The cluster is intended for memory- and processing-heavy jobs.  Each server has the following configuration:

HP DL360P Gen8 Server
8-core Intel Xeon E5-2650 (20M cache, 2GHz)
192 GB DDR3
4 GigE ports, 2 10GigE ports
4TB local HDD (2TB after mirroring)



  DNS names and Node Configurations

Avengers
128.111.73.21 Hulk (file server, 16GB RAM)
128.111.73.19 antman (8 core, 192GB RAM, 2x2TB mirrored)
128.111.73.11 blackwidow (8 core, 192GB RAM, 2x2TB mirrored)
128.111.73.12 captainamerica (8 core, 192GB RAM, 2x2TB mirrored)
128.111.73.15 drstrange (8 core, 192GB RAM, 2x2TB mirrored)
128.111.73.17 hawkeye (8 core, 192GB RAM, 2x2TB mirrored)
128.111.73.10 ironman (8 core, 192GB RAM, 2x2TB mirrored)
128.111.73.13 namor (8 core, 192GB RAM, 2x2TB mirrored)
128.111.73.18 quicksilver (8 core, 192GB RAM, 2x2TB mirrored)
128.111.73.20 scarletwitch (8 core, 192GB RAM, 2x2TB mirrored)
128.111.73.16 wasp (8 core, 192GB RAM, 2x2TB mirrored)
128.111.73.14 ultron (8 core, 192GB RAM, 2x2TB mirrored)

Mutants
128.111.40.209 Avalon (file server, 8GB RAM, 64bit)
128.111.40.244 Cerebro (file server, 8GB RAM)
128.111.40.249 Blackbird (32GB RAM, 64bit)
128.111.40.210 Sentinel (32GB RAM, 64bit)
128.111.40.215 colossus (4 core, 24GB RAM, 64bit)
128.111.40.218 jeangrey (4 core, 24GB RAM, 64bit)
128.111.40.220 beast (4 core, 24GB RAM, 64bit)
128.111.40.222 gambit (4 core, 24GB RAM, 64bit), twitter
128.111.40.224 psylocke (4 core, 24GB RAM, 64bit)
128.111.40.225 nightcrawler (4 core, 24GB RAM, 64bit)
128.111.40.226 jubilee (4 core, 24GB RAM, 64bit)
128.111.40.227 shadowcat (4 core, 24GB RAM, 64bit)
128.111.40.228 bishop (4 core, 24GB RAM, 64bit), twitter
128.111.40.239 havok (4 core, 24GB RAM, 64bit), twitter
128.111.40.223 marrow (4 core, 24GB RAM, 64bit)
128.111.40.229 mystique (4 core, 24GB RAM, 64bit)
128.111.40.230 whitequeen (4 core, 24GB RAM, 64bit)
128.111.40.231 sinister (4 core, 24GB RAM, 64bit)
128.111.40.232 blob (4 core, 24GB RAM, 64bit)
128.111.40.233 toad (4 core, 24GB RAM, 64bit)
128.111.40.234 omegared (4 core, 24GB RAM, 64bit)
128.111.40.235 exodus (4 core, 24GB RAM, 64bit)
128.111.40.236 mastermind (4 core, 24GB RAM, 64bit)
128.111.40.237 stryfe (4 core, 24GB RAM, 64bit)
128.111.40.245 robin (8 core, 8GB RAM, 64bit) cisco
128.111.40.212 professorx (8 core, 8GB RAM, 64bit, DOWN) cisco
128.111.40.213 magneto (8 core, 8GB RAM, 64bit, DOWN) cisco
128.111.40.243 phoenix (2GB), twitter
128.111.40.214 iceman (2GB, ganglia)
128.111.40.217 cyclops (2GB, root)
128.111.40.221 storm (2GB, root)
128.111.40.216 archangel (2GB), RETIRED
128.111.40.219 wolverine (2GB), RETIRED
128.111.40.238 sabertooth (2GB), RETIRED
128.111.40.240 cannonball (2GB), RETIRED
128.111.40.241 polaris (2GB), RETIRED
128.111.40.242 banshee (2GB), RETIRED


   Cluster Tools

Here are a couple of basic tools for working with the cluster.  Each Perl script runs the same command over ssh on every machine in the cluster.  To use them, you need the node name of each machine in the cluster.  The list of nodes is in this file.
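
For readers who want to see the idea behind these tools, here is a minimal Python sketch of running one command over ssh on a list of nodes. It is not the Perl scripts themselves. It assumes the node list is a text file with one node name per line (the actual file format is not described here), and the function names parse_node_list, build_ssh_command and run_on_all are invented for this example.

```python
import subprocess


def parse_node_list(text):
    """Parse node names from text, assuming one name per line.

    Blank lines and lines starting with '#' are skipped. This format is an
    assumption of the sketch, not a description of the real node-list file.
    """
    nodes = []
    for line in text.splitlines():
        line = line.strip()
        if line and not line.startswith("#"):
            nodes.append(line)
    return nodes


def build_ssh_command(node, remote_command):
    """Build the argument list for running `remote_command` on `node`."""
    # BatchMode=yes makes ssh fail rather than prompt for a password,
    # which suits unattended runs with keys or an ssh-agent.
    return ["ssh", "-o", "BatchMode=yes", node, remote_command]


def run_on_all(nodes, remote_command, timeout=30):
    """Run the same command on every node in turn.

    Returns {node: (returncode, output)}; returncode is None when the
    command timed out or ssh could not be started.
    """
    results = {}
    for node in nodes:
        try:
            proc = subprocess.run(
                build_ssh_command(node, remote_command),
                capture_output=True, text=True, timeout=timeout,
            )
            results[node] = (proc.returncode, proc.stdout.strip())
        except (subprocess.TimeoutExpired, OSError) as exc:
            results[node] = (None, str(exc))
    return results
```

For example, run_on_all(parse_node_list(open("nodes.txt").read()), "uptime") would collect the uptime output from each node, where nodes.txt is a hypothetical local copy of the node list. The sketch visits nodes one at a time for simplicity, so a slow or DOWN node delays the rest by up to the timeout.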

Note: you should first set up remote ssh access to these machines by storing your public key and authorizing it (for example in .ssh/authorized_keys) in your NFS home directory. To automate commands, use public keys with empty passphrases, or set up ssh-agent on the local machine. To avoid having to "accept" the ssh hostkey of each machine, download this hostkey file and append it to your .ssh/known_hosts file.
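
The append step can be done with a text editor or shell redirection. As a hedged alternative, the Python sketch below appends only those hostkey lines that are not already present, so running it twice does not create duplicates. The function name append_host_keys is invented for this example, and the path of the downloaded hostkey file is whatever you saved it as.

```python
from pathlib import Path


def append_host_keys(hostkey_file, known_hosts):
    """Append lines from hostkey_file that known_hosts does not already contain.

    Returns the number of lines added. Lines are compared as exact text
    after stripping surrounding whitespace.
    """
    hostkey_file = Path(hostkey_file)
    known_hosts = Path(known_hosts)

    existing_text = known_hosts.read_text() if known_hosts.exists() else ""
    seen = {line.strip() for line in existing_text.splitlines()}

    new_lines = []
    for line in hostkey_file.read_text().splitlines():
        line = line.strip()
        if line and line not in seen:
            new_lines.append(line)
            seen.add(line)

    if new_lines:
        # Create the .ssh directory if needed, readable only by the owner.
        known_hosts.parent.mkdir(mode=0o700, parents=True, exist_ok=True)
        with known_hosts.open("a") as fh:
            # Start on a fresh line if the file lacks a trailing newline.
            if existing_text and not existing_text.endswith("\n"):
                fh.write("\n")
            fh.write("\n".join(new_lines) + "\n")
    return len(new_lines)
```

A typical call would be append_host_keys("hostkeys.txt", Path.home() / ".ssh" / "known_hosts"), where hostkeys.txt is a hypothetical name for the downloaded file.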