- The cluster, known as Hydra, is made of
- two login nodes,
- one front-end node,
- a queue manager/job scheduler (the Univa Grid Engine, or UGE), and
- a slew of compute nodes.
- From either login node you submit and monitor your job(s) via the queue manager/job scheduler, the Grid Engine (GE or UGE); see the example after this list.
- The Grid Engine runs on the front-end node (hydra-5.si.edu), hence the front-end node should not be used as a login node.
- There is no reason for users to ever log on to Hydra-5.
- All the nodes (login, front-end and compute nodes) are interconnected
- via Ethernet (at 10Gbps, aka 10GbE), and
- via InfiniBand (at 40Gbps or higher, aka IB).
- The disks are mounted off 3 types of dedicated devices:
- a NetApp filer,
- a GPFS, and
- a NAS for /store (via 10GbE), a near-line storage only available on some nodes.
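To illustrate job submission, here is a minimal sketch of a Grid Engine job script and the commands used to submit and monitor it. The script contents, file names, and any queue or resource requests are illustrative only; the actual options to use on Hydra are site-specific.

```bash
# hello.job -- a minimal, illustrative Grid Engine job script
#$ -N hello          # job name
#$ -cwd              # run from the current working directory
#$ -j y              # merge stdout and stderr
#$ -o hello.log      # write output to hello.log
echo "Running on $(hostname)"
```

```bash
qsub hello.job       # submit the job from a login node
qstat -u $USER       # monitor your pending/running jobs
qdel <job_id>        # delete a job, if needed
```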
The cluster runs Linux: we use BrightCluster (8.2) to deploy CentOS 7.6 (Core). (It previously ran Rocks 6.3, a cluster-specific distribution, with CentOS 6.9.)
- As for any Unix system, you must properly configure your account to access the system resources: your ~/.profile and/or ~/.cshrc files need to be adjusted accordingly.
- The configuration on the cluster is different from the one on the HEA-managed machines (for SAO users).
We have implemented the command module to simplify the configuration of your Unix environment (see the example below).
- You can look in the directory ~hpc/ for examples of configuration files (with ls -la ~hpc).
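For example, a typical interactive use of module looks like the sketch below; the module name used here (intel) is only a placeholder, so check module avail for the names actually installed. The same module load lines can also go into your ~/.profile or ~/.cshrc.

```bash
module avail          # list the modules available on the cluster
module load intel     # load a module (placeholder name, for illustration)
module list           # show which modules are currently loaded
module unload intel   # unload it when no longer needed
```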
- GNU compilers (gcc, g++, gfortran, g90)
- Intel compilers and the Cluster Studio (debugger, profiler, etc.: ifort, icc, ...)
- Portland Group (PGI) compilers and the Cluster Dev Kit (debugger, profiler, etc.: pgf77, pgcc, ...)
- We have 128 run-time licenses for IDL; GDL and FL are available too.
- Tools like MATLAB, Java, Python, R, Julia, etc. are available; and
- the Bioinformatics and Genomics support group has installed a slew of packages.
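As a sketch of how these tools are typically used, the lines below load a compiler suite and build a small Fortran program; the module names and file name are illustrative, not a statement of what is installed on Hydra.

```bash
# choose a compiler suite via modules (names are illustrative)
module load gcc
gfortran -O2 -o hello hello.f90   # GNU Fortran

module load intel
ifort -O2 -o hello hello.f90      # Intel Fortran
```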
The cluster is located in Herndon, VA and is managed by ORCS/OCIO (Office of Research Computing Services/Office of the Chief Information Officer).
- DJ Ding (DingDJ@si.edu) is the system administrator (at OCIO, Herndon, VA).
- As the sys-admin, he is responsible for keeping the cluster operational and secure.
- Rebecca Dikow (DikowR@si.edu) provides Bioinformatics and Genomics support (Data Science Lab/OCIO, Washington, D.C.); she is the primary support person for Bioinformatics and Genomics at SI.
- Matthew Kweskin (KweskinM@si.edu) - NMNH/L.A.B., IT specialist (Washington, D.C.).
- Sylvain Korzennik (firstname.lastname@example.org), an astronomer at SAO (Cambridge, MA); he is the primary support person for astronomers at SAO.
Support is also provided by other OCIO staff members (networking, etc...).
- For sys-admin issues (forgotten password, something is not working anymore, etc.):
- all users should contact DJ (and Rebecca & Sylvain) at SI-HPC-Admin@si.edu.
- For application support (how do I do this? why did that fail? etc.):
- Password problems: go to the self-serve password page.
- Please use these email addresses, rather than emailing individuals directly, so the SI/HPC support team can address your issues as soon as possible.
A mailing list, called HPCC-L, on SI's listserv (i.e., at HPCC-L@si-listserv.si.edu) is available to contact all the users of the cluster.
- replies to these messages are by default broadcast to the entire list; and
- you will need to set up a password on this listserv the first time you use it (look in the upper right, under "Options").
Last updated 05 Sep SGK.