1. What Disks to Use
2. Disk Quotas
3. NetApp Snapshots: How to Recover Old or Deleted Files
4. Disk Usage Monitoring
5. Local Disk and SSDs

(This is still a draft; some items need to be fixed.)

1. What Disks to Use

1.a Where to Store my Stuff

1.b Disk Configuration

All the disk space available on the cluster is mounted off a dedicated device (aka appliance or server), a NetApp filer.

The current disk configuration is as follows:

                   Maximum    Quotas per user              NetApp
                   disk       disk space    no. of files   snapshots
 Disk name         capacity   soft/hard     soft/hard      enabled?
 ----------------  ---------  ------------  -------------  --------------
 /home             8TB        50/100GB      1.8/2M         yes: 4 weeks
 /pool/sao         60TB       1.8/2.0TB     4/5M           no
 /pool/genomics    50TB       1.8/2.0TB     1.8/2M         no
 /data/sao         20TB       2.8/3.0TB     1/2M           yes: 2 weeks
 /data/genomics    10TB       1.0/2.0TB     1/2M           yes: 2 weeks
 /scratch          50TB       2.8/3.0TB     1/1M           no, FIFO model

What disk shall I use?

 /home
   For your basic configuration files, scripts and job files
   - your limit is low but you can recover old stuff up to 4 weeks.

 /pool/sao
   For the bulk of your storage
   - your limit is high, and disk space is released right away.
   For SAO users.

 /pool/genomics
   For the bulk of your storage
   - your limit is high, and disk space is released right away.
   For non-SAO users.

 /data/sao
   For important but relatively small files like final results, etc.
   - your limit is medium, you can recover old stuff, but disk space is not released right away.
   For SAO users.

 /data/genomics
   For important but relatively small files like final results, etc.
   - your limit is medium, you can recover old stuff, but disk space is not released right away.
   For non-SAO users.

 /scratch
   If you need more than what you can keep in /pool
   - SAO/non-SAO users should use /scratch/sao or /scratch/genomics, respectively.
   The FIFO model (first in first out) purging has yet to be implemented as we tune the system.

Notes

The sizes of the file systems (aka the disks) on the NetApp will "auto-grow" until they reach the listed maximum capacity, so the size shown by the command df does not always reflect the maximum size.

To prevent the disks from filling up and hosing the cluster:

The Linux command quota is not (yet) working with the NetApp filer. We compile a daily quota report and provide tools to query the quotas and parse the quota report. (need to insert links to these tools)

Once we secure more space for /scratch, we will implement a FIFO (first in first out) model, where old files are deleted without warning to make space.
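Until the quota tools are linked here, a rough self-check against the per-user limits listed above can be done with standard commands. The following is only a minimal sketch, not one of the cluster's quota tools: the function name usage_report is hypothetical, and it counts regular files only, so the number the filer reports may differ.

```shell
#!/bin/sh
# usage_report DIR: print total size (in KB) and number of regular files
# under DIR, to compare by hand with the soft/hard quotas.
# Hypothetical helper, not part of the cluster's quota tools.
usage_report() {
    dir=$1
    # du -sk prints "<size in KB><TAB><dir>"; keep the size only
    kb=$(du -sk "$dir" | cut -f1)
    # count regular files; the filer's own file count may differ
    nfiles=$(find "$dir" -type f | wc -l | tr -d ' ')
    echo "$dir: $kb KB in $nfiles files"
}
```

It could be called as usage_report /pool/genomics/$USER, assuming your files sit in a directory named after your user name (an assumption made for illustration). Walking a large directory tree takes time and puts load on the filer, so run it sparingly.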

 

2. Disk Quotas

 

3. NetApp Snapshots: How to Recover Old or Deleted Files

Some of the disks on the NetApp filer have the so-called "snapshot mechanism" enabled: /home (snapshots kept for 4 weeks), and /data/sao and /data/genomics (snapshots kept for 2 weeks).

How to Use the NetApp Snapshots:

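To recover an old version of a file, or a deleted file, foo.dat, that was (for example) in /data/genomics/frandsen/important/results/, copy it back from the snapshot directory, where XXXX stands for the name of the snapshot to restore from:

 cd /data/genomics/.snapshot/XXXX/frandsen/important/results
 cp -pi foo.dat /data/genomics/frandsen/important/results/foo.dat

or, to keep the current file and save the old version under a different name:

 cd /data/genomics/.snapshot/XXXX/frandsen/important/results
 cp -pi foo.dat /data/genomics/frandsen/important/results/old-foo.dat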
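Before copying, it helps to see which snapshots actually hold a copy of the file. The following is a minimal sketch under stated assumptions: the function name list_versions is hypothetical, and it assumes each snapshot appears as a subdirectory of the disk's .snapshot directory, as in the paths used above.

```shell
#!/bin/sh
# list_versions SNAPROOT RELPATH: list every snapshot copy of one file.
# SNAPROOT is the disk's snapshot directory, RELPATH the file's path
# below the top of that disk. Hypothetical helper for illustration.
list_versions() {
    snaproot=$1
    relpath=$2
    for snap in "$snaproot"/*; do
        # skip snapshots taken when the file did not exist
        if [ -f "$snap/$relpath" ]; then
            ls -l "$snap/$relpath"
        fi
    done
}
```

For the example above, this would be called as list_versions /data/genomics/.snapshot frandsen/important/results/foo.dat. The size and date on each line help decide which snapshot name to use in place of XXXX.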

4. Disk Usage Monitoring

The following tools can be used to monitor disk usage:

(more to come)
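Until the cluster-specific tools are documented here, the standard commands df and du give a first look. The following is a minimal sketch, not one of the cluster's monitoring tools: the function name biggest_dirs is hypothetical, and sizes are reported in KB.

```shell
#!/bin/sh
# biggest_dirs DIR [N]: list the N largest entries directly under DIR,
# largest first, with sizes in KB (N defaults to 10).
# Hypothetical helper for illustration.
biggest_dirs() {
    dir=$1
    n=${2:-10}
    # du -sk: one total per entry; sort -rn: numeric, largest first
    du -sk "$dir"/* | sort -rn | head -n "$n"
}
```

For instance, df -h /pool/genomics shows how full the whole disk is (keeping in mind that the reported size can still auto-grow), while biggest_dirs /pool/genomics/$USER 5 would show where your own space goes, assuming a directory named after your user name.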

5. Local Disk and SSDs

(more to come)


Last Updated SGK.