- What Disks to Use
- Disk Quotas
- Disk Usage Monitoring
(this is still a draft, some items need to be fixed)
1. What Disks to Use
1.a Where to Store my Stuff
1.b Disk Configuration
All the disk space available on the cluster is mounted off a dedicated device (aka appliance or server), a NetApp filer.
The current disk configuration is as follows:
Disk name | Maximum disk capacity | Quota per user: disk space (soft/hard) | Quota per user: no. of files (soft/hard) | NetApp snapshots enabled? | What disk shall I use?
---|---|---|---|---|---
 | | 50/ | 1.8/2M | | For your basic configuration files, scripts and job files - your limit is low but you can recover old stuff up to 4 weeks.
/ | | | 4/5M | | For the bulk of your storage - your limit is high, and disk space is released right away, for SAO users.
 | | 1.8/2.0TB | 1.8/2M | | For the bulk of your storage - your limit is high, and disk space is released right away, for non-SAO users.
 | | | 1/2M | | For important but relatively small files like final results, etc. - your limit is medium, you can recover old stuff, but disk space is not released right away. For SAO users.
 | | 1.0/2.0TB | 1/2M | | For important but relatively small files like final results, etc. - your limit is medium, you can recover old stuff, but disk space is not released right away. For non-SAO users.
 | | | 1/1M | | If you need more than what you can keep in - SAO/non-SAO users should use. The FIFO (first in, first out) purging model has yet to be implemented as we tune the system.
Notes
The sizes of the file systems (aka the disks) on the NetApp will "auto-grow" until they reach the listed maximum capacity, so the size shown by the command df does not always reflect the maximum size.
To prevent the disks from filling up and hosing the cluster:
- disk usage is limited to:
- the amount of disk space listed under quota per user, and,
- the number of files and directories listed under no. of files (in fact "inodes": the sum of the number of files and the number of directories).
- exceeding the soft limit produces warnings; while
- the hard limit cannot be exceeded, producing errors.
The Linux command quota is not (yet) working with the NetApp filer. We compile a daily quota report and provide tools to query the quotas and parse the quota report. (need to insert links to these tools)
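Until the quota tools are linked here, you can get a rough idea of how many inodes a directory tree uses with standard commands. The following is only a sketch: the function name count_inodes is made up for this example, and its count (regular files plus directories, as described above) may differ from what the NetApp quota report shows.

```shell
#!/bin/bash
# count_inodes DIR
# Hypothetical helper (not one of the cluster's tools): prints the number of
# regular files plus directories under DIR, including DIR itself.
# File names containing newlines would be counted more than once.
count_inodes() {
    local dir="${1:-.}"
    # -xdev keeps find from crossing into other file systems
    find "$dir" -xdev \( -type f -o -type d \) -print | wc -l | tr -d ' '
}
```

For example, count_inodes /scratch/myname would print a single number to compare against the soft and hard limits on the number of files (the path is a placeholder).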
Once we secure more space for /scratch, we will implement a FIFO (first in first out) model, where old files are deleted without warning to make space.
- There will be a minimum age limit, meaning that only files older than (let's say) 3 months will be deleted.
- We will try to keep /scratch from filling up by running a scrubber regularly.
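To see which of your files might become candidates for such a purge, you can list files by age. The sketch below assumes that age is measured by last modification time and uses 90 days as a stand-in for the "3 months" mentioned above; neither has been decided, and the function name list_old_files is made up for this example. It only lists files and deletes nothing.

```shell
#!/bin/bash
# list_old_files DIR [DAYS]
# Hypothetical helper: prints regular files under DIR whose last modification
# is more than DAYS days old (default 90). find counts age in whole 24-hour
# periods, so "+90" means at least 91 full days.
list_old_files() {
    local dir="$1"
    local days="${2:-90}"
    find "$dir" -xdev -type f -mtime +"$days" -print
}
```

For example, list_old_files /scratch/myname 90 would print one path per line (the path is a placeholder).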
2. Disk Quotas
3. NetApp Snapshots: How to Recover Old or Deleted Files.
Some of the disks on the NetApp filer have the so-called "snapshot mechanism" enabled:
- This allows users to recover deleted files or access an older version of a file.
- Indeed, the NetApp filer makes a "snapshot" copy of the file system (the content of the disk) every so often and keeps these snapshots up to a given age.
- So if we enable hourly snapshots and set a two-week retention, you can recover a file as it was hours ago, days ago or weeks ago, but only up to two weeks ago.
- The drawback of the snapshots is that when files are deleted, the disk space is not freed until the deleted files age out, i.e., 2 or 4 weeks later.
How to Use the NetApp Snapshots:
To recover an old version or a deleted file, foo.dat, that was (for example) in /data/genomics/frandsen/important/results/:
- If the file was deleted:
  cd /data/genomics/.snapshot/XXXX/frandsen/important/results
  cp -pi foo.dat /data/genomics/frandsen/important/results/foo.dat
- If you want to recover an old version:
  cd /data/genomics/.snapshot/XXXX/frandsen/important/results
  cp -pi foo.dat /data/genomics/frandsen/important/results/old-foo.dat
- The -p will preserve the file's modification time, permissions and ownership, and the -i will prompt before overwriting an existing file.
- The XXXX is to be replaced by either:
  hourly.YYYY-MM-DD_HHMM
  daily.YYYY-MM-DD_0010
  weekly.YYYY-MM-DD_0015
  where YYYY-MM-DD is a date specification (e.g., 2015-11-01).
- The files under .snapshot are read-only:
  - they can be recovered using cp, tar or rsync; but
  - they cannot be moved (mv) or deleted (rm).
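Before copying anything back, it can help to see which snapshots still hold a copy of the file. The sketch below loops over the snapshot directories and lists every copy it finds; the function name list_snapshot_versions is made up for this example, and it assumes the .snapshot layout shown in the recovery commands above.

```shell
#!/bin/bash
# list_snapshot_versions SNAPDIR RELPATH
# Hypothetical helper: for each snapshot directory under SNAPDIR, runs
# "ls -l" on RELPATH if it exists there, so sizes and dates can be compared.
list_snapshot_versions() {
    local snapdir="$1"
    local relpath="$2"
    local snap
    for snap in "$snapdir"/*/; do
        # "$snap" ends with a slash; skip snapshots without the file
        if [ -e "${snap}${relpath}" ]; then
            ls -l "${snap}${relpath}"
        fi
    done
}
```

With the example above, the call would be list_snapshot_versions /data/genomics/.snapshot frandsen/important/results/foo.dat, printing one line per snapshot that contains foo.dat.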
4. Disk Usage Monitoring
The following tools can be used to monitor disk usage
(more to come)
5. Local Disk and SSDs
(more to come)
Last Updated SGK.