...

  • At the end of the job script, you must add instructions to copy the results of your analysis back to /pool (or /scratch, or /data).


    If the results are easily identifiable, you can use the commands mv, tar, and find. Here are a few examples:


    1. Move the directory where all the results are stored and the log file, delete the rest.

      moving identifiable results, delete the rest

      # move results and log file back
      cd $SSD_DIR
      mv results /pool/genomics/smart1/great/project/wild-cat/.
      mv wow.log /pool/genomics/smart1/great/project/wild-cat/.
      #
      # delete the rest
      rm -rf *


    2. Move the directory where all the results are stored and the log file, delete the input (conservative approach, in case you missed something).

      moving identifiable results, delete known input sets

      # move results and log file back
      cd $SSD_DIR
      mv results /pool/genomics/smart1/great/project/wild-cat/.
      mv wow.log /pool/genomics/smart1/great/project/wild-cat/.
      #
      # delete input set and other stuff
      rm -rf input
      rm wow.gen wow.conf


    3. Move using the --update flag of mv (see man mv).

      moving using --update

      # move results using --update
      cd $SSD_DIR
      mv --update * /pool/genomics/smart1/great/project/wild-cat/.
      #
      # delete the rest
      rm -rf *


      Note that you can use mv --update on an explicit list (of files, directories, or file specifications), not just * (everything). You also do not have to remove the rest: you can remove only what you know is safe to remove (conservative approach).
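The conservative variant described in the note could be sketched as follows. The two temporary directories standing in for $SSD_DIR and the /pool destination, and the fake files created in them, are assumptions made only so the snippet runs on its own; mv --update is assumed to be the GNU coreutils version.

```shell
#!/bin/bash
# Hypothetical stand-ins for $SSD_DIR and the /pool destination
SSD_DIR=$(mktemp -d)
DEST=$(mktemp -d)

# fake some results, a log file and an input set (illustration only)
mkdir -p "$SSD_DIR/results" "$SSD_DIR/input"
echo "result" > "$SSD_DIR/results/out.txt"
echo "log"    > "$SSD_DIR/wow.log"
echo "input"  > "$SSD_DIR/input/in.dat"

# stop if the cd fails, so nothing is moved or removed from the wrong directory
cd "$SSD_DIR" || exit 1

# move an explicit list only, not * (everything)
mv --update results wow.log "$DEST"/.

# remove only what is known to be safe to remove
rm -rf input
```

Anything not named on the mv or rm lines stays where it is, so a result you forgot to list is not lost.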
       

    4. Find newer files and move them: the trick is to create a 'timestamp' file before starting the analysis.
       That file can be used later to find any newer file with the --newer= option of tar (see man tar):

      Using a timestamp file and tar --newer=

      # set the timestamp
      date > $SSD_DIR/started.txt
      # run the analysis
      ...
      # copy the new files in the subdir data/ to a compressed tar-ball
      # (f must come last in -czf, since the archive name follows it)
      cd $SSD_DIR
      tar --newer=$SSD_DIR/started.txt -czf /pool/genomics/smart1/great/project/wild-cat-results.tgz data/
      # now remove it
      rm -rf data/
      # etc...
      # delete everything, unless
      rm -rf *



      See the previous comments on what to tar and what to remove: once you have tar'd the new files in data/, remove data/, etc.

    5. Using the timestamp file and the find command (see man find):

      Using find and a timestamp file

      # set the timestamp
      date > $SSD_DIR/started.txt
      # run the analysis
      ...
      # find the new files in the subdir data/
      cd $SSD_DIR
      find data/ -newer $SSD_DIR/started.txt -type f > /tmp/list
      # do the same on logs/, append to the list
      find logs/ -newer $SSD_DIR/started.txt -type f >> /tmp/list
      # etc...
      # now save what is in the list with one tar
      # (the list already names the files, so no directory argument is given)
      tar --files-from=/tmp/list -czf /pool/genomics/smart1/great/project/wild-cat-results.tgz
      # now remove data/ and logs/
      rm -rf data/ logs/
      # etc...
      # delete everything, unless
      rm -rf *


      There are many more ways to accomplish this.
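One such variation, sketched here under stated assumptions, is to check the archive before deleting anything: list the tar-ball and compare the number of entries with the number of files found. The temporary directories standing in for $SSD_DIR and the /pool destination, the file names, and the `sleep 1` standing in for the analysis are all hypothetical.

```shell
#!/bin/bash
# Hypothetical stand-ins for $SSD_DIR, the /pool destination and /tmp/list
SSD_DIR=$(mktemp -d)
DEST=$(mktemp -d)
LIST=$(mktemp)

cd "$SSD_DIR" || exit 1
mkdir -p data logs

# an input file that predates the analysis (backdated for the illustration)
echo "old input" > data/input.dat
touch -t 200001010000 data/input.dat

# set the timestamp
date > "$SSD_DIR/started.txt"

# stand-in for the analysis: wait, then write a result and a log
sleep 1
echo "new result" > data/result.txt
echo "log line"   > logs/run.log

# list the files newer than the timestamp
find data/ -newer "$SSD_DIR/started.txt" -type f >  "$LIST"
find logs/ -newer "$SSD_DIR/started.txt" -type f >> "$LIST"

# save what is in the list with one tar
tar --files-from="$LIST" -czf "$DEST/wild-cat-results.tgz"

# delete only if the archive holds as many entries as the list
N_LIST=$(wc -l < "$LIST")
N_TAR=$(tar -tzf "$DEST/wild-cat-results.tgz" | wc -l)
if [ "$N_LIST" -eq "$N_TAR" ]; then
  rm -rf data/ logs/
else
  echo "archive looks incomplete, keeping data/ and logs/" >&2
fi
```

Counting entries is a coarse check; it catches a failed or truncated tar, not corrupted file contents.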



      BTW, the advantage of writing a .tgz file rather than moving files is twofold, assuming your data are compressible:

      1. You write less data to the .tgz file, so the copy should finish faster (reading and compressing should be fast, writing is the slow step).

      2. You need less disk space for your output (since it is compressed).
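To get a feel for how compressible your own output is, a quick size comparison along these lines may help. The working directory, the file name reads.txt, and its repetitive content are made up for the illustration; real results may compress much less.

```shell
#!/bin/bash
# Hypothetical scratch area for the comparison
WORK=$(mktemp -d)
cd "$WORK" || exit 1
mkdir data

# a highly repetitive (hence very compressible) stand-in for real output
awk 'BEGIN { for (i = 0; i < 100000; i++) print "ACGTACGTACGT" }' > data/reads.txt

# write the compressed tar-ball
tar -czf results.tgz data/

# compare the number of bytes before and after compression
RAW_BYTES=$(wc -c < data/reads.txt)
TGZ_BYTES=$(wc -c < results.tgz)
echo "raw: $RAW_BYTES bytes, compressed: $TGZ_BYTES bytes"
```

If the two numbers are close (for example, when the output is already compressed), the .tgz route saves little and a plain mv may be simpler.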

...