Changes between Version 17 and Version 18 of net.sf.basedb.opengrid/using


Timestamp:
Aug 21, 2020, 2:05:25 PM (4 years ago)
Author:
Nicklas Nordborg
Comment:

Added information about Slurm to chapter about creating a job script

  • net.sf.basedb.opengrid/using

    v17 v18  
    69 69 to only list clusters with some specific properties. Predefined filter implementations can be found in the `net.sf.basedb.opengrid.filter` package:
    70 70
    71  * `xxx`: Filter on type of cluster (eg. Slurm or Open Grid)
    72  * `xxx`: Only return clusters that we can connect to
    73  * `xxx`: Only return clusters that we connect to with a given username
     71 * `ClusterTypeFilter`: Filter on type of cluster (eg. Slurm or Open Grid)
     72 * `IsConnectedFilter`: Only return clusters that we can connect to
     73 * `UsernameFilter`: Only return clusters that we connect to with a given username
    74 74
    75 75 == Creating a job script ==
    76 76
    77  In its simplest form a job script is only a string with one or more (bash) commands to execute. For example, `pwd; ls` is a valid job script that prints the current directory and then lists all files in it. To help you create longer and more complex scripts the `ScriptBuilder` class can be used. The `cmd()`, `echo()` and `comment()` methods are more or less self-describing. It is possible to start a command in the background with `bkgr()`, but note that this must be paired with a `waitForProcess()`; otherwise the job script may finish before the command that is running in the background, which may cause unpredictable results. The `ScriptBuilder.progress()` method is very useful for jobs that are expected to take a long time to run. The method writes progress information to the `${WD}/progress` file. This information is picked up by the Open Grid Service and reported back to the BASE job that is acting as a proxy.
     77 In its simplest form a job script is only a string with one or more (bash) commands to execute. For example, `pwd; ls` is a valid job script that prints the current directory and then lists all files in it. To help you create longer and more complex scripts the `ScriptBuilder` class can be used. The `cmd()`, `echo()` and `comment()` methods are more or less self-describing. It is possible to start a command in the background with `bkgr()`, but note that this must be paired with a `waitForProcess()`; otherwise the job script may finish before the command that is running in the background, which may cause unpredictable results. The `ScriptBuilder.progress()` method is very useful for jobs that are expected to take a long time to run. The method writes progress information to the `${WD}/progress` file. This information is picked up by the Job scheduler service and reported back to the BASE job that is acting as a proxy.
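The pairing of a background command with a wait can be illustrated in plain bash. This is only a sketch of the underlying shell pattern, not the literal output of `ScriptBuilder`; the `sleep` command is a stand-in for a real long-running program, and the variable names `BKGR_PID` and `BKGR_STATUS` are made up for this example.

```shell
#!/bin/bash
# Plain-bash sketch of the background/wait pattern (assumption: this is
# not what ScriptBuilder generates verbatim). `sleep` stands in for a
# real long-running command.

# Start the command in the background and remember its process id.
sleep 1 &
BKGR_PID=$!

# Foreground work can continue while the background command runs.
echo "foreground work done"

# Block until the background command exits and capture its exit status.
# Without this step the script could end while the command is still running.
wait "${BKGR_PID}"
BKGR_STATUS=$?
echo "background command exited with status ${BKGR_STATUS}"
```

Waiting on the specific process id, rather than calling a bare `wait`, also makes the exit status of the background command available to the rest of the script.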
    78 78
    79 79 When creating a job script you may find the following variables useful:
     
    82 82 * `${TMPDIR}`: A temporary working directory that is typically only available on the node the job is running on. Unless the job is started in debug mode, this directory is deleted soon after the job has finished.
    83 83 * `${NSLOTS}`: The number of slots that have been assigned to this job. If the job is starting a multi-threaded analysis program it is common practice to not use more threads than this value specifies. Note that a single node may run more than one job at the same time and that one slot typically corresponds to one CPU core.
     84
     85 **Note!** `NSLOTS` is a variable that is set by the Open Grid software, which also sets many other variables that the job script can use. Slurm has a different set of variables. For backwards compatibility, when running on a Slurm cluster a wrapper script will set `NSLOTS=SLURM_JOB_CPUS_PER_NODE`.
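As a defensive sketch, a job script could pick its thread count from whichever variable is available. The `thread_count` function below is a hypothetical helper and not part of the opengrid extension. Since the wrapper script described above already sets `NSLOTS` on Slurm, the fallback only matters if a script is run outside that wrapper.

```shell
#!/bin/bash
# Hypothetical helper (not part of the opengrid extension): choose a
# thread count. Prefers NSLOTS, falls back to SLURM_JOB_CPUS_PER_NODE,
# and finally to a single thread.
thread_count() {
  local n="${NSLOTS:-${SLURM_JOB_CPUS_PER_NODE:-1}}"
  # On multi-node allocations Slurm may report values such as "4(x2)";
  # keep only the leading digits.
  n="${n%%[!0-9]*}"
  echo "${n:-1}"
}

THREADS="$(thread_count)"
echo "Using ${THREADS} thread(s)"
```

The resulting `THREADS` value can then be passed to the analysis program's own thread option, so that it never uses more threads than the slots assigned to the job.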
    84 86
    85 87 In the code example below we assume that we have FASTQ files stored on a file server on the network. We want to align the FASTQ files with Tophat and we have a wrapper script that sets most of the parameters. We only need to provide the number of threads and the location of the FASTQ files. After Tophat we have a second post-alignment script that performs further processing and saves the result in a subdirectory (`${TMPDIR}/result`).