Changes between Version 20 and Version 21 of net.sf.basedb.opengrid/using


Timestamp:
Aug 24, 2020, 2:14:31 PM
Author:
Nicklas Nordborg
Comment:

Updated documentation with Slurm information

  • net.sf.basedb.opengrid/using

    v20 v21  
    296296=== Uploading data files as part of a !JobDefinition ===
    297297
    298 The `JobDefinition` that is used for submitting a job to an Open Grid Cluster has the ability to upload files that are needed for the job. This is done by calling the `JobDefinition.addFile()` method with an `UploadSource` parameter. The `UploadSource` is an interface but we have provided several implementations that wrap, for example, a `String`, a BASE `File` item or an `InputStream`.
     298The `JobDefinition` that is used for submitting a job to a cluster has the ability to upload files that are needed for the job. This is done by calling the `JobDefinition.addFile()` method with an `UploadSource` parameter. The `UploadSource` is an interface but we have provided several implementations that wrap, for example, a `String`, a BASE `File` item or an `InputStream`.
    299299
    300300Note that calling the `JobDefinition.addFile()` method doesn't start the upload immediately. The upload happens in the `OpenGridSession.qsub()` method. The file is placed in the subfolder of the `<job-folder>` that has been created for the job (the `${WD}` folder).
     
    313313}}}
    314314
    315 === Connecting to non-Open Grid servers ===
    316 
    317 Connections that are made to Open Grid Clusters are regular SSH connections. There is really nothing that is special about the connection itself. This means that it is possible to connect to more or less any server that supports SSH. It doesn't matter if the server is running an Open Grid Cluster or not. Note that servers that are defined in the `opengrid-config.xml` are expected to be Open Grid Cluster servers and the `OpenGridService` implementation will try to call Open Grid commands on them.
    318 
    319 However, it is possible to programmatically create a `ConnectionInfo` instance and use it for creating a `RemoteHost` object. With this you can connect to the server by calling the `RemoteHost.connect()` method which returns a `RemoteSession` object. It is very similar to what can be done with `OpenGridCluster`/`OpenGridSession` objects, except that the special methods for calling Open Grid Cluster commands are not available.
     315=== Connecting to non-cluster servers ===
     316
     317Connections that are made to clusters are regular SSH connections. There is really nothing that is special about the connection itself. This means that it is possible to connect to more or less any server that supports SSH. It doesn't matter if the server is running an Open Grid or Slurm cluster or something else. Note that servers that are defined in the `opengrid-config.xml` are expected to be either Open Grid or Slurm cluster servers and the `OpenGridService` implementation will try to call cluster-specific commands on them.
     318
     319However, it is possible to programmatically create a `ConnectionInfo` instance and use it for creating a `RemoteHost` object. With this you can connect to the server by calling the `RemoteHost.connect()` method which returns a `RemoteSession` object. It is very similar to what can be done with `OpenGridCluster`/`OpenGridSession` objects, except that the special methods for calling cluster commands are not available.
    320320
    321321Tip! It is possible to create a `ConnectionInfo` instance from a BASE `FileServer` item (assuming that the file server contains all required information for connecting via SSH: host, fingerprint, username and password).
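
The tip above lists four pieces of information that must be present before an SSH connection can be made. As a minimal sketch of checking them up front, the class below is a hypothetical stand-in (`SshSettings` and `missingFields()` are illustration names only and are not part of the opengrid or BASE API). It simply reports which of the required values are missing, so that a caller can give a clear error message instead of failing during the connection attempt.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the SSH details of a file server.
// This is NOT the real ConnectionInfo or FileServer class.
class SshSettings
{
  final String host;
  final String fingerprint;
  final String username;
  final String password;

  SshSettings(String host, String fingerprint, String username, String password)
  {
    this.host = host;
    this.fingerprint = fingerprint;
    this.username = username;
    this.password = password;
  }

  // Returns the names of the required values that are null or blank,
  // in the order: host, fingerprint, username, password
  List<String> missingFields()
  {
    List<String> missing = new ArrayList<>();
    if (isBlank(host)) missing.add("host");
    if (isBlank(fingerprint)) missing.add("fingerprint");
    if (isBlank(username)) missing.add("username");
    if (isBlank(password)) missing.add("password");
    return missing;
  }

  private static boolean isBlank(String value)
  {
    return value == null || value.trim().isEmpty();
  }
}
```

If the returned list is empty, all four values are present and it should be reasonable to go on and create the connection.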
    322322
    323 === Tracking non-Open Grid jobs ===
    324 
    325 Sometimes there are other things going on that are not Open Grid jobs that would be interesting to track. One example is the sequencing progress of a sequencer machine. In this case we want to know when the sequencing has been completed and then start analysis jobs (as Open Grid Cluster jobs). A simple bash script has been implemented (http://baseplugins.thep.lu.se/browser/other/pipeline/trunk/nextseq_status.sh) that checks if all result files from the sequencing are present on the file server or not. We want to run this script at regular intervals. When all data is present, we run some checks to validate the sequence data and if all seems to be good, we start the analysis pipeline. There are three steps to consider:
    326 
    327  * The sequencing process should be represented by a BASE job item as a proxy. Progress reporting needs to be set up using the extension mechanism implemented in the BASE core. This needs to be implemented completely by the other extension. It is not possible to re-use the setup the Open Grid package uses. In the code example below the `SequencingSignalHandler` class is assumed to take care of this.
     323=== Tracking non-cluster jobs ===
     324
     325Sometimes there are other things going on that are not Open Grid or Slurm jobs that would be interesting to track. One example is the sequencing progress of a sequencer machine. In this case we want to know when the sequencing has been completed and then start analysis jobs (as Open Grid Cluster jobs). A simple bash script has been implemented (http://baseplugins.thep.lu.se/browser/other/pipeline/trunk/nextseq_status.sh) that checks if all result files from the sequencing are present on the file server or not. We want to run this script at regular intervals. When all data is present, we run some checks to validate the sequence data and if all seems to be good, we start the analysis pipeline. There are three steps to consider:
     326
     327 * The sequencing process should be represented by a BASE job item as a proxy. Progress reporting needs to be set up using the extension mechanism implemented in the BASE core. This needs to be implemented completely by the other extension. It is not possible to re-use the setup the Job Scheduler package uses. In the code example below the `SequencingSignalHandler` class is assumed to take care of this.
    328328
    329329{{{
    330330#!java
    331331String barcode = ... // Something that identifies the current sequencing
    332 String clusterId = ... // The Open Grid Cluster we use to check status
     332String clusterId = ... // The cluster we use to check status
    333333
    334334// Create a new BASE job and set properties so that we can identify
     
    338338job.setPluginVersion("my-sequencing-1.0");
    339339
    340 job.setExternalId(barcode);  // Instead of the Open Grid job ID
     340job.setExternalId(barcode);  // Instead of the OpenGrid/Slurm job ID
    341341// Setup signalling for progress reporting (see BASE documentation)
    342342String signalURI = SequencingSignalHandler.getSignalUri(barcode);
     
    348348}}}
    349349
    350  * Once a request for a status update is received by the `SequencingSignalHandler` it should call `OpenGridService.asyncJobStatusUpdate(JobIdentifier, JobStatusUpdater)`. The Open Grid extension will then call the `JobStatusUpdater.getJobStatus()` during the next asynchronous processing cycle. In the example above, the `JobStatusUpdater` implementation should call the bash script to see how far the sequencing has come and then report that back in a `JobStatus` object. The Open Grid extension will take the responsibility of updating the Job item in BASE. It might be tempting to check the sequencing status directly from the signal handler, but this is not recommended since the signals may arrive quite often. The asynchronous approach is preferable and also gives you automatic updates of the Job item in BASE.
     350 * Once a request for a status update is received by the `SequencingSignalHandler` it should call `OpenGridService.asyncJobStatusUpdate(JobIdentifier, JobStatusUpdater)`. The Job scheduler extension will then call the `JobStatusUpdater.getJobStatus()` during the next asynchronous processing cycle. In the example above, the `JobStatusUpdater` implementation should call the bash script to see how far the sequencing has come and then report that back in a `JobStatus` object. The Job scheduler extension will take the responsibility of updating the Job item in BASE. It might be tempting to check the sequencing status directly from the signal handler, but this is not recommended since the signals may arrive quite often. The asynchronous approach is preferable and also gives you automatic updates of the Job item in BASE.
    351351
    352352{{{
     
    401401}}}
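
To illustrate why the asynchronous approach is preferable when signals arrive often, here is a minimal sketch of the general idea. It is not the extension's actual implementation: `CoalescingStatusQueue`, `requestUpdate()` and `processCycle()` are hypothetical names used only for this example. Requests made from a signal handler are cheap and repeated requests for the same job collapse into one, so the expensive check (for example, running the bash script) is done at most once per processing cycle.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical stand-in, NOT part of the opengrid API. It only shows why
// queueing status requests is cheaper than checking on every signal.
class CoalescingStatusQueue
{
  // Pending requests keyed by job identifier (insertion order is kept)
  private final Map<String, Supplier<String>> pending = new LinkedHashMap<>();

  // Called from the signal handler. Repeated requests for the same job
  // before the next cycle are collapsed into a single entry.
  public synchronized void requestUpdate(String jobId, Supplier<String> statusCheck)
  {
    pending.putIfAbsent(jobId, statusCheck);
  }

  // Called once per processing cycle. Each queued check is executed
  // exactly once and the queue is emptied.
  public Map<String, String> processCycle()
  {
    Map<String, Supplier<String>> batch;
    synchronized (this)
    {
      batch = new LinkedHashMap<>(pending);
      pending.clear();
    }
    // Run the (possibly slow) checks outside the lock
    Map<String, String> results = new LinkedHashMap<>();
    for (Map.Entry<String, Supplier<String>> entry : batch.entrySet())
    {
      results.put(entry.getKey(), entry.getValue().get());
    }
    return results;
  }
}
```

With this pattern, three signals for the same barcode between two cycles result in a single status check instead of three.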
    402402
    403  * When the sequencing has been completed (`status == Job.Status.DONE`) the normal job completion routines in the Open Grid extension notify all registered `JobCompletionHandler` implementations. The other extension simply needs to extend the `JobCompletionHandler` implementation to be able to detect the sequencing job and then do whatever needs to be done with that.
     403 * When the sequencing has been completed (`status == Job.Status.DONE`) the normal job completion routines in the Job scheduler extension notify all registered `JobCompletionHandler` implementations. The other extension simply needs to extend the `JobCompletionHandler` implementation to be able to detect the sequencing job and then do whatever needs to be done with that.
    404404
    405405{{{
     
    430430=== Reacting to configuration changes ===
    431431
    432 The `opengrid-config.xml` is parsed and loaded into memory when the Open Grid Scheduler service extension is started. Changes to the configuration file are not applied until the service is re-started. For some extensions it may be critical to be able to detect when this happens. Luckily, everything that is needed is already built into the BASE core API. Extensions that need to know when the Open Grid Scheduler service is stopped or started simply need to register an event handler with the manager in BASE. The event handler should listen to `SERVICE_STOPPED` or `SERVICE_STARTED` events for the `net.sf.basedb.opengrid.service` extension.
     432The `opengrid-config.xml` is parsed and loaded into memory when the Job scheduler service extension is started. Changes to the configuration file are not applied until the service is re-started. For some extensions it may be critical to be able to detect when this happens. Luckily, everything that is needed is already built into the BASE core API. Extensions that need to know when the Job scheduler service is stopped or started simply need to register an event handler with the manager in BASE. The event handler should listen to `SERVICE_STOPPED` or `SERVICE_STARTED` events for the `net.sf.basedb.opengrid.service` extension.
    433433
    434434{{{
    435435#!java
    436436// We need a filter that listens for SERVICE_STARTED event
    437 // related to the Open Grid Scheduler service
     437// related to the Job scheduler service
    438438EventFilter serviceStarted = new ExtensionEventFilter(
    439439   "net.sf.basedb.opengrid.service", Services.SERVICE_STARTED);
     
    451451}}}
    452452
    453 Now, every time the Open Grid Scheduler service is restarted, BASE calls the `MyEventHandler.handleEvent()` method.
     453Now, every time the Job scheduler service is restarted, BASE calls the `MyEventHandler.handleEvent()` method.
    454454