Changes between Version 3 and Version 4 of net.sf.basedb.opengrid/using


Timestamp:
Jan 13, 2017, 3:06:17 PM (7 years ago)
Author:
Nicklas Nordborg
Comment:

Started to write the section about "Getting notified when a job completes"

  • net.sf.basedb.opengrid/using

    v3 v4  
    115115Job job = Job.getNew(dc, null, null, null); // All null to create an 'OTHER' type job
    116116job.setName("My analysis");
    117 job.setPluginVersion("analysis-1.0");
     117job.setPluginVersion("my-analysis-1.0");
    118118// job.setItemSubtype(...); // This can also be useful
    119119dc.saveItem(job); // Important!!!
     
    149149== Getting notified when a job completes ==
    150150
     151One important feature is that other extensions can get notified when a job running on the cluster has ended. This is implemented in an asynchronous manner, so it should not matter if the BASE server is updated, restarted or otherwise modified while a job is running. Two parts work together in the background to make this possible.
     152
     153 * The BASE system for requesting job progress information about external jobs has been set up to send requests to the `OpenGridService` whenever it wants new information about a job. This is the reason why it is important to create a BASE job item as a proxy for the Open Grid Cluster jobs. Without it, no progress information is requested and we never get to know when the job has ended.
     154 * The `OpenGridService` polls each registered cluster at regular intervals, typically once every minute, but it may be more or less often depending on whether any known jobs are executing. The `OpenGridSession.qstat()` and `OpenGridSession.qacct()` methods are used for this and will detect waiting, running and completed jobs. For running jobs, the service will download the `progress` file (see `ScriptBuilder.progress()` above) and update the information in the BASE database.
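
The polling part can be pictured with a minimal, self-contained sketch. It does not use the real `OpenGridService` or `OpenGridSession` classes. `PollSketch`, its `State` enum and the `poll()` method are hypothetical names, and the rule that a completed job is reported only once is an assumption made for the illustration.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical model of one polling pass; not part of the Open Grid extension.
public class PollSketch
{
   // The three job states the text says a polling pass can detect
   public enum State { WAITING, RUNNING, COMPLETED }

   // Ids of jobs that have already been reported as completed (assumption:
   // the completion sequence should start only once per job)
   private final Set<String> reported = new HashSet<>();

   // Takes a snapshot of job id -> state, as a cluster query might return it,
   // and returns the ids that are newly detected as completed.
   public List<String> poll(Map<String, State> clusterView)
   {
      List<String> newlyCompleted = new ArrayList<>();
      for (Map.Entry<String, State> entry : clusterView.entrySet())
      {
         // Set.add() returns false if the id was already reported
         if (entry.getValue() == State.COMPLETED && reported.add(entry.getKey()))
         {
            newlyCompleted.add(entry.getKey());
         }
      }
      return newlyCompleted;
   }
}
```

Calling `poll()` twice with the same snapshot returns the completed job ids the first time and an empty list the second time.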
     155
     156Once a job has been detected as completed, the service invokes the job completion sequence. This is implemented as a custom extension point (`net.sf.basedb.opengrid.job-complete`) that receives messages about completed jobs. Extensions that want to be notified should extend the extension point. Note that all registered extensions are notified about all jobs, regardless of which extension originally submitted the job to the cluster. Notifications are sent for both successful and failed jobs. Thus, each extension is responsible for filtering out and ignoring notifications about jobs that are of no interest to it. This is why it is important to set the name, plug-in version, etc. on the job when submitting it. We recommend implementing this filtering step in the `ActionFactory` that is registered for the `net.sf.basedb.opengrid.job-complete` extension point. Note that a single notification may cover more than one job. Thus, the `prepareContext()` method is called once, without any information about the jobs, while the `getActions()` method is called once for every job.
     157
     158{{{
     159public class MyAnalysisJobCompletionHandlerFactory
     160   implements ActionFactory<JobCompletionHandler>
     161{
     162       
     163   public MyAnalysisJobCompletionHandlerFactory()
     164   {}
     165
     166   @Override
     167   public boolean prepareContext(InvokationContext context)
     168   {
     169      // Always true since we do not know anything about the job(s) that have been completed
     170      return true;
     171   }
     172
     173   @Override
     174   public JobCompletionHandler[] getActions(InvokationContext context)
     175   {
     176      ClientContext cc = context.getClientContext();
     177      Job job  = (Job)cc.getCurrentItem();
     178               
     179      String pluginVersion = job.getPluginVersion();
     180      if (pluginVersion == null || !pluginVersion.startsWith("my-analysis"))
     181      {
     182         // This is not our job, ignore it
     183         return null;
     184      }
     185               
     186      // Note that job.getStatus() has not been updated yet so we
     187      // need to get the status information extracted from the cluster
     188      JobStatus status = (JobStatus)cc.getAttribute("job-status");
     189      if (status.getStatus() != Job.Status.DONE)
     190      {
     191         // We don't do anything unless the job was successful.
     192         return null;
     193      }
     194
     195      JobCompletionHandler action = null;
     196      String jobName = job.getName();
     197      if (jobName.startsWith("My analysis"))
     198      {
     199         action = new MyAnalysisCompletionHandler();
     200      }
     201      else
     202      {
     203         // In the future we may have more than one type of job...
     204      }
     205
     206      return action == null ? null : new JobCompletionHandler[] { new JobCompletionWrapper(action) };
     207   }
     208}
     209}}}
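
The calling sequence described above (one `prepareContext()` call per notification, then one `getActions()` call per job, with every factory seeing every job) can be modelled without any BASE classes. The sketch below is only an illustration under that reading of the text. `DispatchSketch`, `JobInfo`, `Factory` and `dispatch()` are hypothetical stand-ins, and the string returned by `getAction()` merely represents a handler.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of the notification sequence; not the real extension API.
public class DispatchSketch
{
   // Stand-in for a completed job (not net.sf.basedb.core.Job)
   public static class JobInfo
   {
      public final String name;
      public final String pluginVersion;
      public JobInfo(String name, String pluginVersion)
      {
         this.name = name;
         this.pluginVersion = pluginVersion;
      }
   }

   // Stand-in for ActionFactory<JobCompletionHandler>
   public interface Factory
   {
      boolean prepareContext();       // called once, knows nothing about the jobs
      String getAction(JobInfo job);  // called per job, null means "not our job"
   }

   public static class MyAnalysisFactory implements Factory
   {
      public int prepareCalls = 0;
      public int getActionCalls = 0;

      @Override
      public boolean prepareContext()
      {
         prepareCalls++;
         return true;
      }

      @Override
      public String getAction(JobInfo job)
      {
         getActionCalls++;
         String v = job.pluginVersion;
         // Same filter as in the example above: ignore jobs that are not ours
         if (v == null || !v.startsWith("my-analysis")) return null;
         return "handle:" + job.name;
      }
   }

   // Every factory is offered every job; each factory filters for itself
   public static List<String> dispatch(List<Factory> factories, List<JobInfo> jobs)
   {
      List<String> actions = new ArrayList<>();
      for (Factory factory : factories)
      {
         if (!factory.prepareContext()) continue;
         for (JobInfo job : jobs)
         {
            String action = factory.getAction(job);
            if (action != null) actions.add(action);
         }
      }
      return actions;
   }
}
```

With three completed jobs of which only one has a plug-in version starting with `my-analysis`, the factory is prepared once, asked three times, and returns a single action.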
     210
    151211== Aborting jobs ==
    152212