Opened 29 hours ago

Closed 7 hours ago

#1655 closed defect (fixed)

Sequencing job is registered as failed

Reported by: Nicklas Nordborg
Owned by:
Priority: critical
Milestone: Job scheduler extension v1.16
Component: net.sf.basedb.opengrid
Keywords:
Cc:

Description

The error message is:

sacct: fatal: Bad job/step specified: HF2FHBGYW

A sequencing job is registered as a non-grid job to begin with, so squeue or sacct should not be used to get information about it. There are custom scripts in Reggie (e.g. nextseq_status.sh) that should be used instead.

Exactly how this is happening needs to be investigated. It is most likely related to #1646.

The effect of this bug is that the auto-confirmation in Reggie is not working and the sequencing needs to be manually registered as ended.

Change History (3)

comment:1 by Nicklas Nordborg, 27 hours ago

It seems like this issue is related to changing the signal handler for a job when the status changes. This change is needed by #1646 since different signals are supported when a job is waiting, paused or executing.

But the signal handler is also changed for non-grid jobs, which leads the system to think that it is a grid job on the next update. That update then fails because grid jobs have a numeric id.
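
The failure mode described above can be sketched roughly as follows. This is only an illustration in Python and not the code from the opengrid extension: the names Job, handler_for_status, update_status, GRID_HANDLER and CUSTOM_HANDLER are all hypothetical, and the handler is reduced to a plain string. The sketch assumes that the handler switch should happen only for jobs that were registered as grid jobs.

```python
from dataclasses import dataclass

# All names in this sketch are hypothetical; none are taken from the extension.
GRID_HANDLER = "grid"      # status would be read with squeue/sacct
CUSTOM_HANDLER = "custom"  # status would be read with a custom script


@dataclass
class Job:
    job_id: str            # grid jobs have a numeric id, e.g. "123456"
    is_grid_job: bool
    status: str = "waiting"
    handler: str = CUSTOM_HANDLER


def handler_for_status(status: str) -> str:
    # Placeholder: a real handler would support the signals that are
    # valid while a job is waiting, paused or executing.
    return f"{GRID_HANDLER}:{status}"


def update_status(job: Job, new_status: str) -> Job:
    job.status = new_status
    # Switch the handler only for grid jobs. Without this guard a non-grid
    # job would look like a grid job on the next update.
    if job.is_grid_job:
        job.handler = handler_for_status(new_status)
    return job


sequencing = update_status(Job("HF2FHBGYW", is_grid_job=False), "executing")
grid = update_status(Job("123456", is_grid_job=True), "executing")
```

With the guard in place the sequencing job keeps its custom handler after the status change, while the grid job gets a handler that matches its new status.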

comment:2 by Nicklas Nordborg, 27 hours ago

In 8056:

References #1655: Sequencing job is registered as failed

I think this should fix the issue, but it can't be tested until the next sequencing run.

comment:3 by Nicklas Nordborg, 7 hours ago

Resolution: fixed
Status: new → closed