Opened 4 years ago

Closed 3 years ago

#1278 closed enhancement (fixed)

Improve timeout handling when executing commands on remote server

Reported by: Nicklas Nordborg
Owned by:
Priority: major
Milestone: Job scheduler extension v1.4
Component: net.sf.basedb.opengrid
Keywords:
Cc:

Description

When executing a command on a remote server there is a timeout that causes the command to be aborted and an exception to be thrown if it is exceeded.

While this works well in most cases, there are some problems. See for example #1269, which is about executing a find command to find existing files. When the number of files is low the command finishes quickly, but as the server fills up it takes longer and longer until the timeout is reached.

The intention of the timeout is to avoid a command hanging without producing results, but in this case the find command is returning data as quickly as it can. I think it would be possible to detect that, and as long as data is coming in it would be nice if the timeout could be extended automatically.

Change History (3)

comment:1 by Nicklas Nordborg, 4 years ago

In 6071:

References #1278: Improve timeout handling when executing commands on remote server

Implemented functionality for extending the timeout up to 10 times if there is data coming in from the executing command.
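For readers who want a picture of the idea, the following is only a rough sketch of how such an extension rule could work. It is not the code from the changeset. The class name `ExtendableTimeout` and all of its methods are hypothetical and are not part of net.sf.basedb.opengrid. The sketch assumes that the code reading the command output reports each chunk it receives, and that some watchdog asks whether the command should be aborted. The current time is passed in as a parameter so the logic can be exercised without waiting.

```java
/**
 * Hypothetical sketch: a timeout that is extended a limited number of
 * times as long as new data has arrived since the previous check.
 */
final class ExtendableTimeout
{
    private final long timeoutMillis;
    private final int maxExtensions;
    private long deadline;
    private int extensionsUsed = 0;
    private long bytesReceived = 0;
    private long bytesAtLastExtension = 0;

    ExtendableTimeout(long timeoutMillis, int maxExtensions, long startMillis)
    {
        this.timeoutMillis = timeoutMillis;
        this.maxExtensions = maxExtensions;
        this.deadline = startMillis + timeoutMillis;
    }

    /** Called by the reader whenever output arrives from the command. */
    synchronized void dataReceived(int numBytes)
    {
        bytesReceived += numBytes;
    }

    /**
     * Returns true if the command should be aborted at the given time.
     * If the deadline has passed but data has arrived since the last
     * extension, and extensions remain, the deadline is moved forward.
     */
    synchronized boolean isExpired(long nowMillis)
    {
        if (nowMillis < deadline) return false;
        boolean progress = bytesReceived > bytesAtLastExtension;
        if (progress && extensionsUsed < maxExtensions)
        {
            extensionsUsed++;
            bytesAtLastExtension = bytesReceived;
            deadline = nowMillis + timeoutMillis;
            return false;
        }
        return true;
    }

    synchronized int getExtensionsUsed()
    {
        return extensionsUsed;
    }
}
```

With `maxExtensions` set to 10, a command that keeps producing output would be allowed to run for at most roughly 11 times the base timeout before it is aborted, while a command that produces nothing is still aborted after the first timeout.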

comment:2 by Nicklas Nordborg, 4 years ago

In 6072:

References #1278: Improve timeout handling when executing commands on remote server

Added CmdResult.getHardTimeout() method to make it possible to programmatically change the hard timeout in case that should be needed.

comment:3 by Nicklas Nordborg, 3 years ago

Resolution: fixed
Status: new → closed