Fixing Jenkins Job Waiting In Queue
I was trying to solve a weird issue in Jenkins where jobs were stuck in the queue for a long time, even though idle executors were available. The reason shown for waiting was Finished waiting. These jobs were triggered by a service account through the Jenkins API. If the same job was triggered by a normal user, it was executed immediately.
Finding the Cause
To find the root cause, I cloned the Jenkins source code from GitHub. I started by searching the source code for the message Finished waiting. This message is defined in the resource bundle under the key Queue.FinishedWaiting. This bundle key is referenced in the hudson.model.Queue class. It simply means the job was blocked.
The surrounding WaitingItem class represents an item waiting in the queue. It has enter() and leave() methods. It seems that the waiting item was added to the queue, but was only removed from the queue after a long time.
Searching for usages of the enter() method shows that the item is added in the scheduleInternal() method. Searching for usages of the leave() method shows that the item is removed in the maintain() method.
The maintain() method maintains the queue and moves projects between different states. Jenkins invokes this method by itself whenever there is a change that can affect scheduling. In the scheduleInternal() method, after adding an item to the queue, Jenkins calls the scheduleMaintenance() method to submit a Runnable task that calls the maintain() method.
So the likely cause of a job waiting in the queue is that the submitted task didn't see the waiting job. The job then had to wait until the next time the maintain() method was invoked.
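To make the suspected failure mode concrete, here is a toy model in Python. It is not Jenkins code: the ToyQueue class and its visible parameter are invented for illustration, and they only mimic the relationship between enter(), leave(), and maintain() described above. The model shows a job staying in the waiting list when one maintenance pass misses it, and leaving the list on the next pass.

```python
class ToyQueue:
    """Toy stand-in for the waiting -> buildable transition; not Jenkins code."""

    def __init__(self):
        self.waiting = []    # items that have "entered" the queue
        self.buildable = []  # items a maintenance pass has moved on

    def schedule(self, job):
        # Mirrors the enter() step: the job joins the waiting list.
        self.waiting.append(job)

    def maintain(self, visible=None):
        # Mirrors the leave() step: every waiting job this pass can see
        # is moved on. `visible` is a hypothetical knob that simulates a
        # pass that missed some jobs; None means the pass sees everything.
        seen = [j for j in self.waiting if visible is None or j in visible]
        for job in seen:
            self.waiting.remove(job)
            self.buildable.append(job)


queue = ToyQueue()
queue.schedule("job-a")

queue.maintain(visible=[])   # the pass triggered by scheduling misses the job
stuck_after_first_pass = list(queue.waiting)

queue.maintain()             # a later (or manually forced) pass picks it up
```

After the first pass the job is still waiting; only the second pass moves it on, which matches the observed behavior of a job sitting in the queue while executors are idle.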
Solution
I don't know exactly why this happens. My solution is to manually call the scheduleMaintenance() method of the Queue class to force the check. The scheduleMaintenance() method is public, so I can simply call it on a Queue instance.
Jenkins has a Script Console to run Groovy scripts. In a Groovy script, I can get the instance of the Jenkins queue and call the scheduleMaintenance() method.
Jenkins.instance.queue.scheduleMaintenance()
To automate this, I used an HTTP client to send a POST request to <jenkins_url>/computer/(built-in)/script. The POST request was sent using the application/x-www-form-urlencoded content type. The request has one parameter, script, with the value Jenkins.instance.queue.scheduleMaintenance().
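The request described above could be sketched in Python with only the standard library. This is a minimal sketch, not the client I actually used: the function names are made up, and it assumes the service account authenticates with HTTP Basic auth using a user name and an API token. The account also needs permission to run scripts in the Script Console.

```python
import base64
import urllib.parse
import urllib.request

SCRIPT = "Jenkins.instance.queue.scheduleMaintenance()"


def build_maintenance_request(jenkins_url, user, api_token):
    """Build (but do not send) the POST request for the Script Console."""
    url = jenkins_url.rstrip("/") + "/computer/(built-in)/script"
    # urlencode produces an application/x-www-form-urlencoded body.
    body = urllib.parse.urlencode({"script": SCRIPT}).encode("ascii")
    request = urllib.request.Request(url, data=body, method="POST")
    request.add_header("Content-Type", "application/x-www-form-urlencoded")
    # Assumption: HTTP Basic auth with a user name and API token.
    credentials = base64.b64encode(f"{user}:{api_token}".encode("utf-8"))
    request.add_header("Authorization", "Basic " + credentials.decode("ascii"))
    return request


def force_queue_maintenance(jenkins_url, user, api_token):
    """Send the request and return the HTTP status code."""
    request = build_maintenance_request(jenkins_url, user, api_token)
    with urllib.request.urlopen(request, timeout=30) as response:
        return response.status
```

Building the request is kept separate from sending it so the URL, body, and headers can be inspected without a running Jenkins instance.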
After triggering a new job in Jenkins, I use the HTTP client to call the scheduleMaintenance() method through the Script Console. Jobs no longer get stuck in the waiting queue.
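One way to confirm that nothing is left behind is to look at the queue through the Jenkins remote API. The sketch below assumes the JSON returned by <jenkins_url>/queue/api/json contains an items list whose entries carry id, why, and task.name fields. The function name and the sample payload are hypothetical and only illustrate the filtering.

```python
import json


def finished_waiting_items(queue_json):
    """Return (id, task name) for queued items whose reason is 'Finished waiting'."""
    data = json.loads(queue_json)
    return [
        (item.get("id"), item.get("task", {}).get("name"))
        for item in data.get("items", [])
        if "Finished waiting" in (item.get("why") or "")
    ]


# Hypothetical payload, trimmed to the fields used above.
sample = json.dumps({
    "items": [
        {"id": 101, "why": "Finished waiting", "task": {"name": "deploy"}},
        {"id": 102, "why": "Waiting for next available executor", "task": {"name": "build"}},
    ]
})
stuck = finished_waiting_items(sample)
```

An empty result after the maintenance call suggests that no item is sitting in the queue with the Finished waiting reason.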