Fixing Jenkins Job Waiting In Queue

I was investigating a strange issue in Jenkins where jobs stayed in the queue for a long time even though idle executors were available. The reason shown for waiting was "Finished waiting". These jobs were triggered by a service account through the Jenkins API. When the same job was triggered by a normal user, it ran immediately.
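To spot this symptom without opening the UI, the queue can also be read through the Jenkins remote API at <jenkins_url>/queue/api/json, where each item carries a why field with the waiting reason. The following is a minimal Python sketch that only parses such a response. The function name waiting_items and the sample payload are made up for illustration and are not taken from a real server.

```python
import json
import time


def waiting_items(queue_json, now_ms=None):
    """Return (task name, waiting reason, seconds in queue) for each queued item."""
    now_ms = int(time.time() * 1000) if now_ms is None else now_ms
    rows = []
    for item in json.loads(queue_json).get("items", []):
        # inQueueSince is a timestamp in milliseconds since the epoch.
        waited_s = (now_ms - item["inQueueSince"]) / 1000
        rows.append((item["task"]["name"], item.get("why"), waited_s))
    return rows


# Hypothetical payload, shaped like a response from <jenkins_url>/queue/api/json.
sample = json.dumps({
    "items": [
        {
            "id": 1,
            "inQueueSince": 1_000,
            "why": "Finished waiting",
            "task": {"name": "example-job"},
        },
    ]
})

for name, why, waited in waiting_items(sample, now_ms=61_000):
    print(f"{name}: {why} ({waited:.0f}s in queue)")
```

An item that keeps reporting "Finished waiting" for many seconds while executors are idle matches the behavior described above.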

Finding the Cause

To find the root cause, I cloned the Jenkins source code from GitHub and searched for the message "Finished waiting". The message is defined in a resource bundle under the key Queue.FinishedWaiting, and that key is referenced in the hudson.model.Queue class. It simply means that the job was still blocked in the queue.

The reference sits inside the WaitingItem class, which represents an item waiting in the queue. The class has enter() and leave() methods. It seemed that the waiting item was added to the queue but was removed only after a long time.

Searching for usages of enter() showed that the item is added in the scheduleInternal() method. Searching for usages of leave() showed that the item is removed in the maintain() method.

The maintain() method maintains the queue and moves projects between different states. Jenkins invokes it internally whenever there is a change that can affect scheduling. After adding an item to the queue, scheduleInternal() calls scheduleMaintenance(), which submits a Runnable task that calls maintain().

So a likely cause of a job waiting in the queue is that the submitted task did not see the newly added job. The job then had to wait until the next time maintain() was invoked.
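The suspected sequence can be illustrated with a toy model. This is a Python sketch of the hypothesis only, not Jenkins code: the ToyQueue class and the visible parameter are invented here to simulate a maintenance pass that misses a newly added item.

```python
class ToyQueue:
    """Toy stand-in for the waiting list; it does not mirror Jenkins internals."""

    def __init__(self):
        self.waiting = []  # items that entered the queue
        self.started = []  # items handed over for execution

    def enter(self, item):
        self.waiting.append(item)

    def maintain(self, visible=None):
        """Move every waiting item that this pass can see to 'started'.

        'visible' simulates the hypothesis: a pass that ran without
        seeing the newly added item. By default a pass sees everything.
        """
        seen = list(self.waiting if visible is None else visible)
        for item in seen:
            if item in self.waiting:
                self.waiting.remove(item)
                self.started.append(item)


queue = ToyQueue()
queue.enter("job-1")

queue.maintain(visible=[])  # the pass that missed the new item
print("after first pass:", queue.waiting)

queue.maintain()            # a later pass picks the item up
print("after second pass:", queue.started)
```

In this model the item is never lost. It just sits in the waiting list until some later maintenance pass runs, which matches the long delays observed.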

Solution

I don't know exactly why this happens. My solution is to call the scheduleMaintenance() method of the Queue class manually to force the check. The method is public, so I can simply call it on a Queue instance.

Jenkins has a Script Console for running Groovy scripts. In a Groovy script, I can get the Jenkins queue instance and call its scheduleMaintenance() method:

Jenkins.instance.queue.scheduleMaintenance()

To automate this, I used an HTTP client to send a POST request to <jenkins_url>/computer/(built-in)/script. The request was sent with the application/x-www-form-urlencoded content type and a single parameter, script, with the value Jenkins.instance.queue.scheduleMaintenance().
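As a rough sketch, such a request could be built in Python with only the standard library. The URL path, content type, and script parameter come from the description above. Everything else is an assumption: the function names, the example host and credentials in the usage note, and the use of HTTP Basic authentication with an API token for an account that is allowed to use the Script Console.

```python
import base64
import urllib.parse
import urllib.request

SCRIPT = "Jenkins.instance.queue.scheduleMaintenance()"


def build_request(jenkins_url, user, api_token):
    """Build (but do not send) the POST request for the Script Console."""
    url = jenkins_url.rstrip("/") + "/computer/(built-in)/script"
    # Form-encode the single 'script' parameter.
    body = urllib.parse.urlencode({"script": SCRIPT}).encode("ascii")
    # Assumption: Basic auth with a user name and API token.
    credentials = base64.b64encode(f"{user}:{api_token}".encode("utf-8"))
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/x-www-form-urlencoded",
            "Authorization": "Basic " + credentials.decode("ascii"),
        },
    )


def nudge_queue(jenkins_url, user, api_token, timeout=30):
    """Send the request and return the HTTP status code."""
    request = build_request(jenkins_url, user, api_token)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status
```

A call might look like nudge_queue("https://jenkins.example.com", "svc-user", "api-token"), where all three values are placeholders. Building the request separately from sending it makes the encoding easy to check without a running Jenkins server.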

After triggering a new job in Jenkins, I use the HTTP client to call scheduleMaintenance() through the Script Console. Jobs no longer get stuck in the waiting queue.

© 2023 VividCode