Bug #11295

Test jobs sometimes get scheduled on a busy isotester while there are available ones

Added by bertagaz about 3 years ago. Updated 3 months ago.

Status: Confirmed
Priority: Normal
Assignee:
Category: Continuous Integration
Target version:
Start date: 03/31/2016
Due date:
% Done: 0%
Feature Branch:
Type of work: Research
Blueprint:
Starter: No
Affected tool:

Description

While investigating #10601, we discovered that sometimes after a reboot_job completed, rather than starting the test job that triggered it for this isotester, Jenkins assigns this same isotester to another test job, resulting in the first test job waiting for hours for the other one to be over. See #10601#note-5 for details.


Related issues

Related to Tails - Bug #10215: Suboptimal advance booking of Jenkins slaves for testing ISOs (Resolved, 09/17/2015)
Related to Tails - Bug #10601: isotesterN:s sometimes disappear (In Progress, 11/23/2015)
Blocked by Tails - Bug #10068: Upgrade to Jenkins 2.x, using upstream packages (In Progress, 01/08/2018)

History

#1 Updated by intrigeri about 3 years ago

  • Description updated (diff)

I suggest first setting up a very simple test case to confirm how job priority actually works, and whether our current configuration is based on a correct understanding of the Priority Sorter plugin (#10601#note-5 has more precise pointers about where this doubt of mine comes from).

Rationale: even if the bug isn't obvious in our current setup for some reason, I'd rather not keep configuration that was designed on erroneous assumptions. If that's the case, it will be confusing the next time I have to debug weird race conditions.
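To illustrate why the direction of the priority ordering matters, here is a toy model in Python. It is only a sketch: it does not reproduce the Priority Sorter plugin's real behaviour, the job names and priority values are made up, and `pick_next` is a hypothetical helper. It shows that the same configuration hands a freed executor to a different job depending on whether the lower or the higher number wins, which is the assumption the simple test case would need to settle.

```python
from typing import List, Tuple


def pick_next(queue: List[Tuple[str, int]], lower_wins: bool) -> str:
    """Return the queued job that gets the freed executor in this toy model.

    queue holds (job name, priority) pairs in arrival order.
    lower_wins selects which interpretation of the priority number applies.
    """
    keyed = []
    for position, (name, priority) in enumerate(queue):
        # Negate the priority when the higher number is supposed to win,
        # so that min() always picks the winner. Ties go to the older job.
        sort_key = priority if lower_wins else -priority
        keyed.append((sort_key, position, name))
    return min(keyed)[2]


# Hypothetical values: the test job that triggered the reboot was given
# priority 3, another test job that arrived earlier has priority 5.
queue = [("other_test_job", 5), ("test_job_waiting_for_reboot", 3)]

print(pick_next(queue, lower_wins=True))
print(pick_next(queue, lower_wins=False))
```

If the configuration was written under one interpretation while the scheduler applies the other, the job that triggered the reboot loses its isotester to the other job, which matches the symptom in the description.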

#2 Updated by bertagaz about 3 years ago

  • Target version changed from Tails_2.4 to Tails_2.5

#3 Updated by bertagaz about 3 years ago

  • Target version changed from Tails_2.5 to Tails_2.6

Probably won't have time to work on it before that.

#4 Updated by intrigeri almost 3 years ago

  • Subject changed from Test jobs sometimes get their isotester stolen by another one. to Test jobs sometimes get their isotester stolen by another one

I've just seen something similar happen again: https://jenkins.tails.boum.org/view/Tails_ISO/job/test_Tails_ISO_test-11588-usb-on-jenkins-10733/15/ is "(pending—Waiting for next available executor on isotester2) UPSTREAMJOB_BUILD_NUMBER=15" while https://jenkins.tails.boum.org/view/Tails_ISO/job/test_Tails_ISO_test-11588-usb-on-jenkins-10733/14/ is running on isotester2. Five other isotesters are available, so it's a shame that job 15 was scheduled on isotester2 as well and now has to wait for 3 hours before it's run.

Job 14 was run on Jul 31, 2016 9:18:49 PM by https://jenkins.tails.boum.org/job/wrap_test_Tails_ISO_test-11588-usb-on-jenkins-10733/14/, which also ran https://jenkins.tails.boum.org/job/reboot_job/8542/ with parameter RESTART_NODE=isotester2. The wrap job had NODE_NAME=isotester2.

Job 15 was run on Jul 31, 2016 9:19:19 PM by https://jenkins.tails.boum.org/job/wrap_test_Tails_ISO_test-11588-usb-on-jenkins-10733/15/, which also ran https://jenkins.tails.boum.org/job/reboot_job/8543/ with parameter RESTART_NODE=isotester2. The wrap job had NODE_NAME=isotester2.

As said in the ticket description, I already investigated such a problem 8 months ago (#10601#note-5), so the next debugging steps should be easy, provided they are done before the corresponding system logs and Jenkins artifacts expire.

I believe this clearly answers the "We first need to see if this still happens or not" part of this ticket: something is wrong with our job priority setup.
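The situation described above (a job pinned to a busy isotester while other isotesters sit idle) can be spotted mechanically from a snapshot of the queue. The following Python sketch assumes such a snapshot has already been collected by some other means. The `QueuedJob` type, the `misplaced_jobs` helper and all node names other than isotester2 are hypothetical, and nothing here talks to Jenkins.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class QueuedJob:
    name: str          # e.g. the test job and its build number
    wanted_node: str   # the node the job is waiting for


def misplaced_jobs(queue: List[QueuedJob], node_busy: Dict[str, bool]) -> List[str]:
    """Report queued jobs that wait for a busy node while other nodes are idle."""
    idle = sorted(node for node, busy in node_busy.items() if not busy)
    findings = []
    for job in queue:
        # Only flag the job if its node is known to be busy and at least
        # one other node could have taken it.
        if node_busy.get(job.wanted_node, False) and idle:
            findings.append(
                f"{job.name} waits for busy {job.wanted_node}; idle: {', '.join(idle)}"
            )
    return findings


# Snapshot modelled on the case above: build 15 waits for isotester2, which
# is running build 14, while five other isotesters (names assumed) are free.
node_busy = {f"isotester{i}": False for i in range(1, 7)}
node_busy["isotester2"] = True
queue = [QueuedJob("test_Tails_ISO_test-11588-usb-on-jenkins-10733 #15", "isotester2")]

for line in misplaced_jobs(queue, node_busy):
    print(line)
```

Running such a check periodically while the CI is loaded would give a timestamped record of each occurrence, which helps when the system logs and Jenkins artifacts are about to expire.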

#5 Updated by intrigeri almost 3 years ago

  • Subject changed from Test jobs sometimes get their isotester stolen by another one to Test jobs sometimes get scheduled on a busy isotester while there are available ones

Same thing as we speak, between https://jenkins.tails.boum.org/view/Tails_ISO/job/test_Tails_ISO_feature-from-intrigeri-for-2.6/7/ and job 9 on the same project: here again, 2 isotesters are free but job 9 is waiting for isotester1 to be available, while job 7 is running there.

#6 Updated by intrigeri almost 3 years ago

  • Related to Bug #10215: Suboptimal advance booking of Jenkins slaves for testing ISOs added

#7 Updated by anonym over 2 years ago

  • Target version changed from Tails_2.6 to Tails_2.7

#8 Updated by bertagaz over 2 years ago

  • Target version changed from Tails_2.7 to Tails_2.9.1

#9 Updated by anonym over 2 years ago

  • Target version changed from Tails_2.9.1 to Tails 2.10

#10 Updated by intrigeri over 2 years ago

  • Target version changed from Tails 2.10 to Tails_2.11

#11 Updated by bertagaz over 2 years ago

  • Target version changed from Tails_2.11 to Tails_2.12

#12 Updated by bertagaz over 2 years ago

  • Target version changed from Tails_2.12 to Tails_3.0

#13 Updated by bertagaz about 2 years ago

  • Target version changed from Tails_3.0 to Tails_3.1

#14 Updated by bertagaz about 2 years ago

  • Target version changed from Tails_3.1 to Tails_3.2

#15 Updated by bertagaz almost 2 years ago

  • Target version changed from Tails_3.2 to Tails_3.3

#16 Updated by bertagaz over 1 year ago

  • Target version changed from Tails_3.3 to Tails_3.5

Realistically reschedule for 3.4.

#17 Updated by bertagaz over 1 year ago

  • Target version changed from Tails_3.5 to Tails_3.6

#18 Updated by bertagaz over 1 year ago

  • Target version changed from Tails_3.6 to Tails_3.7

#19 Updated by intrigeri about 1 year ago

  • Description updated (diff)

#20 Updated by intrigeri about 1 year ago

  • Related to Bug #10601: isotesterN:s sometimes disappear added

#22 Updated by bertagaz about 1 year ago

  • Target version changed from Tails_3.7 to Tails_3.8

#23 Updated by intrigeri 12 months ago

  • Target version changed from Tails_3.8 to Tails_3.9

#24 Updated by intrigeri 10 months ago

  • Target version changed from Tails_3.9 to Tails_3.10.1

#25 Updated by intrigeri 8 months ago

  • Target version changed from Tails_3.10.1 to Tails_3.11

#26 Updated by CyrilBrulebois 6 months ago

  • Target version changed from Tails_3.11 to Tails_3.12

#27 Updated by anonym 5 months ago

  • Target version changed from Tails_3.12 to Tails_3.13

#28 Updated by u 3 months ago

  • Blocked by Bug #10068: Upgrade to Jenkins 2.x, using upstream packages added

#29 Updated by u 3 months ago

If I understand correctly, then upgrading the Priority Sorter plugin will magically fix that. bertagaz will probably have to do this upgrade for #10068 → so "blocked by".

#30 Updated by u 3 months ago

  • Assignee changed from bertagaz to intrigeri
  • Target version changed from Tails_3.13 to Tails_3.16

Once bertagaz has done the Jenkins upgrade (#10068), intrigeri will check, shortly before the 3.16 release (July-Aug 2019), whether this issue was indeed magically fixed by the upgrade.

#31 Updated by intrigeri 3 months ago

(This problem generally arises when our CI is overloaded, which often happens in the last few days before a release. So once #10068 is done by the end of June, I should notice the problem before either the 3.15 or the 3.16 release. If I don't, I'll be happy to call this fixed.)
