Project

General

Profile

Bug #14819

The "I hotplug a network device" step is broken in some environments

Added by anonym almost 2 years ago. Updated over 1 year ago.

Status:
Resolved
Priority:
Normal
Assignee:
-
Category:
Test suite
Target version:
Start date:
10/09/2017
Due date:
% Done:

100%

Feature Branch:
Type of work:
Code
Blueprint:
Starter:
Affected tool:

Description

For me and intrigeri (Debian Sid, qemu 2.10) it fails consistently (the guest doesn't detect the device after pluggin), and intrigeri thinks sib (Debian Stretch, qemu 2.8) has been affected, but not consistently.


Related issues

Blocks Tails - Feature #13240: Core work 2017Q4: Test suite maintenance Resolved 06/29/2017

Associated revisions

Revision ec4f8275 (diff)
Added by anonym almost 2 years ago

Test suite: try hot-plugging pcnet instead of a virtio network device.

I hope this fixes #14819, i.e. that the plugged virtio device never is
detected by the guest on Sid hosts, and occassionally on Stretch hosts
too.

Refs: #14819

Revision aac8f180 (diff)
Added by anonym almost 2 years ago

Test suite: hot-plug network device suitable for the host OS.

It seems:

  • 'virtio' is not detected on Sid
  • 'virtio' is works most of the time on on Stretch
  • 'pcnet' works all the time on Sid
  • 'pcnet' is not detected on Stretch

so this workaround is what makes most sense.

Will-fix: #14819

Revision 01f13a80 (diff)
Added by anonym almost 2 years ago

Test suite: wait a bit more for hot-plugged devices to be detected.

I have a hunch that 'virtio' devices might just need more time to be
detected by the kernel for some reason.

Refs: #14819

Revision e1701955 (diff)
Added by anonym almost 2 years ago

Test suite: add missing chomp.

cmd_helper returns STDOUT with a \n at the end. Thanks, intrigeri, for
pointing me in this direction!

Will-fix: #14819

Revision 979da528 (diff)
Added by anonym almost 2 years ago

Test suite: log which network device we try to hotplug.

For debugging purposes.

Refs: #14819

History

#1 Updated by anonym almost 2 years ago

  • Status changed from Confirmed to In Progress
  • Assignee changed from anonym to intrigeri
  • QA Check set to Ready for QA
  • Feature Branch set to test/14819-hotplug-e1000-instead

#2 Updated by intrigeri almost 2 years ago

  • Blocks Feature #13240: Core work 2017Q4: Test suite maintenance added

#3 Updated by intrigeri almost 2 years ago

  • % Done changed from 0 to 60

Code changes LGTM, I'll test and merge if happy :)

#4 Updated by intrigeri almost 2 years ago

  • % Done changed from 60 to 70

Fixes my problem, i.e. "works on my machine"! Will wait for Jenkins to confirm it does not break other stuff.

#5 Updated by intrigeri almost 2 years ago

  • Assignee changed from intrigeri to anonym
  • QA Check changed from Ready for QA to Dev Needed

Sadly, on Jenkins this breaks the very tests it fixes for me: https://jenkins.tails.boum.org/view/Tails_ISO/job/test_Tails_ISO_test-14819-hotplug-e1000-instead/2/, https://jenkins.tails.boum.org/view/Tails_ISO/job/test_Tails_ISO_test-14819-hotplug-e1000-instead/3/. Might it be that it depends on the guest (Tails) kernel version, and this change is needed + working only with Linux 4.13?

#6 Updated by intrigeri almost 2 years ago

Pushed test/14819-hotplug-e1000-instead+Linux-4.13 (i.e. the topic branch with current devel merged in), let's see. If this branch works better, then let's mark this ticket as blocked by #14789 and merge it either in stable (if/once it has Linux 4.13) or devel, depending on the outcome of #14789.

#7 Updated by intrigeri almost 2 years ago

  • % Done changed from 70 to 50

intrigeri wrote:

Pushed test/14819-hotplug-e1000-instead+Linux-4.13 (i.e. the topic branch with current devel merged in), let's see. If this branch works better,

It doesn't: https://jenkins.tails.boum.org/view/Tails_ISO/job/test_Tails_ISO_test-14819-hotplug-e1000-instead-linux-4.13/1/. So I think we really need a fix implemented in a different way, that works on Jenkins too.

#8 Updated by intrigeri almost 2 years ago

  • Target version changed from Tails_3.3 to Tails_3.5

#9 Updated by intrigeri almost 2 years ago

  • Target version deleted (Tails_3.5)

#10 Updated by anonym almost 2 years ago

  • % Done changed from 50 to 60
  • QA Check changed from Dev Needed to Ready for QA

Let's see how pcnet works on Jenkins (it works for me on my Sid system).

#11 Updated by anonym almost 2 years ago

  • Status changed from In Progress to Resolved
  • % Done changed from 60 to 100

#12 Updated by intrigeri almost 2 years ago

  • Status changed from Resolved to In Progress
  • % Done changed from 100 to 60

("I hope this fixes #14819" in the commit message closed this ticket, which was not intended I think)

#13 Updated by intrigeri almost 2 years ago

  • QA Check changed from Ready for QA to Dev Needed

FYI this change broke "I hotplug a network device" on my local Jenkins.

Meta: I'm happy to see you push tested stuff directly to stable and devel as per our ongoing experiment, but in this case you could not possibly test it enough before pushing (as previous attempts have shown, this is platform-dependent: what works on sid tends to break on Jenkins and vice-versa), so I'd rather see you push this kind of stuff in a topic branch first, in order to avoid regressing the test suite on Jenkins for our major branches. Fair enough? :)

#14 Updated by intrigeri almost 2 years ago

I still see this failing on an ISO built from the branch for #14819 at 6c23dc58e241abd46efba7f861baa1b4fdf2e811, that includes aac8f18098c52ceb017490d399fbce2f026c6897 and 01f13a806da5cc0c63e6d675de6659da4292cc30: the hotplugged interface never appears. That's on my local Jenkins, running lsb_release --short --codename on my isotester outputs "stretch" so according to the code comments we should be using virtio, which used to work just fine.

I'm starting to think the bug is not in our test suite at all, but either comes from an upgrade inside Tails itself, or on the test system. For example, it seems that the devel branch on lizard worked fine up to & including https://jenkins.tails.boum.org/view/RM/job/test_Tails_ISO_devel/1111/ but started failing at https://jenkins.tails.boum.org/view/RM/job/test_Tails_ISO_devel/1113/. The two corresponding ISO images had the exact same kernel installed. I see no relevant APT upgrades on the isotesters around this time, so I don't know what's going on. The diff between these two builds includes the switch from virtio to pcnet.

Random hunch: it's not simply because the output of lsb_release ends with a newline and we need cmd_helper('lsb_release --short --codename') == "stretch\n" to correctly detect Stretch systems, right?

#15 Updated by anonym almost 2 years ago

  • Assignee changed from anonym to intrigeri
  • % Done changed from 60 to 70
  • QA Check changed from Dev Needed to Ready for QA

intrigeri wrote:

Random hunch: it's not simply because the output of lsb_release ends with a newline and we need cmd_helper('lsb_release --short --codename') == "stretch\n" to correctly detect Stretch systems, right?

Thanks, a lot! You're right, a .chomp is missing! :S Fixed in e17019550339169ed1c20a0bfcba85db40e3fc35

I actually had that thought when I wrote this code, but must have come to the wrong conclusion. Sorry for the fallout out this!

#16 Updated by anonym almost 2 years ago

  • Assignee changed from intrigeri to anonym
  • QA Check changed from Ready for QA to Dev Needed

anonym wrote:

You're right, a .chomp is missing! :S Fixed in e17019550339169ed1c20a0bfcba85db40e3fc35

Despite this commit, the scenario is still failing! I've added debug logging with 979da5282bab19b32af2a24e6c7942d3a9a70307 to try to rule out more silly mistakes.

#18 Updated by anonym almost 2 years ago

  • Assignee changed from anonym to intrigeri
  • QA Check changed from Dev Needed to Ready for QA

intrigeri wrote:

https://jenkins.tails.boum.org/view/RM/job/test_Tails_ISO_stable/1082/cucumberTestReport/spoofing-mac-addresses/ passed.

Right, the problem was that xml.sub('pcnet', 'virtio') is a noop; instead of sub() I should have used the version with a bang, sub!(), that mutates itself instead of returning a mutated copy.

#19 Updated by intrigeri almost 2 years ago

  • Feature Branch deleted (test/14819-hotplug-e1000-instead)

I don't think you really mean that :)

#20 Updated by intrigeri almost 2 years ago

979da5282bab19b32af2a24e6c7942d3a9a70307 says it adds logging for debugging purposes, but it also includes the code change that actually fixes the bug. JFTR I'd like to gently ask you to not extend the "woohoo I'm free to push test suite work directly to our major branches" to "woohoo I don't have to do atomic commits and document what they're doing". But I'm sure it was a one-off oversight and probably not something you intend to repeat :)

#21 Updated by intrigeri almost 2 years ago

  • Target version set to Tails_3.5

#22 Updated by intrigeri almost 2 years ago

  • Status changed from In Progress to Fix committed
  • Assignee deleted (intrigeri)
  • % Done changed from 70 to 100
  • QA Check changed from Ready for QA to Pass

This being said, woohoo, it works. I did not test on my sid system but I'm sure you did before pushing.

#23 Updated by intrigeri over 1 year ago

  • Target version changed from Tails_3.5 to Tails_3.4

#24 Updated by anonym over 1 year ago

  • Status changed from Fix committed to Resolved

Also available in: Atom PDF