Project

General

Profile

Bug #11521

Bug #10288: Fix newly identified issues to make our test suite more robust and faster

The check_tor_leaks hook is fragile

Added by intrigeri over 3 years ago. Updated 8 months ago.

Status:
Resolved
Priority:
Elevated
Assignee:
-
Category:
Test suite
Target version:
Start date:
06/10/2016
Due date:
% Done:

100%

Feature Branch:
test/11521-always-power-off-vm
Type of work:
Research
Blueprint:
Starter:
Affected tool:

Description

https://jenkins.tails.boum.org/job/test_Tails_ISO_feature-5650-rngd/8/ failed (pcap attached) with:

calling as root: systemctl is-system-running
call returned: [0, "running\n", ""]
    And Tor is ready                                                                           # features/step_definitions/common_steps.rb:373
calling as root: date --rfc-2822
call returned: [0, "Fri, 10 Jun 2016 18:18:47 +0000\n", ""]
    Then Tails clock is less than 5 minutes incorrect                                          # features/step_definitions/time_syncing.rb:40
      Time was 0.436531902 seconds off
      Unexpected connections were made:
        #<OpenStruct mac_saddr="50:54:00:f3:87:dc", mac_daddr="52:54:00:53:3c:df", protocol="tcp", saddr="10.2.1.92", daddr="10.2.1.1", sport=55150, dport=5024>.
      <false> is not true. (Test::Unit::AssertionFailedError)
      /usr/lib/ruby/vendor_ruby/test/unit/assertions.rb:55:in `block in assert_block'
      /usr/lib/ruby/vendor_ruby/test/unit/assertions.rb:1588:in `call'
      /usr/lib/ruby/vendor_ruby/test/unit/assertions.rb:1588:in `_wrap_assertion'
      /usr/lib/ruby/vendor_ruby/test/unit/assertions.rb:53:in `assert_block'
      /usr/lib/ruby/vendor_ruby/test/unit/assertions.rb:163:in `block in assert'
      /usr/lib/ruby/vendor_ruby/test/unit/assertions.rb:1593:in `call'
      /usr/lib/ruby/vendor_ruby/test/unit/assertions.rb:1593:in `_wrap_assertion'
      /usr/lib/ruby/vendor_ruby/test/unit/assertions.rb:130:in `assert'
      /var/lib/jenkins/workspace/test_Tails_ISO_feature-5650-rngd/features/support/helpers/firewall_helper.rb:68:in `assert_all_connections'
      /var/lib/jenkins/workspace/test_Tails_ISO_feature-5650-rngd/features/support/hooks.rb:274:in `After'

The pcap shows one initial packet (at time = 0) that's precisely the one that violates the assertion, and then a DCHP transaction 53s later, and then what looks like a Tor bootstrap with chutney 30s later. So, my theory is that the initial packet, that triggers the failure, is a leftover of the previous scenario, or something, and has nothing to do with the current scenario.

00_40_52_Clock_with_host_s_time_in_bridge_mode.pcap (58.9 KB) intrigeri, 06/10/2016 12:44 PM

00_41_12_Clock_with_host_s_time_in_bridge_mode.pcap (75.1 KB) bertagaz, 07/22/2016 03:35 AM

00_39_01_Clock_with_host_s_time_in_bridge_mode.pcap (64.9 KB) intrigeri, 07/29/2016 01:50 PM


Related issues

Related to Tails - Bug #16148: ICMPv6 leaks detected by test suite Resolved 11/23/2018
Blocks Tails - Feature #16209: Core work: Foundations Team Confirmed

Associated revisions

Revision 3b9a83f0 (diff)
Added by anonym 9 months ago

Test suite: power off system under testing after scenario.

Until now we have relied on either one of the generated "snapshot
restore" steps or the "[Given] a computer" step to implicitly stop the
old VM when we move on to a new scenario. That meant the old VM was
still running during the new scenarios Before hooks. If the new
scenario is tagged @check_tor_leaks that means we start its sniffer
while the old VM is still running, possibly sending packets that then
affect the new scenario. That would explain some myserious "Unexpected
connections were made" failures we have seen (refs: #11521).

Revision 310e3d98
Added by anonym 9 months ago

Merge branch 'test/11521-always-power-off-vm' into stable

Fix-committed: #11521

History

#1 Updated by intrigeri over 3 years ago

IIRC it's the first time I see this one, so no target version / deliverable for.

#2 Updated by intrigeri over 3 years ago

  • Priority changed from Normal to Elevated

It has happened 4 times in the last 50 days on Jenkins.

#3 Updated by bertagaz over 3 years ago

intrigeri wrote:

It has happened 4 times in the last 50 days on Jenkins.

That's mentioned on #11087#note-9. Attaching the pcap just in case.

#4 Updated by intrigeri over 3 years ago

Here's the pcap for another similar failure.

#5 Updated by anonym almost 3 years ago

  • Target version set to Tails_2.12

#6 Updated by anonym almost 3 years ago

  • Target version changed from Tails_2.12 to Tails_3.0~rc1

#7 Updated by intrigeri almost 3 years ago

  • Target version changed from Tails_3.0~rc1 to Tails_3.0

#8 Updated by intrigeri over 2 years ago

  • Target version changed from Tails_3.0 to Tails_3.1

It would be nice to fix in 3.0 but that's not a blocker IMO => postponing.

#9 Updated by anonym over 2 years ago

  • Blocks Feature #13239: Core work 2017Q3: Test suite maintenance added

#10 Updated by anonym over 2 years ago

  • Target version changed from Tails_3.1 to Tails_3.2

#11 Updated by intrigeri over 2 years ago

  • Target version changed from Tails_3.2 to Tails_3.3

#12 Updated by intrigeri over 2 years ago

  • Blocks Feature #13240: Core work 2017Q4: Test suite maintenance added

#13 Updated by intrigeri over 2 years ago

  • Blocks deleted (Feature #13239: Core work 2017Q3: Test suite maintenance)

#14 Updated by intrigeri over 2 years ago

  • Target version changed from Tails_3.3 to Tails_3.5

#15 Updated by anonym about 2 years ago

  • Target version deleted (Tails_3.5)

#16 Updated by intrigeri about 2 years ago

  • Blocks deleted (Feature #13240: Core work 2017Q4: Test suite maintenance)

#17 Updated by intrigeri about 1 year ago

  • Related to Bug #16148: ICMPv6 leaks detected by test suite added

#18 Updated by anonym 9 months ago

  • Status changed from Confirmed to In Progress

#19 Updated by anonym 9 months ago

  • Target version set to Tails_3.15
  • % Done changed from 0 to 40
  • Feature Branch set to test/11521-always-power-off-vm

First of all, I grep@ed for "Unexpected connections were made" over all @debug.log:s for the past three months, and it occurred 56 times. I randomly picked three of them, and all three had this bad first packet, so this bug is still affecting us.

intrigeri wrote:

So, my theory is that the initial packet, that triggers the failure, is a leftover of the previous scenario, or something, and has nothing to do with the current scenario.

It is a great theory! :)

Note that this first packet is sent by the Tails under testing and not the router. Looking into our hooks (features/support/hooks.rb) I see that in none of the After hooks do we shut down the VM, instead we seem to rely on either one of the generated "snapshot restore" steps or the [Given] a computer step to implicitly get rid off the old VM... in the next scenario. That means the old VM is still running during the new scenarios Before hooks, where we start the sniffer. This finding is pretty conclusive: there is a window where the previous scenario's network traffic leaks into the next one's sniffer.

So, let's just shut down the VM in the last After hook.

#20 Updated by anonym 9 months ago

  • Status changed from In Progress to Needs Validation
  • Assignee deleted (anonym)

Out of 3 runs I only saw unrelated failures, so this branch doesn't seem to introduce any regression (and I don't see why @fragile scenarios would be any different so I'm ignoring that they are not full runs).

I propose we just merge this branch and close this ticket, and reopen (or file a new one) if the problem reappears.

#22 Updated by anonym 9 months ago

#23 Updated by intrigeri 9 months ago

  • Assignee set to intrigeri

#24 Updated by intrigeri 9 months ago

  • Status changed from Needs Validation to In Progress
  • Assignee changed from intrigeri to anonym

LGTM but just one thing: the only other time in this hook that we use $vm, we guard it with if $vm; the rationale you provided in 1c387ab37779308980c870157c8d6d0aa752294b seems to apply here as well, so maybe add if $vm here too?

#25 Updated by anonym 9 months ago

  • Status changed from In Progress to 11
  • % Done changed from 40 to 100

#26 Updated by anonym 8 months ago

  • Assignee deleted (anonym)

#27 Updated by intrigeri 8 months ago

  • Status changed from 11 to Resolved
  • Target version changed from Tails_3.15 to Tails_3.14.1

Also available in: Atom PDF