Bug #17163

"SSH is using the default SocksPort" test suite scenario is fragile

Added by intrigeri 28 days ago. Updated 26 days ago.

Status: Resolved
Priority: Elevated
Assignee:
Category: Test suite
Target version:
Start date:
Due date:
% Done: 100%
Feature Branch: test/17163-ssh-stream-isolation+force-all-tests
Type of work: Code
Blueprint:
Starter:
Affected tool:

Description

I initially spotted this on #16792, then noticed that it affects all branches.
For example, this scenario failed 12 times in the last 16 test suite runs on the testing branch.
In every failed case, SSH immediately fails with "nc: connection failed, SOCKSv5 error: General SOCKS server failure".

Interestingly, none of the scenarios in ssh.feature are affected, although they do almost the same thing. The main differences are:

  • In tor_stream_isolation.feature we have the "I monitor the network connections of SSH" CPU hog running. This might make it harder for tor to do its job.
  • In ssh.feature, we use retry_tor so we retry up to MAX_NEW_TOR_CIRCUIT_RETRIES (default: 10); while in tor_stream_isolation.feature we only try once.

In short, the fragile scenario runs in a context that makes it more likely to fail, and does not retry.
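
To illustrate why retrying matters here, below is a minimal sketch of a retry-with-recovery loop. It is written in Python for illustration only and is not the test suite's retry_tor implementation; the names retry_with_recovery, MaxRetriesExceeded, flaky_ssh_connect and the recovery callback are hypothetical. The default of 10 attempts mirrors the MAX_NEW_TOR_CIRCUIT_RETRIES default mentioned above.

```python
class MaxRetriesExceeded(Exception):
    """Raised when every attempt failed (hypothetical name)."""


def retry_with_recovery(operation, recovery, max_retries=10):
    """Run `operation`; on failure, run `recovery` and try again.

    `recovery` stands in for whatever is done between attempts
    (for example, asking for a fresh circuit). It is not called
    after the final failed attempt.
    """
    last_error = None
    for attempt in range(1, max_retries + 1):
        try:
            return operation()
        except Exception as error:
            last_error = error
            if attempt < max_retries:
                recovery()
    raise MaxRetriesExceeded(
        f"operation failed {max_retries} time(s)"
    ) from last_error


if __name__ == "__main__":
    attempts = {"count": 0}

    def flaky_ssh_connect():
        # Hypothetical stand-in: fails twice, then succeeds.
        attempts["count"] += 1
        if attempts["count"] < 3:
            raise ConnectionError("General SOCKS server failure")
        return "connected"

    result = retry_with_recovery(flaky_ssh_connect, recovery=lambda: None)
    print(result, "after", attempts["count"], "attempts")
```

With max_retries=1, which corresponds to the "only try once" behaviour of the fragile scenario, the same flaky operation raises on its first failure instead of eventually succeeding.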


Related issues

Related to Tails - Bug #17013: The "is properly stream isolated" test suite mechanism is fragile (Confirmed)
Related to Tails - Bug #16792: Upgrade our Chutney fork (Needs Validation)
Blocks Tails - Feature #16209: Core work: Foundations Team (Confirmed)

Associated revisions

Revision 4353ed57 (diff)
Added by intrigeri 28 days ago

Test suite: make the "SSH is using the default SocksPort" scenario more robust (refs: #17163)

This scenario failed 12 times in the last 16 test suite runs on the testing
branch, so it's starting to be really annoying.

Every time, SSH immediately fails with "nc: connection failed, SOCKSv5 error:
General SOCKS server failure". I suspect there's a bug in our test suite, that
makes us try to use Tor before it's ready. Regardless, we have a very similar
test case in ssh.feature that is pretty robust: it never failed in the 16 test
suite runs I've analyzed. I see two main potential reasons for this difference:

- In tor_stream_isolation.feature we have the "I monitor the network
connections of SSH" CPU hog running. This might make it harder for tor to do
its job.
- In ssh.feature, we use retry_tor so we retry up to
MAX_NEW_TOR_CIRCUIT_RETRIES (default: 10); while in
tor_stream_isolation.feature we only run SSH once.

In short, the fragile scenario runs in a context that makes it more likely to
fail, and it does not retry. So it's not very surprising that it's more fragile.

Therefore, let's simply reuse the existing, robust implementation we have for
a test connecting to a SSH server on the Internet. In passing, we get rid
of one picture, which is always welcome.

Revision 810b5560
Added by segfault 27 days ago

Merge branch 'test/17163-ssh-stream-isolation+force-all-tests' into testing (Fix-committed: #17163)

History

#1 Updated by intrigeri 28 days ago

  • Related to Bug #17013: The "is properly stream isolated" test suite mechanism is fragile added

#2 Updated by intrigeri 28 days ago

  • Related to Bug #16292: On-screen keyboard not displayed in Buster when logged in in French added

#3 Updated by intrigeri 28 days ago

  • Related to Bug #16792: Upgrade our Chutney fork added

#4 Updated by intrigeri 28 days ago

  • Related to deleted (Bug #16292: On-screen keyboard not displayed in Buster when logged in in French)

#5 Updated by intrigeri 28 days ago

#6 Updated by intrigeri 28 days ago

  • Status changed from Confirmed to In Progress

#7 Updated by intrigeri 28 days ago

  • Feature Branch set to test/17163-ssh-stream-isolation+force-all-tests

The updated scenario passes locally, but I can't reproduce the bug here in the first place ⇒ let's see what Jenkins thinks.

#8 Updated by intrigeri 28 days ago

  • Target version changed from Tails_4.1 to Tails_4.0

#9 Updated by intrigeri 28 days ago

  • Status changed from In Progress to Needs Validation
  • Assignee deleted (intrigeri)

No related failures in 4 full test suite runs on Jenkins!

#10 Updated by segfault 27 days ago

  • Assignee set to segfault

#11 Updated by segfault 27 days ago

  • Status changed from Needs Validation to Fix committed
  • % Done changed from 0 to 100

#12 Updated by intrigeri 26 days ago

  • Status changed from Fix committed to Resolved
