Project

General

Profile

Bug #10732

Fix tor-has-bootstrapped semantics on network reconnect

Added by intrigeri almost 4 years ago. Updated over 3 years ago.

Status:
Resolved
Priority:
High
Assignee:
-
Category:
-
Target version:
Start date:
12/09/2015
Due date:
% Done:

100%

Feature Branch:
bugfix/10732-sanest-tor-has-bootstrapped-semantics
Type of work:
Research
Blueprint:
Starter:
Affected tool:

Description

It seems that it returns true before tor has reconnected to the Tor network.

"what the test suite needs is something that tells whether Tor currently thinks it should be working (be able to build circuits)"

I wonder if it happens in normal conditions, that is without restoring snapshots. But let me not try to pretend it's not on my plate ;)
I bet that once we've reached tails-tor-has-bootstrapped.target it never gets reset to non-active.
Which, I mean, is a matter of semantics. It currently means: "Tor has already bootstrapped once on this system".
I wonder if this is buggy in supported non-test-suite case, e.g. on network reconnect. Or if we're back to "please, test suite folks, tell me what semantics you need from that program".
maybe we should just restart tails-wait-until-tor-has-bootstrapped.service on network reconnect (not sure if it puts the target back to inactive state)


Related issues

Blocks Tails - Feature #7563: Update the automated test suite for Jessie ISO images Resolved 11/26/2014 01/15/2016

Associated revisions

Revision 09d396dc (diff)
Added by intrigeri almost 4 years ago

When (re-)setting up the network, let the rest of the system know that Tor is not working at the moment.

This matters e.g. if we have already bootstrapped: if we don't do that,
then tails-tor-has-bootstrapped.target is still active, so
tor-has-bootstrapped will return true as soon as is
active, that is as soon as it has open its ports... but it might not be
ready to build circuits yet.

Refs: #10732

Revision de6f34fa (diff)
Added by intrigeri almost 4 years ago

When the network goes down, let the rest of the system know that Tor is not working at the moment.

Refs: #10732

Revision 399dd2fb (diff)
Added by intrigeri almost 4 years ago

Apply fix wrt. tails-tor-has-bootstrapped.target status to the test suite too.

After restoring snapshots, the test suite runs its own implementation
of (part of) 10-tor.sh, so this fix really needs to go there as well
to address refs: #10732.

Revision ff09f1bf (diff)
Added by intrigeri over 3 years ago

Test suite: restart tails-tor-has-bootstrapped.target after deleting the Tor log file.

Otherwise, "tails-wait-until-tor-has-bootstrapped.service run, and it
runs so fast that the log is not removed, and our horrible
implementation of tor_is_ready() in our shell library depends on the Tor
log to be cleared", and then it reports that Tor is ready while it is
not yet.

refs: #10732

Revision d552c3b7 (diff)
Added by intrigeri over 3 years ago

Tor NM hook: restart tails-tor-has-bootstrapped.target after deleting the Tor log file.

This change is needed for exactly the same reason as the same change
we did in the test suite (ff09f1b).

refs: #10732

Revision 161a8865
Added by anonym over 3 years ago

Merge remote-tracking branch 'origin/bugfix/10732-sanest-tor-has-bootstrapped-semantics' into devel

Fix-committed: #10732

History

#1 Updated by intrigeri almost 4 years ago

  • Blocks Feature #7563: Update the automated test suite for Jessie ISO images added

#3 Updated by intrigeri almost 4 years ago

  • Status changed from Confirmed to In Progress
  • % Done changed from 0 to 10
  • Feature Branch set to bugfix/10732-sanest-tor-has-bootstrapped-semantics

#4 Updated by intrigeri almost 4 years ago

  • Subject changed from Investigate tor-has-bootstrapped behaviour on network reconnect to Fix tor-has-bootstrapped semantics on network reconnect
  • Assignee changed from intrigeri to anonym
  • % Done changed from 10 to 50
  • QA Check set to Ready for QA

#5 Updated by anonym almost 4 years ago

  • Assignee changed from anonym to intrigeri
  • % Done changed from 50 to 40
  • QA Check changed from Ready for QA to Dev Needed

Unfortunately this branch doesn't work. :/ I was immediately hit with Pidgin trying to connect at a time when Tor clearly wasn't working. To investigate further I re-ran the tests with the following local modification, which sort of compares it with the wait_until_tor_is_working() implementation in Wheezy where we rely on what Tor thinks the bootstrap progress is:

--- a/features/support/helpers/misc_helpers.rb
+++ b/features/support/helpers/misc_helpers.rb
@@ -138,6 +139,8 @@ end

 def wait_until_tor_is_working
   try_for(270) { $vm.execute('/usr/local/sbin/tor-has-bootstrapped').success? }
+  bootstrap = $vm.execute('tor_bootstrap_progress', :libs => 'tor').stdout
+  info_log("Tor bootstrap progress: #{bootstrap}")
 rescue Timeout::Error => e
   c = $vm.execute("journalctl SYSLOG_IDENTIFIER=restart-tor")
   if c.success?

For me it said that the Tor bootstrap progress was 80%. Then I looked at Tor's log and saw this:

Dec 10 03:16:15.000 [notice] Tor 0.2.7.5 (git-6184c873e90d93b2) opening new log file.
[...]
Dec 10 03:16:25.000 [notice] Bootstrapped 80%: Connecting to the Tor network
Dec 10 03:16:25.000 [notice] Signaled readiness to systemd

so the .target is reached immediately when systemd is notified of Tor's readiness. Also note that, contrary to what you say in your commit message of 09d396dc, Tor waits 10 seconds before doing notifying systemd; actually Tor just idles for 10 seconds -- this is what I said to you privately yesterday when talking about Tor's bootstrapping issues (#10238). A bit beside the point, perhaps...

In any case, I think I've found the fix, which can be demonstrated with this diff:

--- a/features/step_definitions/common_steps.rb
+++ b/features/step_definitions/common_steps.rb
@@ -103,8 +103,8 @@ def post_snapshot_restore_hook
   if $vm.has_network?
     if $vm.execute("systemctl --quiet is-active tor@default.service").success?
       $vm.execute("systemctl stop tor@default.service")
-      $vm.execute("systemctl --no-block restart tails-tor-has-bootstrapped.target")
       $vm.execute("rm -f /var/log/tor/log")
+      $vm.execute("systemctl --no-block restart tails-tor-has-bootstrapped.target")
       $vm.host_to_guest_time_sync
       $vm.spawn("restart-tor")
       wait_until_tor_is_working

The issue is that when tails-tor-has-bootstrapped.target is restarted it immediately makes tails-wait-until-tor-has-bootstrapped.service run, and it runs so fast that the log is not removed, and our horrible implementation of tor_is_ready() in our shell library depends on the Tor log to be cleared. Woo! :) So the corresponding reordering must be done in the 10-tor.sh NM hook.

To end with, I'm surprised that we have to touch 10-tor.sh at all. With systemd I'd expect us to be able to express a dependency such that tails-tor-has-bootstrapped.target stops when tor@default.service stops, and (re)starts with it too. If we can do that, all this would be much simpler. However, due to the Tor log issue, we would need to also do one of the following then:

  • We make tor@default.service remove Tor's log when it stops.
  • We switch to the GETINFO based implementation (maybe the issues we had with it are gone now?) for getting the bootstrap progress in our shell library, so tor_is_ready() doesn't depend on clearing the Tor log.

Super exciting fun stuff!

#6 Updated by intrigeri almost 4 years ago

  • Assignee changed from intrigeri to anonym

In any case, I think I've found the fix, which can be demonstrated with this diff:

[...]

The issue is that when tails-tor-has-bootstrapped.target is restarted it immediately makes tails-wait-until-tor-has-bootstrapped.service run, and it runs so fast that the log is not removed, and our horrible implementation of tor_is_ready() in our shell library depends on the Tor log to be cleared. Woo! :) So the corresponding reordering must be done in the 10-tor.sh NM hook.

Great! While I'm away, please / feel free to (on top of the proposed branch):

  1. apply this trick to the test suite
  2. apply the same trick in 10-tor.sh
  3. merge the result into feature/jessie

(We'll do another review of recent feature/jessie changes next week anyway :)

And if you're not comfortable with taking the changes to 10-tor.sh due to what follows, then just cherry-pick the bits that modify the test suite and keep going.

To end with, I'm surprised that we have to touch 10-tor.sh at all. With systemd I'd expect us to be able to express a dependency such that tails-tor-has-bootstrapped.target stops when tor@default.service stops, and (re)starts with it too. If we can do that, all this would be much simpler. However, due to the Tor log issue, we would need to also do one of the following then:

  • We make tor@default.service remove Tor's log when it stops.

(At least that one is trivial to implement... assuming it would not break anything unexpectedly.)

  • We switch to the GETINFO based implementation (maybe the issues we had with it are gone now?) for getting the bootstrap progress in our shell library, so tor_is_ready() doesn't depend on clearing the Tor log.

Super exciting fun stuff!

Right, yes, totally. I agree there's room for improvement here. I doubt the "target" concept works in a way that allows us to do what you suggest, but regardless I'm sure we can indeed bind this service and that target (or whatever we turn the latter into) more tighly, somehow -- we'll see :)

As as see it though, we're drifting a bit away of what's blocking #7563 and Tails 2.0 here, so please file a dedicated ticket (no milestone, assign to me unless you want it yourself) to investigate these improvements, and let's keep the ball rolling on what is my current primary focus on this ticket, OK?

#7 Updated by intrigeri over 3 years ago

  • Assignee changed from anonym to intrigeri

intrigeri wrote:

Great! While I'm away, please / feel free to (on top of the proposed branch):

  1. apply this trick to the test suite
  2. apply the same trick in 10-tor.sh
  3. merge the result into feature/jessie

I'm taking back this ticket to do that on the topic branch. Then we'll see, depending on anonym's availability vs. my scheduling needs for 2.0~beta1.

#8 Updated by intrigeri over 3 years ago

To end with, I'm surprised that we have to touch 10-tor.sh at all. With systemd I'd expect us to be able to express a dependency such that tails-tor-has-bootstrapped.target stops when tor@default.service stops, and (re)starts with it too.

I think we can actually do that with PartOf=tor.service, or similar. It has too much potential for breaking other stuff for me to want to do that at this point of the 2.0 cycle, though.

#9 Updated by anonym over 3 years ago

  • Status changed from In Progress to Fix committed
  • % Done changed from 40 to 100

#10 Updated by intrigeri over 3 years ago

  • Assignee deleted (intrigeri)
  • QA Check changed from Dev Needed to Pass

#11 Updated by anonym over 3 years ago

  • Status changed from Fix committed to Resolved

Also available in: Atom PDF