Project

General

Profile

Bug #16791

Buster: torstatus@tails.boum.org makes GNOME Shell crashes when unlocking the screen or restoring from suspend

Added by intrigeri 9 months ago. Updated 8 months ago.

Status:
Resolved
Priority:
Elevated
Assignee:
Category:
-
Target version:
Start date:
Due date:
% Done:

100%

Feature Branch:
https://salsa.debian.org/tails-team/tails/merge_requests/28
Type of work:
Code
Blueprint:
Starter:
Affected tool:

Description

Quoting alienpup's report on tails-testers today:

I've just retested all three systems using this .img file:

I found one new regressions relative to 3.14:

All three systems fail to fully restore from suspend. More specifically, they do, but X graphics appears correctly for only for 1-2 seconds, after which the screen blanks again before a only portion of the screen returns. At this point X no longer responds. On the two Intel based laptops, the mouse is inoperative. On the AMD based desktop, the mouse remains operative, but nothing on-screen responds (to mouse or keyboard). All three systems are still running at this point, and I can CTRL+ALT to a vterm and log in (password set at boot). Running top reveals Xorg is sleeping, consuming "0" cpu and minimal RAM.


Related issues

Blocks Tails - Feature #16209: Core work: Foundations Team Confirmed
Blocks Tails - Bug #16755: Call for testing: feature/buster (June 2019 edition) Resolved 06/18/2019

Associated revisions

Revision 61a0f801 (diff)
Added by intrigeri 8 months ago

TorStatus: actually call our custom destructor (refs: #16791)

At least on Buster, this GNOME Shell extension is disabled/enabled when
locking/unlocking the screen; and thus, when suspending/resuming.

I'm definitely no JS expert (far from it!) but it seems that our disable()
function was previously calling the parent class' destroy() method, bypassing
our custom _destroy() one. global.log() crude debugging confirms that our
_destroy() method was never called, which would explain the use-after-free we
see on #16791: the status file monitor is never disconnected and thus, it can
trigger an icon update after the icon object was freed.

Let's instead call our custom _destroy method, which will properly disconnect
the file status monitor and call the parent class' destroy() method.

Revision 6913b650
Added by intrigeri 8 months ago

Merge remote-tracking branch 'origin/bugfix/16791-torstatus-crashes-gnome-shell-on-unlock-and-resume' into feature/buster

Closes: #16791

History

#1 Updated by intrigeri 9 months ago

#2 Updated by intrigeri 9 months ago

sajolida says resuming from suspend works fine on a X201.

#3 Updated by intrigeri 8 months ago

  • Assignee set to intrigeri

intrigeri wrote:

sajolida says resuming from suspend works fine on a X201.

Not 100% fine. I should look into "Subject: almost black screen when waking up in Tails Buster".

#4 Updated by intrigeri 8 months ago

On resume (both in 190608-alienpup-hp-pavilion-dm4-buster-sudo-journalctl.out and 190608-alienpup-lenovo-tp-t410-buster-sudo-journalctl.out), I see that our torstatus GNOME Shell extension makes GNOME Shell segfault:

Jun 09 00:15:26 amnesia systemd[1]: Stopped target Tor has bootstrapped.
Jun 09 00:15:26 amnesia systemd[1]: Stopping Manage the flag file that indicates whether Tor has bootstrapped...
Jun 09 00:15:26 amnesia nm-dispatcher[13798]: req:2 'connectivity-change': start running ordered scripts...
Jun 09 00:15:26 amnesia kernel: iwlwifi 0000:01:00.0: Radio type=0x1-0x2-0x0
Jun 09 00:15:26 amnesia gnome-shell[10955]: Object St.Icon (0x5bf5986404c0), has been already deallocated — impossible to access it. This might be caused by the object having been destroyed from C code using something such as destroy(), dispose(), or remove() vfuncs.
Jun 09 00:15:26 amnesia org.gnome.Shell.desktop[10955]: == Stack trace for context 0x5bf596804240 ==
Jun 09 00:15:26 amnesia org.gnome.Shell.desktop[10955]: #0   5bf5991e30e0 i   /usr/share/gnome-shell/extensions/torstatus@tails.boum.org/extension.js:64 (7d2a40d74160 @ 67)
Jun 09 00:15:26 amnesia org.gnome.Shell.desktop[10955]: #1   7ffdd6d01a80 b   resource:///org/gnome/gjs/modules/_legacy.js:82 (7d2a41eb0b80 @ 71)
Jun 09 00:15:26 amnesia org.gnome.Shell.desktop[10955]: #2   5bf5991e3058 i   /usr/share/gnome-shell/extensions/torstatus@tails.boum.org/extension.js:78 (7d2a40d74280 @ 92)
Jun 09 00:15:26 amnesia org.gnome.Shell.desktop[10955]: #3   7ffdd6d02a60 b   resource:///org/gnome/gjs/modules/_legacy.js:82 (7d2a41eb0b80 @ 71)
Jun 09 00:15:26 amnesia org.gnome.Shell.desktop[10955]: #4   7ffdd6d036d0 b   self-hosted:983 (7d2a41ef01f0 @ 515)
Jun 09 00:15:26 amnesia kernel: gnome-shell[10955]: segfault at 28 ip 00007d2a5ac68bea sp 00007ffdd6d009c0 error 4 in libst-1.0.so[7d2a5ac5b000+2c000]
Jun 09 00:15:26 amnesia kernel: Code: 48 89 d8 5b 5d c3 66 2e 0f 1f 84 00 00 00 00 00 31 db 48 83 c4 08 48 89 d8 5b 5d c3 0f 1f 40 00 41 54 55 48 89 f5 53 48 89 fb <4c> 8b 67 28 e8 7d 3b ff ff 48 8b 13 48 85 d2 74 05 48 3b 02 74 13

Then GNOME Shell is restarted and crashes again, apparently when the flag file torstatus monitors appears and the extension should update the icon:

Jun 09 00:15:32 amnesia systemd[1]: Starting Manage the flag file that indicates whether Tor has bootstrapped...
Jun 09 00:15:32 amnesia gnome-shell[13857]: Object St.Icon (0x63e9b035a750), has been already deallocated — impossible to access it. This might be caused by the object having been destroyed from C code using something such as destroy(), dispose(), or remove() vfuncs.
Jun 09 00:15:32 amnesia kernel: gnome-shell[13857]: segfault at 28 ip 0000778770e1dbea sp 00007ffef01b2ad0 error 4 in libst-1.0.so[778770e10000+2c000]
Jun 09 00:15:32 amnesia kernel: Code: 48 89 d8 5b 5d c3 66 2e 0f 1f 84 00 00 00 00 00 31 db 48 83 c4 08 48 89 d8 5b 5d c3 0f 1f 40 00 41 54 55 48 89 f5 53 48 89 fb <4c> 8b 67 28 e8 7d 3b ff ff 48 8b 13 48 85 d2 74 05 48 3b 02 74 13
Jun 09 00:15:32 amnesia org.gnome.Shell.desktop[13857]: == Stack trace for context 0x63e9ae56e2c0 ==
Jun 09 00:15:32 amnesia org.gnome.Shell.desktop[13857]: #0   63e9ae923e10 i   /usr/share/gnome-shell/extensions/torstatus@tails.boum.org/extension.js:62 (77873dd74160 @ 36)
Jun 09 00:15:32 amnesia org.gnome.Shell.desktop[13857]: #1   7ffef01b3b10 I   resource:///org/gnome/gjs/modules/_legacy.js:82 (77873edb0b80 @ 71)
Jun 09 00:15:32 amnesia org.gnome.Shell.desktop[13857]: #2   63e9ae923d88 i   /usr/share/gnome-shell/extensions/torstatus@tails.boum.org/extension.js:75 (77873dd74280 @ 70)
Jun 09 00:15:32 amnesia org.gnome.Shell.desktop[13857]: #3   7ffef01b4a80 I   resource:///org/gnome/gjs/modules/_legacy.js:82 (77873edb0b80 @ 71)
Jun 09 00:15:32 amnesia org.gnome.Shell.desktop[13857]: #4   7ffef01b56f0 b   self-hosted:983 (77873edf01f0 @ 515)
Jun 09 00:15:32 amnesia systemd[1]: Started Manage the flag file that indicates whether Tor has bootstrapped.

Finally, gnome-session gives up restarting GNOME Shell, which explains what alienpup sees (apart of the focused application, the desktop is unresponsive):

Jun 09 00:15:32 amnesia gnome-session-binary[10811]: WARNING: Application 'org.gnome.Shell.desktop' killed by signal 11
Jun 09 00:15:32 amnesia gnome-session-binary[10811]: Unrecoverable failure in required component org.gnome.Shell.desktop
Jun 09 00:15:32 amnesia gnome-session-binary[10811]: WARNING: App 'org.gnome.Shell.desktop' respawning too quickly
Jun 09 00:15:32 amnesia gnome-session[10811]: Unable to init server: Could not connect: Connection refused
Jun 09 00:15:32 amnesia gnome-session-f[14115]: Cannot open display: 

I've asked alienpup and sajolida to test again:

  • with xorg-driver=intel (just in case the icon deallocation is hardware specific, which I strongly doubt)
  • after disabling our torstatus GNOME Shell extension

#5 Updated by intrigeri 8 months ago

  • Subject changed from Buster fails to restore from suspend to Buster: torstatus@tails.boum.org makes GNOME Shell crashes when restoring from suspend

sajolida's video looks very much like GNOME Shell crashed, too.

#6 Updated by intrigeri 8 months ago

  • Assignee deleted (intrigeri)
  • Priority changed from Normal to Elevated
  • Type of work changed from Research to Code

I think that the _updateIcon method in torstatus should not assume that the this._icon object still exists. Instead, it should check if it does, and create it if needed. So most of the _init method needs to be moved to another method, so both _init and _updateIcon can use it.

I'll handle the follow-ups with alienpup and sajolida but @segfault, @lamby or @alant would surely be more efficient than me to fix this JavaScript => deassigning myself for now. The relevant code lives in https://salsa.debian.org/tails-team/tails/blob/master/config/chroot_local-includes/usr/share/gnome-shell/extensions/torstatus@tails.boum.org/extension.js.

Raising priority as IMO this blocks putting out a first official beta that we'll ask Tails contributors and enthusiasts to use in production.

#7 Updated by lamby 8 months ago

@intrigeri

I think that the _updateIcon method in torstatus should not assume that the this._icon object still exists

I worry this is papering over the problem. What we in-effect have is a double-free here, no? :) Also note that we are in a strange non-typical Javascript land here - usually if this was the web we would just do if (typeof this._icon === 'undefined') { return; } but I think what is happening is that this._icon does indeed /exist/ in terms of being a Javascript object at the time but the underling C/GNOME/blah binding it refers to does not, so doing anything with it breaks as above. I do not know how to test whether the ref exists from GNOME (or Gio?) scripts and my search fu is failing me right now…

#8 Updated by intrigeri 8 months ago

intrigeri wrote:

I've asked alienpup and sajolida to test again:

  • with xorg-driver=intel (just in case the icon deallocation is hardware specific, which I strongly doubt)
  • after disabling our torstatus GNOME Shell extension

sajolida did 3-5 suspend+resume cycles, with all for combinations of torstatus enabled/disabled and xorg-driver=intel or not, and could not reproduce the GNOME Shell crash anymore. So no new useful data point here except sajolida says GNOME Shell sometimes crashes when waking up the screen (that the screen locker puts to sleep), while the system has not been fully suspended.

#9 Updated by intrigeri 8 months ago

Dear @lamby,

I think that the _updateIcon method in torstatus should not assume that the this._icon object still exists

I worry this is papering over the problem.

This is fair.

What we in-effect have is a double-free here, no? :)

More like a use-after-free, no?

I think what is happening is that this._icon does indeed /exist/ in terms of being a Javascript object at the time but the underling C/GNOME/blah binding it refers to does not

Full ACK. There are many similar bug reports about plenty of other GNOME bits and extensions on the web. One thing I've noticed while browsing them is that apparently, an extension should basically assume it's disabled/enabled when locking/unlocking the screen (and thus, when suspending/resuming); at least, that's how other projects have fixed extremely similar issues.

To confirm this assumption, I guess we could log something when disable() and _destroy() are called.

Random guess: this._monitor_changed_signal_id is triggered on a partially destroyed instance of TorStatusIndicator. I'm wondering I see that we call a destroy() method:

function disable() {
tor_status_indicator.destroy();
}

… while in the TorStatusIndicator class there's a _destroy() method?

In other words, I wonder if our _destroy() method, that disconnects the file monitor, is ever called.

#10 Updated by lamby 8 months ago

More like a use-after-free, no?

Ye, you got me. :)

In other words, I wonder if our _destroy() method, that disconnects the file monitor, is ever called.

Indeed, that was my roughly brief reading of it too but (further) was unlikely to be solved abstractly by "just" thinking or analysing the code itself - I bet there are a myriad of timing issues when unsuspending as well as whether it has been called at all... :/

#11 Updated by intrigeri 8 months ago

  • Status changed from Confirmed to In Progress
  • Assignee set to intrigeri

I can reproduce part of the bug (GNOME Shell crashes and restarts) in a VM like this:

nmcli connection down 'Wired connection' ; gnome-shell-extension-tool --disable-extension=torstatus@tails.boum.org

I've done some global.log() debugging and I think I have a fix.

#12 Updated by intrigeri 8 months ago

  • Feature Branch set to bugfix/16791-torstatus-crashes-gnome-shell-on-unlock-and-resume

#13 Updated by intrigeri 8 months ago

  • Status changed from In Progress to Needs Validation
  • Assignee deleted (intrigeri)

I can't reproduce the bug anymore on this branch when locking/unlocking, nor with the simple reproducer I've documented above, so I think my fix is OK. I hope this was the only bug that triggered the behavior experienced by alienpup and sajolida. I'll ask them to test a new feature/buster once this is reviewed & merged.

#14 Updated by segfault 8 months ago

  • Assignee set to segfault

#15 Updated by segfault 8 months ago

  • Status changed from Needs Validation to In Progress
  • Assignee changed from segfault to intrigeri
  • Feature Branch changed from bugfix/16791-torstatus-crashes-gnome-shell-on-unlock-and-resume to bugfix/16791-torstatus-crashes-gnome-shell-on-unlock-and-resume,https://salsa.debian.org/tails-team/tails/merge_requests/28

I created a MR on GitLab and did the review there: https://salsa.debian.org/tails-team/tails/merge_requests/28

#16 Updated by segfault 8 months ago

  • Status changed from In Progress to Needs Validation

I pushed a commit. I tested that it works by changing the extension.js in a running feature/buster VM and reloading gnome-shell.

#17 Updated by segfault 8 months ago

  • Assignee deleted (intrigeri)

#18 Updated by intrigeri 8 months ago

  • Feature Branch changed from bugfix/16791-torstatus-crashes-gnome-shell-on-unlock-and-resume,https://salsa.debian.org/tails-team/tails/merge_requests/28 to https://salsa.debian.org/tails-team/tails/merge_requests/28

#19 Updated by intrigeri 8 months ago

  • Subject changed from Buster: torstatus@tails.boum.org makes GNOME Shell crashes when restoring from suspend to Buster: torstatus@tails.boum.org makes GNOME Shell crashes when unlocking the screen or restoring from suspend
  • Assignee set to intrigeri

#20 Updated by intrigeri 8 months ago

  • Blocks Bug #16755: Call for testing: feature/buster (June 2019 edition) added

#21 Updated by intrigeri 8 months ago

  • Status changed from Needs Validation to Resolved
  • % Done changed from 0 to 100

Also available in: Atom PDF