Bug #16352

Fix systemd vulnerabilities: CVE-2018-16864, CVE-2018-16865 and CVE-2018-16866

Added by intrigeri 8 months ago. Updated 8 months ago.

Status: Resolved
Priority: High
Assignee: -
Category: -
Target version:
Start date: 01/13/2019
Due date:
% Done: 100%
Feature Branch: bugfix/16352-16184-systemd-v240+force-all-tests
Type of work: Code
Blueprint:
Starter:
Affected tool:


Related issues

Related to Tails - Bug #16184: Intermittent test failures on the devel branch: fails to login "Failed to fully start up daemon: Permission denied" Resolved 12/03/2018
Blocks Tails - Feature #15507: Core work 2019Q1: Foundations Team Resolved 04/08/2018
Blocked by Tails - Bug #16349: Stick to Tor 0.3.4 in Tails 3.12 Resolved 01/12/2019
Blocked by Tails - Bug #16097: Memory erasure tests regression on the devel branch Resolved 11/05/2018
Blocked by Tails - Bug #16073: Upgrade Linux to 4.19 Resolved 10/25/2018
Blocked by Tails - Bug #16072: Enable protected_fifos and protected_regular Resolved 10/25/2018

Associated revisions

Revision 5a7ada33 (diff)
Added by intrigeri 8 months ago

Enable the bugfix-16352-16184-systemd-v240-force-all-tests APT overlay (refs: #16352).

Revision 407d9a0f (diff)
Added by intrigeri 8 months ago

Install systemd (240-4~bpo9+0tails1) from our custom APT repo (refs: #16352, #16184)

Revision 9be4bef8 (diff)
Added by intrigeri 8 months ago

Let APT install the newest systemd among those available in stretch-backports and in our custom APT repo (refs: #16352, #16184)

This ensures our custom backport (240-4~bpo9+0tails1) is superseded
by the official one once the latter is uploaded.
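
The superseding behaviour follows from Debian's version ordering. Below is a minimal sketch of that ordering as described in Debian Policy, not the code APT itself runs. The function names are illustrative, and the version 240-4~bpo9+1 used for the official backport in the usage note is an assumption: this ticket does not state which version the official upload will carry.

```python
import re


def _order(ch):
    """Sort weight of one character inside a non-digit run."""
    if ch == "~":
        return -1          # tilde sorts before everything, even end of string
    if ch.isalpha():
        return ord(ch)     # letters sort before all other non-digit characters
    return ord(ch) + 256


def _compare_part(a, b):
    """Compare two upstream-version or revision strings; returns -1, 0 or 1."""
    while a or b:
        # Step 1: compare the leading non-digit runs character by character.
        na, nb = re.match(r"\D*", a).group(), re.match(r"\D*", b).group()
        a, b = a[len(na):], b[len(nb):]
        for i in range(max(len(na), len(nb))):
            wa = _order(na[i]) if i < len(na) else 0  # 0 stands for end of run
            wb = _order(nb[i]) if i < len(nb) else 0
            if wa != wb:
                return -1 if wa < wb else 1
        # Step 2: compare the leading digit runs numerically (empty counts as 0).
        da, db = re.match(r"\d*", a).group(), re.match(r"\d*", b).group()
        a, b = a[len(da):], b[len(db):]
        ia, ib = int(da or 0), int(db or 0)
        if ia != ib:
            return -1 if ia < ib else 1
    return 0


def _split(version):
    """Split '[epoch:]upstream[-revision]' into (epoch, upstream, revision)."""
    epoch, rest = version.split(":", 1) if ":" in version else ("0", version)
    upstream, sep, revision = rest.rpartition("-")
    if not sep:            # no hyphen: the whole string is the upstream version
        upstream, revision = rest, ""
    return int(epoch), upstream, revision


def compare_debian_versions(v1, v2):
    """Return -1, 0 or 1 if v1 is older than, equal to, or newer than v2."""
    e1, u1, r1 = _split(v1)
    e2, u2, r2 = _split(v2)
    if e1 != e2:
        return -1 if e1 < e2 else 1
    return _compare_part(u1, u2) or _compare_part(r1, r2)
```

With this sketch, compare_debian_versions("240-4~bpo9+0tails1", "240-4~bpo9+1") reports the custom backport as older, because the revisions first differ at the digit run after "+", where 0 is less than 1. The same function reports the custom backport as newer than 239-12~bpo9+1.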

Revision b8bf6b87
Added by intrigeri 8 months ago

Merge branch 'bugfix/16352-16184-systemd-v240+force-all-tests' into devel

(Fix-committed: #16352, #16184)

History

#1 Updated by intrigeri 8 months ago

#2 Updated by intrigeri 8 months ago

  • Related to Bug #16184: Intermittent test failures on the devel branch: fails to login "Failed to fully start up daemon: Permission denied" added

#3 Updated by intrigeri 8 months ago

Note that the most severe udev regressions brought by v240 (e.g. https://github.com/systemd/systemd/issues/11314) were fixed in the 240-4 upload. The remaining potential ones only affect sysvinit and are thus irrelevant here.

#4 Updated by intrigeri 8 months ago

  • Priority changed from Elevated to High

#5 Updated by intrigeri 8 months ago

  • Subject changed from Check what to do in Tails 3.12 wrt. CVE-2018-16864 and CVE-2018-16865 to Check what to do in Tails 3.12 wrt. CVE-2018-16864, CVE-2018-16865 and CVE-2018-16866
  • Status changed from Confirmed to In Progress

systemd v230 to v239 (inclusive) are affected by these 3 vulns, so reverting to 237-3~bpo9+1 won't help. They're all fixed in sid with 240-4.
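
As a rough illustration of the range above, here is a small sketch that checks whether a package version's upstream release number falls within v230 to v239. The helper names are hypothetical. It only looks at the upstream release number, so a package carrying backported fixes (for instance from the v239-stable branch) would still be flagged even though it may be patched.

```python
import re

# Affected upstream releases, inclusive, as stated in this comment.
AFFECTED_FIRST, AFFECTED_LAST = 230, 239


def upstream_release(debian_version):
    """Extract the leading upstream release number, skipping any epoch."""
    match = re.match(r"(?:\d+:)?(\d+)", debian_version)
    if match is None:
        raise ValueError(f"cannot parse version: {debian_version!r}")
    return int(match.group(1))


def in_affected_range(debian_version):
    """True if the upstream release is within v230..v239 (inclusive)."""
    return AFFECTED_FIRST <= upstream_release(debian_version) <= AFFECTED_LAST
```

For example, in_affected_range("237-3~bpo9+1") and in_affected_range("239-12~bpo9+1") are both true, while in_affected_range("240-4") is false, which matches the point that reverting to 237 would not help.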

Wrt. CVE-2018-16864, the approach in https://www.qualys.com/2019/01/09/system-down/system-down.txt for exploitation seems to rely on /var/log/journal/ and fsync(), i.e. having opted in for a persistent on-disk Journal, which we don't do in Tails (and even if we did, that would still be in RAM and I suspect the tight race they're exploiting would not work then). It's not clear whether that's the only way to exploit the bug. Qualys claims they've given up on this one.

Wrt. CVE-2018-16865, according to the Qualys report, successful exploitation requires an infoleak, which CVE-2018-16866 provides.

Wrt. CVE-2018-16866, first of all it has been "inadvertently fixed" in systemd v240. In earlier versions, to "read this out-of-bounds string", given we don't have persistent Journal storage, the attacker needs "a tty that we recorded to /var/run/utmp". We ship neither utempter nor gnome-pty-helper, and there's no local SSHd. Running logger -p emerg lalala does not print anything in GNOME Terminal. According to w(1) the only registered tty is tty1, where GDM is running, so I don't think the amnesia user can access it. There may be other methods to get a tty recorded to utmp though.
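
The w(1) check can be repeated programmatically. The sketch below is only an illustration with hypothetical helper names: it assumes the standard who(1) output layout, where the second column is the terminal line, and it runs who(1), which reads the utmp database. It says nothing about other ways a tty might get recorded.

```python
import subprocess


def registered_ttys(who_output):
    """Return the set of terminal names found in who(1)-style output."""
    ttys = set()
    for line in who_output.splitlines():
        fields = line.split()
        if len(fields) >= 2:      # columns: user, line (tty), login time, ...
            ttys.add(fields[1])
    return ttys


def current_registered_ttys():
    """Run who(1) and return the ttys currently recorded in utmp."""
    result = subprocess.run(["who"], capture_output=True, text=True, check=True)
    return registered_ttys(result.stdout)
```

If current_registered_ttys() returns only {"tty1"} on a running system, that agrees with the observation above that no other tty is registered.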

Our options seem to be:

  • A) Assume/hope that CVE-2018-16866 is not exploitable on Tails due to the lack of ways to get a tty registered in utmp.
  • B) Recompile the systemd we would ship if nothing else happens (239-12~bpo9+1) with the -fstack-clash-protection option (https://gcc.gnu.org/onlinedocs/gcc/Instrumentation-Options.html).
  • C) Backport the fixes to systemd/stretch-backports i.e. v239.
  • D) Upgrade to systemd 240-4.

Note that:

  • The patches have been backported to v239 there: https://github.com/systemd/systemd-stable/commits/v239-stable
  • If nothing goes wrong, 240-4 will migrate to testing the day before our freeze, so there's a chance it lands in stretch-backports and we get it anyway. That would be awfully bad timing. IMO, either we decide to upgrade to v240 and we start testing builds with it now, or we decide to stick to v239 regardless of what's in stretch-backports when we freeze.
  • We have another incentive to upgrade to v240: v239 causes #16184, which is fixed in v240. OTOH v240 introduced a bunch of regressions; most of them should be fixed in the -4 upload, but there might be more.

Discussion:

  • Option A feels risky. We lack the in-house expertise to build confidence in a conclusion about it.
  • If we decide to stick to v239, option B (compile-time hardening) seems much cheaper and a bit less risky than option C (backport fixes).
  • I find option D (upgrading to v240) very tempting because I'm worried that #16184 will cause all kinds of seemingly random user-facing regressions, which would cause breakage and user confusion, and increase the workload on our help desk and FT. Discovering this after 3.12~rc1 is out would leave us with very bad options, i.e. upgrade to v240 in 3.12 final (scary!), or try to backport to v239 the commit that hopefully fixes the bug (not sure it'll be feasible and will work at all).

So my plan is, to start with:

  1. Ask the Debian systemd maintainers what's their plan wrt. stretch-backports. E.g. if they plan to upload 239-12~bpo9+2 with the fixes backported, this would make option C a bit more tempting.
  2. Backport 240-4 for Stretch, build an ISO and run our test suite on it, to get a first rough idea of how much it fixes and/or breaks stuff.

#6 Updated by intrigeri 8 months ago

  • Feature Branch set to bugfix/16352-16184-systemd-v240+force-all-tests

intrigeri wrote:

So my plan is, to start with:

  1. Ask the Debian systemd maintainers what's their plan wrt. stretch-backports. E.g. if they plan to upload 239-12~bpo9+2 with the fixes backported, this would make option C a bit more tempting.

Done: https://alioth-lists.debian.net/pipermail/pkg-systemd-maintainers/2019-January/037902.html

  2. Backport 240-4 for Stretch, build an ISO and run our test suite on it, to get a first rough idea of how much it fixes and/or breaks stuff.

Prepared and built package locally, will push to https://salsa.debian.org/tails-team/systemd.

#7 Updated by intrigeri 8 months ago

Wrt. CVE-2018-16865, according to the Qualys report, successful exploitation requires an infoleak, which CVE-2018-16866 provides.

Actually, our sudo config for tails-debugging-info gives read access to the full Journal, so likely one can use it to read the out-of-bounds string, i.e. it provides exactly the infoleak needed to exploit CVE-2018-16865.

Our options seem to be:

  • A) Assume/hope that CVE-2018-16866 is not exploitable on Tails due to the lack of ways to get a tty registered in utmp.

If my reasoning above is correct: forget this one.

#8 Updated by intrigeri 8 months ago

Wrt. CVE-2018-16864, the approach in https://www.qualys.com/2019/01/09/system-down/system-down.txt for exploitation seems to rely on /var/log/journal/ and fsync(), i.e. having opted in for a persistent on-disk Journal, which we don't do in Tails (and even if we did, that would still be in RAM and I suspect the tight race they're exploiting would not work then). It's not clear whether that's the only way to exploit the bug. Qualys claims they've given up on this one.

jvoisin says "the trick is to have the thread scheduled while it's writing data. This could be done without fsync(), it would just take some time" and "I think that the race of CVE-2018-16864 might still be exploitable, it would only take more time; wouldn't it?".

All in all, he agrees with my "tl;dr: we really should fix these 3 vulns in 3.12" conclusion.

#9 Updated by intrigeri 8 months ago

  • Subject changed from Check what to do in Tails 3.12 wrt. CVE-2018-16864, CVE-2018-16865 and CVE-2018-16866 to Fix systemd vulnerabilities: CVE-2018-16864, CVE-2018-16865 and CVE-2018-16866
  • % Done changed from 0 to 10

#10 Updated by intrigeri 8 months ago

  • Blocked by Bug #16349: Stick to Tor 0.3.4 in Tails 3.12 added

#11 Updated by intrigeri 8 months ago

  • Blocked by Bug #16097: Memory erasure tests regression on the devel branch added

#12 Updated by intrigeri 8 months ago

Merged the branches for #16349, #16072, #16073 and #16097 to get more useful test suite results.

#13 Updated by intrigeri 8 months ago

  • Blocked by Bug #16073: Upgrade Linux to 4.19 added

#14 Updated by intrigeri 8 months ago

  • Blocked by Bug #16072: Enable protected_fifos and protected_regular added

#15 Updated by intrigeri 8 months ago

An ISO built from this branch and installed to a USB stick with Tails Installer boots fine on the 2 bare metal laptops I have handy. Both clean and emergency shutdown appear to work fine too on these machines, and features/erase_memory.feature and features/emergency_shutdown.feature pass here (I was particularly worried about regressions in this area due to the regressions we've seen when upgrading to v239, i.e. #16097).

#16 Updated by lamby 8 months ago

Okay, I think I've mostly caught up with the background of this. Whilst my gut is telling me that we should not push 240 out (at the very least, it appears to be a "jinxed" release!), given the fix for https://github.com/systemd/systemd/issues/9461 etc., and given that it passes the test suite and works on bare metal, I think that https://salsa.debian.org/tails-team/systemd/commit/fa1fbc6183dd4f2461e3daf65cc988a1c937ec4b is indeed the way to go. Thanks for such a good explanation btw.

#17 Updated by intrigeri 8 months ago

  • Assignee deleted (intrigeri)
  • % Done changed from 10 to 50
  • QA Check set to Ready for QA

Full test suite run locally passed except:

  • MAC spoofing failure handling: known to be fragile and as said on #10774, I strongly suspect it's a real bug in Tails (and not only in the test suite); I did 26671c6e2c6361a12d284f0e95cdc78ecce9c146 to make sure the network interfaces are disabled as expected and the real MAC address is not leaked; after that, the only step of these scenarios that fails is the one looking for the desktop notification, which is what #10774 was originally about => I'm now confident there's no regression brought by this branch here.
  • Additional Software: seems to be a local-only problem as this test passed on Jenkins

So, despite v240 being a "jinxed" release as lamby says, I'm now confident in upgrading, compared to our other options.

Same branch for #16184.

#18 Updated by lamby 8 months ago

  • Assignee set to lamby

Mon 14 14:06 < intrigeri> lamby: OK. So please take #16352 + #16184 (same branch). And anonym offered to do the other remaining one (#16261)

Taking.

#19 Updated by lamby 8 months ago

LGTM. Methodology:

  • Checked out bugfix/16352-16184-systemd-v240+force-all-tests at 26671c6e2c6361a12d284f0e95cdc78ecce9c146
  • Built; see attached tails-amd64-bugfix_16352-16184-systemd-v240+force-all-tests-3.12-20190114T1422Z-26671c6e2c.buildlog.xz.
  • Booted in qemu:

  • Confirmed we are running 240-4~bpo9+0tails1:

  • Shutdown (no issues)
  • Booted again, remembering to enable an Administrator Password in the Tails Greeter (!).
  • Restarted some services, eg:

  • sudo halt:

(Unrelated to review: I note that https://alioth-lists.debian.net/pipermail/pkg-systemd-maintainers/2019-January/thread.html#37902 has had replies.)

#20 Updated by lamby 8 months ago

  • Assignee changed from lamby to intrigeri

#21 Updated by intrigeri 8 months ago

  • Status changed from In Progress to Fix committed
  • % Done changed from 50 to 100

#22 Updated by intrigeri 8 months ago

  • Assignee deleted (intrigeri)
  • Type of work changed from Research to Code

Thanks!

#23 Updated by anonym 8 months ago

  • Status changed from Fix committed to Resolved
