Bug #16097

Memory erasure tests regression on the devel branch

Added by intrigeri 11 months ago. Updated 8 months ago.

Status:
Resolved
Priority:
Elevated
Assignee:
-
Category:
-
Target version:
Start date:
11/05/2018
Due date:
% Done:

100%

Feature Branch:
bugfix/16097-memory-erasure-on-shutdown
Type of work:
Code
Blueprint:
Starter:
Affected tool:

Description

The memory erasure tests have been failing on the devel branch for a few weeks.

Significant changes between these commits:

  • Linux was upgraded from 4.17.0-3 to 4.18.0-2 and, accordingly, aufs4-standalone was upgraded from 01543e47eae7653c7e9a35a7204301f8a0b3ca50 to bdda97c749604bb9ea3f19e0c1ffac9042e79f77. I doubt that matters: our stable branch also includes this change, and there the tests pass.
  • The debian APT snapshot was updated from 2018100901 to 2018101503, which includes at least the systemd 237-3~bpo9+1 → 239-7~bpo9+1 upgrade.

Sadly, the build artifacts for these 2 ISO build jobs and most of the relevant APT snapshots have already been garbage-collected. Still, comparing the .build-manifest of the last successful builds of stable and devel, and filtering out the packages that were upgraded in Debian after 20181015, the only potential culprit I see is the systemd 237-3~bpo9+1 → 239-7~bpo9+1 upgrade, which happened on 20181009 but strictly after the 2018100901 debian APT snapshot.
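
For whoever needs to redo this kind of comparison, here is a minimal sketch of the idea. It assumes each package list has been flattened to one "name version" pair per line; the real .build-manifest format is not shown in this ticket and may well differ. The function name changed_packages is hypothetical.

```shell
#!/bin/sh
# Print the packages whose version differs between two package lists.
# ASSUMPTION: each input file holds one "name version" pair per line.
# This is an illustrative format, not necessarily that of .build-manifest.
changed_packages() {
    awk '
        # First file: remember the version of every package.
        FILENAME == ARGV[1] { old[$1] = $2; next }
        # Second file: report packages present in both lists whose version changed.
        ($1 in old) && old[$1] != $2 {
            printf "%s %s -> %s\n", $1, old[$1], $2
        }
    ' "$1" "$2"
}
```

Called as changed_packages stable.list devel.list, this prints one line per package whose version changed. Packages present in only one of the lists are ignored, which is good enough for spotting an upgrade like the systemd one above.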


Related issues

Related to Tails - Bug #16100: Fix systemd CVE-2018-15687 (Rejected, 11/05/2018)
Blocks Tails - Feature #15507: Core work 2019Q1: Foundations Team (Resolved, 04/08/2018)
Blocks Tails - Bug #16312: Enabling persistence in Buster leads to issues at shutdown (Resolved, 01/06/2019)
Blocks Tails - Bug #16352: Fix systemd vulnerabilities: CVE-2018-16864, CVE-2018-16865 and CVE-2018-16866 (Resolved, 01/13/2019)

Associated revisions

Revision 634e5a6d (diff)
Added by intrigeri 8 months ago

Fix memory erasure on shutdown with systemd v239 (refs: #16097).

Remounting /run with the "exec" option in /lib/systemd/system-shutdown/tails
does not work anymore with systemd v239, while it worked at least until systemd
v237. I could not find out why by reading systemd's NEWS file.

So let's instead do this there:

- For clean shutdown: in a new, dedicated service, started immediately before
  final.target, which itself is a synchronization point that ensures this
  service is started before the transition to systemd-shutdown and in turn to
  the initramfs, where we finish the unmounting and other clean ups needed to
  erase the memory.
- For emergency shutdown: in the udev watchdog script, before calling the
  unclean shutdown code, which bypasses final.target and thus won't run
  tails-remount-run-exec.service. Too bad we have to duplicate this mount
  command but it seems that both instances will become unnecessary quickly
  enough, once systemd DTRT™. Another way would be to manually start
  tails-remount-run-exec.service from the udev watchdog script but I'm
  concerned it will be unreliable when the boot medium has been unplugged.

Revision 290620df (diff)
Added by intrigeri 8 months ago

Mount a dedicated tmpfs on /run/initramfs instead of trying to remount /run with the "exec" option (refs: #16097).

My previous approach, i.e. "let's remount /run with the exec option via a unit
file started as part of the shutdown procedure", worked just fine for clean
shutdown. But it does not work for emergency shutdown, i.e. when the boot medium
is physically removed: for some reason (possibly missing bits in the memlockd
configuration), this service is not started, and then systemd-shutdown won't
return to the initramfs because /run/initramfs/shutdown is not executable.

So let's instead disregard /run and extract the initramfs into a dedicated
tmpfs, that we mount on /run/initramfs (where systemd-shutdown will look for
it), and that we mount without the "noexec" option.

Also, remove manual calls to eject(1):

- They increase chances that the shutdown process breaks due to missing
  files locked in memory by memlockd.
- Their sole benefit is to ensure we physically eject the DVD. It's unclear if
  this code is still needed nowadays. Regardless, starting with Tails 3.12, the
  only supported use case for ISO and DVD is virtual machines, which are not
  targeted by the emergency shutdown feature, which is about removing the
  physical boot medium.
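
The dedicated tmpfs approach from revision 290620df can be sketched roughly as follows. This is not the code from the branch: the function names, the DRY_RUN switch and the mode=0755 option are assumptions made for illustration, and the extraction of the initramfs is left out because the commit message does not describe it.

```shell
#!/bin/sh
# Run a command, or only print it when DRY_RUN is set (handy without root).
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Mount a dedicated tmpfs where systemd-shutdown looks for the initramfs.
# "exec" is spelled out for clarity; what matters is the absence of "noexec".
# ASSUMPTION: mode=0755 and the default path argument are illustrative.
prepare_shutdown_initramfs() {
    target=${1:-/run/initramfs}
    run mkdir -p "$target"
    run mount -t tmpfs -o mode=0755,exec tmpfs "$target"
    # The initramfs would then be extracted into "$target" (not shown here),
    # so that "$target/shutdown" ends up on a filesystem that allows execution.
}
```

With DRY_RUN=1 set, prepare_shutdown_initramfs only prints the two commands it would run. The point of the design is that the executability of /run/initramfs/shutdown no longer depends on the mount options of /run, nor on a service being started during shutdown.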

Revision 7f101d16
Added by intrigeri 8 months ago

Merge branch 'bugfix/16097-memory-erasure-on-shutdown' into devel (Fix-committed: #16097)

History

#1 Updated by intrigeri 11 months ago

#2 Updated by intrigeri 11 months ago

Interestingly, most of "system memory erasure on shutdown" passes: only the one about the aufs RW branch fails.

https://github.com/systemd/systemd/issues/8221 might be relevant for emergency shutdown, although it's reported to happen with v237, which works fine for us.

Next steps:

  • look at the videos from Jenkins, maybe they'll give some hints
  • reproduce locally, perhaps with nosplash and making systemd log more (and to the console)
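
For the second step, a small helper like the one below could build the debugging kernel command line. systemd.log_level= and systemd.log_target= are documented systemd kernel command line options; the function name debug_cmdline is hypothetical, and dropping quiet and splash is an assumption about which existing options would hide console output.

```shell
#!/bin/sh
# Turn a kernel command line into one suited for debugging shutdown:
# drop options that hide console output, then ask systemd to log more,
# and to the console.
# ASSUMPTION: "quiet" and "splash" are the options worth dropping.
debug_cmdline() {
    out=
    for word in $1; do
        case $word in
            quiet|splash) ;;                      # skip: these hide messages
            *) out="${out:+$out }$word" ;;        # keep everything else
        esac
    done
    echo "${out:+$out }nosplash systemd.log_level=debug systemd.log_target=console"
}
```

For example, debug_cmdline 'boot=live quiet splash' keeps boot=live and appends nosplash plus the two systemd logging options.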

#3 Updated by intrigeri 11 months ago

  • Related to Bug #16100: Fix systemd CVE-2018-15687 added

#4 Updated by intrigeri 10 months ago

  • Related to Bug #16184: Intermittent test failures on the devel branch: fails to login "Failed to fully start up daemon: Permission denied" added

#5 Updated by intrigeri 10 months ago

  • Assignee set to intrigeri

#6 Updated by intrigeri 9 months ago

#7 Updated by intrigeri 9 months ago

#8 Updated by intrigeri 9 months ago

  • Blocks Bug #16312: Enabling persistence in Buster leads to issues at shutdown added

#9 Updated by intrigeri 8 months ago

  • Status changed from Confirmed to In Progress

/lib/systemd/system-shutdown/tails is run, but I see no trace of returning to the initramfs. This nicely explains the failure of the exact tests that rely on this mechanism (and "FindFailed: can not find MemoryWipeCompleted.png"), while the tests that verify that memory is erased on unmount and when processes are killed work just fine. Manually running /bin/mount -o remount,exec /run before halt fixes that, and doing the same in the test suite fixes "Scenario: Erasure of the aufs read-write branch on shutdown". But /lib/systemd/system-shutdown/tails should do that itself, so something is wrong.
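
To check this on a running system before halting, something along these lines may help. It is a sketch, not what the test suite does: the helper names are hypothetical, and only the /bin/mount -o remount,exec /run command comes from the observation above.

```shell
#!/bin/sh
# Succeed if the comma-separated mount option list $1 contains option $2.
has_mount_option() {
    case ",$1," in
        *",$2,"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Print the options of mount point $1, read from a /proc/mounts-style file
# ($2, defaulting to /proc/mounts). The last matching line wins, since later
# mounts shadow earlier ones.
mount_options() {
    awk -v mp="$1" '
        $2 == mp { opts = $4 }
        END { if (opts != "") print opts }
    ' "${2:-/proc/mounts}"
}

# Remount /run with "exec" only if it is currently "noexec" (needs root).
ensure_run_exec() {
    if has_mount_option "$(mount_options /run)" noexec; then
        /bin/mount -o remount,exec /run
    fi
}
```

Matching on ",noexec," rather than on the bare substring avoids confusing "exec" with "noexec".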

#10 Updated by intrigeri 8 months ago

  • Related to deleted (Bug #16184: Intermittent test failures on the devel branch: fails to login "Failed to fully start up daemon: Permission denied")

#11 Updated by intrigeri 8 months ago

  • % Done changed from 0 to 20
  • Feature Branch set to bugfix/16097-memory-erasure-on-shutdown
  • Type of work changed from Research to Code

#12 Updated by intrigeri 8 months ago

My branch fixes the regular ("clean") shutdown but emergency shutdown tests still fail locally. Will investigate further.

#13 Updated by intrigeri 8 months ago

  • Assignee changed from intrigeri to lamby
  • % Done changed from 20 to 50
  • QA Check set to Ready for QA

Fixed! All memory erasure tests pass locally. Please review. A build with my current proposal was just started on Jenkins. FTR: our review guidelines.

Note that I've left my first approach in the Git history. Its commit message explains why things are broken and need fixing. I was tempted to rewrite history and merge this explanation into the commit message that implements the approach that works, but this time I figured it would be useful to have a trace of something I've tried and that does not work, not particularly for posterity but rather for whoever would be tempted to try that in the future.

Thanks in advance :)

#15 Updated by intrigeri 8 months ago

  • Blocks Bug #16352: Fix systemd vulnerabilities: CVE-2018-16864, CVE-2018-16865 and CVE-2018-16866 added

#16 Updated by lamby 8 months ago

Methodology:

  • Checked out bugfix/16097-memory-erasure-on-shutdown branch.
  • Built tails-amd64-bugfix_16097-memory-erasure-on-shutdown-3.12-20190113T2209Z-4c051c657b.iso (see attached tails-amd64-bugfix_16097-memory-erasure-on-shutdown-3.12-20190113T2209Z-4c051c657b.buildlog.xz).
  • Booted in QEMU:
      • Logged in.
      • Shut down from the normal menu. Did not see any errors, delays or timeouts.
      • Booted again.
      • Shut down from the greeter. Did not see any errors, delays or timeouts.
  • Burnt to a USB stick.
  • Repeated the above on an X230.

#17 Updated by intrigeri 8 months ago

  • Status changed from In Progress to Fix committed
  • % Done changed from 50 to 100

#18 Updated by intrigeri 8 months ago

  • Assignee deleted (intrigeri)

#19 Updated by anonym 8 months ago

  • Status changed from Fix committed to Resolved
