Bug #15146

Make memory erasure feature compatible with overlayfs

Added by intrigeri over 2 years ago. Updated 1 day ago.

Status:
In Progress
Priority:
High
Assignee:
Category:
-
Target version:
Start date:
01/03/2018
Due date:
% Done:

0%

Feature Branch:
bugfix/15146-overlayfs-memory-erasure+force-all-tests
Type of work:
Code
Blueprint:
Starter:
Affected tool:

Description

The "Erasure of the overlayfs read-write branch on shutdown" test scenario fails. But interestingly, "Tails erases memory on DVD boot medium removal: overlayfs read-write branch" passes. Looks like live-boot behaves slightly differently on overlayfs than on aufs, which seems to explain why mountpoints are not visible and can't be unmounted and thus cleaned.

We need to update the design doc (wiki/src/contribute/design/memory_erasure.mdwn) to document the overlayfs implementation instead of the aufs one.

aufs.png (45.8 KB) intrigeri, 11/23/2019 05:45 PM

overlayfs.png (44.1 KB) intrigeri, 11/23/2019 05:45 PM

exposed-upper-dir-log-2.png (34.1 KB) intrigeri, 11/24/2019 06:42 AM

exposed-upper-dir-log-1.png (47 KB) intrigeri, 11/24/2019 06:42 AM

exposed-upper-dir-log-3.png (34.5 KB) intrigeri, 11/24/2019 06:42 AM

exposed-upper-dir-log-4.png (58.4 KB) intrigeri, 11/24/2019 06:42 AM

exposed-upper-dir-log-5.png (34.3 KB) intrigeri, 11/24/2019 06:42 AM

exposed-upper-dir-log-6.png (35.9 KB) intrigeri, 11/24/2019 06:42 AM

02_10_33_Tails_erases_memory_on_DVD_boot_medium_removal__overlayfs_read-write_branch.png (33.7 KB) intrigeri, 02/22/2020 06:54 AM

02_10_33_Tails_erases_memory_on_DVD_boot_medium_removal__overlayfs_read-write_branch.mkv (1.79 MB) intrigeri, 02/22/2020 06:55 AM

00_21_12_Tails_erases_memory_on_DVD_boot_medium_removal__overlayfs_read-write_branch.png (39.9 KB) intrigeri, 03/08/2020 09:10 AM

00_21_12_Tails_erases_memory_on_DVD_boot_medium_removal__overlayfs_read-write_branch.mkv (2.12 MB) intrigeri, 03/08/2020 09:10 AM

03_25_07_Tails_erases_memory_on_DVD_boot_medium_removal__overlayfs_read-write_branch.png (33.3 KB) intrigeri, 03/17/2020 01:01 PM

03_25_07_Tails_erases_memory_on_DVD_boot_medium_removal__overlayfs_read-write_branch.mkv (1.95 MB) intrigeri, 03/17/2020 01:02 PM


Related issues

Related to Tails - Feature #8415: Migrate from aufs to overlayfs Resolved 12/18/2014
Blocks Tails - Feature #16209: Core work: Foundations Team Confirmed

Associated revisions

Revision f3ec8583 (diff)
Added by intrigeri 4 months ago

Let live-boot expose its /live/overlay as /lib/live/mount/overlay (refs: #15146)

/live/overlay (in the context of the initramfs) is the tmpfs
where the read-write branch of our union rootfs lives.

With aufs, this call to umount failed, and then live-boot would run:

mount -o move /live/overlay /root/lib/live/mount/overlay

As a result, this tmpfs mount was visible outside of the initramfs,
and our initramfs-pre-shutdown-hook could unmount it on shutdown,
which ensured the data stored in there was cleaned from memory.

But with overlayfs, for some reason this call to umount succeeds, even though the
overlayfs upper layer (/live/overlay/rw) is stored in this filesystem, which
shows that this tmpfs is still mounted. As a result, this tmpfs is not
visible anymore, and cannot be unmounted on shutdown, so the data stored
in there remains in memory, available to cold-boot attackers.

Let's not unmount this tmpfs and go back to the same behavior we had
with aufs.

This will probably require bringing back some AppArmor-related automated
tests, that were removed on the #8415 branch precisely because live-boot
did not expose the overlay branch:

12404eb883d8b68ab07242734d9a14a3d07f91ba
c6541323cb70f7c2d1af77bb407bef0d3d3e554a
a822c25f13c9673ffb39fb623e72c4d2894b112e
4ee7e8d041a29d78e15a1bbca4b5dba1c1e09296

Revision 97117dc9 (diff)
Added by intrigeri 4 months ago

WIP: on shutdown, empty the tmpfs read-write branch of the overlayfs mounted on / (refs: #15146)

Do this when stopping systemd-update-utmp.service, which is one of the last
things that happen before systemd umounts the filesystems.

WIP because ideally this would live in a dedicated unit,
properly ordered to stop at about that time.

Revision 3cdeadfe (diff)
Added by intrigeri 4 months ago

Let live-boot expose its /live/overlay as /lib/live/mount/overlay (refs: #15146)

/live/overlay (in the context of the initramfs) is the tmpfs
where the read-write branch of our union rootfs lives.

With aufs, this call to umount failed, and then live-boot would run:

mount -o move /live/overlay /root/lib/live/mount/overlay

As a result, this tmpfs mount was visible outside of the initramfs,
and our initramfs-pre-shutdown-hook could unmount it on shutdown,
which ensured the data stored in there was cleaned from memory.

But with overlayfs, for some reason this call to umount succeeds, even though the
overlayfs upper layer (/live/overlay/rw) is stored in this filesystem, which
shows that this tmpfs is still mounted. As a result, this tmpfs is not
visible anymore, and cannot be unmounted on shutdown, so the data stored
in there remains in memory, available to cold-boot attackers.

Let's not unmount this tmpfs and go back to the same behavior we had
with aufs.

This will probably require bringing back some AppArmor-related automated
tests, that were removed on the #8415 branch precisely because live-boot
did not expose the overlay branch:

12404eb883d8b68ab07242734d9a14a3d07f91ba
c6541323cb70f7c2d1af77bb407bef0d3d3e554a
a822c25f13c9673ffb39fb623e72c4d2894b112e
4ee7e8d041a29d78e15a1bbca4b5dba1c1e09296

Revision cb5921e4 (diff)
Added by segfault 4 months ago

On shutdown, empty the tmpfs read-write branch of the overlayfs mounted on / (refs: #15146)

Revision d604c714 (diff)
Added by segfault 4 months ago

Enable tails-remove-overlayfs-dirs.service (refs: #15146)

Revision 547b1560 (diff)
Added by segfault 4 months ago

Fix DefaultDependencies=no missing in tails-remove-overlayfs-dirs.service (refs: #15146)

Revision 007edc20 (diff)
Added by segfault 4 months ago

Fix tails-remove-overlayfs-dirs.service (refs: #15146)

The service is not stopped without Conflicts=shutdown.target.

Revision 79a29435 (diff)
Added by segfault about 1 month ago

Don't try to move overlay mount dir in initramfs-pre-shutdown-hook (refs: #15146)

With overlayfs, /lib/live/mount/overlay (the upper dir) is successfully
unmounted before our initramfs-pre-shutdown-hook is executed, so we
don't have to move and unmount it manually anymore, and the respective
commands fail.

Revision 7150555f (diff)
Added by segfault about 1 month ago

Don't try to move overlay mount dir in initramfs-pre-shutdown-hook (refs: #15146)

With overlayfs, /lib/live/mount/overlay (the upper dir) is successfully
unmounted before our initramfs-pre-shutdown-hook is executed, so we
don't have to move and unmount it manually anymore, and the respective
commands fail.

Revision 27d6fb0d (diff)
Added by segfault about 1 month ago

Don't try to move overlay mount dir in initramfs-pre-shutdown-hook (refs: #15146)

With overlayfs, /lib/live/mount/overlay (the upper dir) is successfully
unmounted before our initramfs-pre-shutdown-hook is executed, so we
don't have to move and unmount it manually anymore, and the respective
commands fail.

Revision 02104893 (diff)
Added by segfault about 1 month ago

Don't try to move overlay mount dir in initramfs-pre-shutdown-hook (refs: #15146)

With overlayfs, /lib/live/mount/overlay (the upper dir) is successfully
unmounted before our initramfs-pre-shutdown-hook is executed, so we
don't have to move and unmount it manually anymore, and the respective
commands fail.

Revision 411969e7 (diff)
Added by segfault about 1 month ago

Don't try to move overlay mount dir in initramfs-pre-shutdown-hook (refs: #15146)

With overlayfs, /lib/live/mount/overlay (the upper dir) is successfully
unmounted before our initramfs-pre-shutdown-hook is executed, so we
don't have to move and unmount it manually anymore, and the respective
commands fail.

Revision 2efc965a (diff)
Added by segfault about 1 month ago

Don't try to move overlay mount dir in initramfs-pre-shutdown-hook (refs: #15146)

With overlayfs, /lib/live/mount/overlay (the upper dir) is successfully
unmounted before our initramfs-pre-shutdown-hook is executed, so we
don't have to move and unmount it manually anymore, and the respective
commands fail.

Revision da4bc78a (diff)
Added by intrigeri 20 days ago

Memory erasure design doc: update for aufs → overlayfs (refs: #15146)

History

#1 Updated by intrigeri over 2 years ago

  • Target version set to 2018

#2 Updated by intrigeri over 2 years ago

anonym, you committed to handle the parent ticket this year (and I'm supposed to be the reviewer) but right now I fancy playing a bit with this ticket. If it becomes less fun I'll reassign to you.

#3 Updated by intrigeri almost 2 years ago

  • Related to Bug #15477: Consider upgrading to live-boot 1:20180328+ added

#4 Updated by intrigeri almost 2 years ago

  • Related to deleted (Bug #15477: Consider upgrading to live-boot 1:20180328+)

#5 Updated by intrigeri almost 2 years ago

  • Blocked by Bug #15477: Consider upgrading to live-boot 1:20180328+ added

#6 Updated by intrigeri almost 2 years ago

I don't think it's worth spending time on this before #15477 is done: perhaps the revamping of how mountpoints are managed by live-boot will fix this problem, perhaps it'll make it harder to solve, but in any case it's better to do the overlayfs-specific work only once, after that other migration is done.

#7 Updated by intrigeri over 1 year ago

  • Target version changed from 2018 to Tails_3.11

#8 Updated by intrigeri over 1 year ago

#9 Updated by intrigeri over 1 year ago

  • Target version changed from Tails_3.11 to Tails_3.12

#10 Updated by intrigeri over 1 year ago

  • Target version changed from Tails_3.12 to Tails_3.13

#11 Updated by intrigeri over 1 year ago

#12 Updated by intrigeri over 1 year ago

#13 Updated by intrigeri about 1 year ago

  • Target version changed from Tails_3.13 to 2019

#14 Updated by intrigeri about 1 year ago

#15 Updated by intrigeri about 1 year ago

#16 Updated by intrigeri about 1 year ago

  • Assignee deleted (intrigeri)

#17 Updated by intrigeri 4 months ago

  • Description updated (diff)

#18 Updated by intrigeri 4 months ago

  • File aufs.png View added
  • File overlayfs.png View added
  • Status changed from Confirmed to In Progress
  • Assignee set to intrigeri
  • Feature Branch set to feature/8415-overlayfs+force-all-tests

#19 Updated by intrigeri 4 months ago

To test all this, I:

  • on the kernel command line, replace quiet with debug nosplash
  • run sudo touch /run/initramfs/tails_shutdown_debugging within Tails before shutting down

On the topic branch, the tmpfs where the read-write upper dir of the overlayfs lives is exposed as /lib/live/mount/overlay; it is successfully unmounted on shutdown before we switch back to the initramfs; same for the read-only branch (SquashFS). Then, the overlayfs itself is successfully unmounted in the initrd (initramfs-pre-shutdown-hook), so from my reading of the logs everything looks good, except that some actions we take in the initrd are not necessary anymore (as some things have already been unmounted) and fail. But still, the known pattern we write in "Scenario: Erasure of the overlayfs read-write branch on shutdown" is not cleaned up.

I'm not sure what to do at this point. Hunches:

  • It could be that some of the seemingly successful unmount operations are done in a lazy way, and the tmpfs we want to clean up is still seen as internally mounted by the kernel. It could be interesting to run lsof or similar at various points of the shutdown process.
  • It could be that deleting the content of the tmpfs is necessary to clean the memory (we did rm -rf /mnt/live/overlay/* for aufs, but it's a no-op here because the tmpfs has already been unmounted, so there's nothing visible to delete there anymore at that point). We could try to do that in a systemd unit that stops just before "Unmounting /lib/live/mount/overlay". I'll try that; see the sketch below.
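
For illustration, here is a rough sketch of the kind of unit described in the second hunch above. The unit name and the ExecStop= command match what later comments on this ticket mention (tails-remove-overlayfs-dirs.service, /bin/rm -rf on the rw and work directories, DefaultDependencies=no, Conflicts=shutdown.target); the ordering and install details are assumptions, not the exact unit Tails ships:

    [Unit]
    Description=Empty the read-write branch of the root overlayfs on shutdown
    # Without default dependencies, the unit must conflict with shutdown.target
    # explicitly, otherwise it is never stopped on shutdown.
    DefaultDependencies=no
    Conflicts=shutdown.target
    # Start after the overlay tmpfs is mounted, so that on shutdown this unit
    # is stopped (and ExecStop= runs) just before that tmpfs is unmounted.
    After=lib-live-mount-overlay.mount

    [Service]
    Type=oneshot
    RemainAfterExit=yes
    # Nothing to do at boot; the interesting part happens on stop.
    ExecStart=/bin/true
    ExecStop=/bin/rm -rf /lib/live/mount/overlay/rw /lib/live/mount/overlay/work

    [Install]
    WantedBy=sysinit.target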

#20 Updated by intrigeri 4 months ago

  • Assignee deleted (intrigeri)
  • Feature Branch changed from wip/bugfix/15146-overlayfs-memory-erasure to bugfix/15146-overlayfs-memory-erasure

Looks like I got a PoC fix that works! I'm not sure if I should polish this or switch to #15281.

#21 Updated by intrigeri 4 months ago

Also, if we go the way my PoC branch does, we need to bring back some tests, see f3ec8583001a9a90861eede18ac5f330a560cad2.

#22 Updated by intrigeri 4 months ago

  • Blocked by deleted (Bug #15477: Consider upgrading to live-boot 1:20180328+)

#23 Updated by intrigeri 4 months ago

  • Status changed from In Progress to Needs Validation
  • Assignee set to intrigeri

The affected scenario now passes on the topic branch! Next steps: check that the reintroduced AppArmor-related test steps still make sense in overlayfs-world, verify that they pass, and finally merge into #8415.

#24 Updated by intrigeri 4 months ago

  • Status changed from Needs Validation to In Progress
  • Assignee deleted (intrigeri)

Next steps: check that the reintroduced AppArmor-related test steps still make sense in overlayfs-world,

They needed adjustments on the test suite side. They also showed that our AppArmor configuration needed to be adapted to the changes brought by 3cdeadfeadc28d93aed5356c5780b97dac75dc19. I did both, let's see what Jenkins thinks.

Also, unrelated to AppArmor, lots of the commands we have in initramfs-pre-shutdown-hook are now obsolete and thus fail loudly. IMO we should clean up this script: otherwise, next time we debug a problem in this area, we may get confused by all the error messages unneeded and failing operations trigger ⇒ back to "In Progress".

#25 Updated by intrigeri 4 months ago

intrigeri wrote:

Next steps: check that the reintroduced AppArmor-related test steps still make sense in overlayfs-world,

They needed adjustments on the test suite side. They also showed that our AppArmor configuration needed to be adapted to the changes brought by 3cdeadfeadc28d93aed5356c5780b97dac75dc19. I did both, let's see what Jenkins thinks.

This now looks good on Jenkins so I've merged this branch into #8415, to make it easier to analyze test suite results there. But I'm leaving this ticket in progress as more work is needed on this front IMO:

Also, unrelated to AppArmor, lots of the commands we have in initramfs-pre-shutdown-hook are now obsolete and thus fail loudly. IMO we should clean up this script: otherwise, next time we debug a problem in this area, we may get confused by all the error messages unneeded and failing operations trigger ⇒ back to "In Progress".

#26 Updated by intrigeri 4 months ago

  • Feature Branch changed from bugfix/15146-overlayfs-memory-erasure to feature/8415-overlayfs+force-all-tests

#27 Updated by intrigeri 4 months ago

  • Target version changed from 2019 to Tails_4.5

#28 Updated by intrigeri 4 months ago

  • Priority changed from Normal to High

#29 Updated by segfault about 1 month ago

  • Assignee set to segfault

#30 Updated by intrigeri about 1 month ago

  • Description updated (diff)

#31 Updated by intrigeri about 1 month ago

I've seen the "Tails erases memory on DVD boot medium removal: overlayfs read-write branch" scenario fail on the feature/6560-secure-boot+force-all-tests branch:

9.078% of the free memory still has the pattern, but less than 0.800% was expected.
<false> is not true. (Test::Unit::AssertionFailedError)
./features/step_definitions/erase_memory.rb:181:in `/^I find very few patterns in the guest's memory$/'
features/emergency_shutdown.feature:24:in `Then I find very few patterns in the guest's memory'

Attaching video & screenshot.

Note that this scenario is marked as fragile due to #13462, which is a different problem: it's about FindFailed: can not find MemoryWipeCompleted.png, while in the case at hand, the "Happy dumping!" message is displayed and the failure very much looks like we did not erase all memory. Could it be that tails-remove-overlayfs-dirs.service sometimes can't do its job if the boot medium was removed?

#32 Updated by segfault about 1 month ago

intrigeri wrote:

Also, unrelated to AppArmor, lots of the commands we have in initramfs-pre-shutdown-hook are now obsolete and thus fail loudly. IMO we should clean up this script: otherwise, next time we debug a problem in this area, we may get confused by all the error messages unneeded and failing operations trigger ⇒ back to "In Progress".

To find out which parts of initramfs-pre-shutdown-hook are still needed with overlayfs, on a separate branch, I deleted everything except for the sleep for dumping. The memory erasure test passed:

https://jenkins.tails.boum.org/job/test_Tails_ISO_bugfix-15146-overlayfs-memory-erasure-no-pre-shutdown-hook/2/cucumber-html-reports/report-feature_3_2622904403.html

#33 Updated by segfault about 1 month ago

intrigeri wrote:

I've seen the "Tails erases memory on DVD boot medium removal: overlayfs read-write branch" scenario fail on the feature/6560-secure-boot+force-all-tests branch:

[...]

Attaching video & screenshot.

Note that this scenario is marked as fragile due to #13462, which is a different problem: it's about FindFailed: can not find MemoryWipeCompleted.png, while in the case at hand, the "Happy dumping!" message is displayed and the failure very much looks like we did not erase all memory.

This did not happen on Jenkins yet (in the 5 test suite runs of the branch).

Could it be that tails-remove-overlayfs-dirs.service sometimes can't do its job if the boot medium was removed?

Yes, I think that would be the case if, for the execution of the service, some file must be loaded from the filesystem which wasn't cached. Could be that systemd needs to load something. Could even be /bin/rm, which we execute in the script, but I doubt that, because /bin/rm is so ubiquitous that I would expect it to be cached.

IMO the right design would be to copy to a tmpfs all files needed for the memory erasure to work.

#34 Updated by segfault about 1 month ago

IMO the right design would be to copy to a tmpfs all files needed for the memory erasure to work.

Or we could just do it in the initramfs.

#35 Updated by intrigeri about 1 month ago

IMO the right design would be to copy to a tmpfs all files needed for the memory erasure to work.

Or we could just do it in the initramfs.

Indeed.

Note that we try to lock in memory (/etc/memlockd.cfg) all those files, so if the initramfs option does not work, it may be that the hard part is not to ensure the files are in memory (that's a mostly solved problem already), but rather to figure out which files are needed. Of course, the simpler the implementation is wrt. "listing the needed files", the easier it gets; the systemd-based one is not exactly simple in this respect.
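
As background for the /etc/memlockd.cfg mention above: memlockd locks into RAM every file listed (one per line) in that config file, and AFAIU a leading + marks an executable whose shared library dependencies should be locked as well. A purely illustrative excerpt of the kind of entries this involves (the exact file list is an assumption, not what Tails ships):

    # /etc/memlockd.cfg (illustrative excerpt)
    # Lock this executable and, because of the leading "+", the shared
    # libraries it is linked against:
    +/bin/rm
    # Plain entry: lock only this file itself:
    /lib/systemd/system/tails-remove-overlayfs-dirs.service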

#36 Updated by intrigeri about 1 month ago

segfault wrote:

intrigeri wrote:

I've seen the "Tails erases memory on DVD boot medium removal: overlayfs read-write branch" scenario fail on the feature/6560-secure-boot+force-all-tests branch: [...]

This did not happen on Jenkins yet (in the 5 test suite runs of the branch).

That's good to know. Note that my local Jenkins is much faster than lizard, so for example, when exposed to racy code, it tends to fail in different places/ways. I remember we did our best to make tails-remove-overlayfs-dirs.service non-racy, but who knows. I'm glad you're trying to eliminate it.

#37 Updated by intrigeri about 1 month ago

  • Target version changed from Tails_4.5 to Tails_4.4

(As per timeline discussion we had yesterday. This is about finishing the work, and possibly merging it into devel, not about merging it into stable yet: the current plan is to release this in 4.5, not 4.4.)

#38 Updated by segfault about 1 month ago

Or we could just do it in the initramfs.

Turns out we can't delete the tmpfs contents in the initramfs because, for some reason that's not clear to me, these tmpfs's are not mounted anymore when our initramfs-pre-shutdown-hook is executed. That can also be seen in the screenshot and video you posted above.

#39 Updated by segfault about 1 month ago

I'm now testing if we can remove the tmpfs contents in a script in /lib/systemd/system-shutdown/ instead of the systemd service. I pushed a commit for that to bugfix/15146-overlayfs-memory-erasure.

This did not happen on Jenkins yet (in the 5 test suite runs of the branch).

I still can't find a case where the memory erasure test failed on Jenkins on the #8415 branch, the #6560 branch, or bugfix/15146-overlayfs-memory-erasure. So if it still works in Jenkins with the shutdown script instead of the service, could you try to reproduce the issue on your local Jenkins with bugfix/15146-overlayfs-memory-erasure?

#40 Updated by intrigeri about 1 month ago

I still can't find a case where the memory erasure test failed on Jenkins on the #8415 branch, the #6560 branch, or bugfix/15146-overlayfs-memory-erasure. So if it still works in Jenkins with the shutdown script instead of the service, could you try to reproduce the issue on your local Jenkins with bugfix/15146-overlayfs-memory-erasure?

Sure, will do! I expect I'll be able to report results by the end of the week.

#41 Updated by intrigeri about 1 month ago

Hi segfault,

I still can't find a case where the memory erasure test failed on Jenkins on the #8415 branch, the #6560 branch, or bugfix/15146-overlayfs-memory-erasure.

Note that the latter (bugfix/15146-overlayfs-memory-erasure) does not run the test that I've reported as failing, because it's in features/emergency_shutdown.feature, which is tagged fragile.

So I'll add the +force-all-tests suffix to bugfix/15146-overlayfs-memory-erasure locally so my local Jenkins runs that test.

#42 Updated by intrigeri 29 days ago

I've run features/erase_memory.feature and features/emergency_shutdown.feature 3 times on my local Jenkins, including fragile tests, from the bugfix/15146-overlayfs-memory-erasure branch at 7edfb08488cfc35c2a0493b6fce984a4914ec0f7. Each of these 3 runs had exactly one failing scenario:

Scenario: Erasure of the overlayfs read-write branch on shutdown         # features/erase_memory.feature:64
Given I have started Tails from DVD without network and logged in      # features/step_definitions/snapshots.rb:170
And I prepare Tails for memory erasure tests                           # features/step_definitions/erase_memory.rb:60
When I fill a 128 MiB file with a known pattern on the root filesystem # features/step_definitions/erase_memory.rb:194
 # ensure the pattern is in memory due to tmpfs, not to disk cache
And I drop all kernel caches                                           # features/step_definitions/erase_memory.rb:211
Then patterns cover at least 128 MiB in the guest's memory             # features/step_definitions/erase_memory.rb:154
Pattern coverage: 100.000% (128 MiB out of 128 MiB reference memory)
When I trigger shutdown                                                # features/step_definitions/erase_memory.rb:215
And I wait 20 seconds                                                  # features/step_definitions/common_steps.rb:861
Slept for 20 seconds
Then I find very few patterns in the guest's memory                    # features/step_definitions/erase_memory.rb:178
Pattern coverage: 9.169% (128 MiB out of 1396 MiB reference memory)
9.169% of the free memory still has the pattern, but less than 0.800% was expected.
<false> is not true. (Test::Unit::AssertionFailedError)
./features/step_definitions/erase_memory.rb:181:in `/^I find very few patterns in the guest's memory$/'
features/erase_memory.feature:73:in `Then I find very few patterns in the guest's memory'

#43 Updated by segfault 29 days ago

  • Feature Branch changed from feature/8415-overlayfs+force-all-tests to bugfix/15146-overlayfs-memory-erasure+force-all-tests

Note that the latter (bugfix/15146-overlayfs-memory-erasure) does not run the test that I've reported as failing, because it's in features/emergency_shutdown.feature, which is tagged fragile.

Ack. I renamed the branch to bugfix/15146-overlayfs-memory-erasure+force-all-tests.

I've run features/erase_memory.feature and features/emergency_shutdown.feature 3 times on my local Jenkins, including fragile tests, from the bugfix/15146-overlayfs-memory-erasure branch at 7edfb08488cfc35c2a0493b6fce984a4914ec0f7. Each of these 3 runs had exactly one failing scenario:

The same scenario fails on (our shared) Jenkins. I suspect that's because, according to a comment in the file, config/chroot_local-includes/lib/systemd/system-shutdown/tails is not run "by the other instance of systemd-shutdown that's run (as /shutdown) after returning to the initramfs during shutdown". I don't really understand what that means though (I don't know anything about two instances of systemd-shutdown, and why deleting files on a filesystem in one of those instances wouldn't delete them for the other instance). Anyway, since it did work in most cases with tails-remove-overlayfs-dirs.service, I'm now trying to use that again, but also locking it in memory via /etc/memlockd.cfg.

#44 Updated by intrigeri 29 days ago

I suspect that's because, according to a comment in the file, config/chroot_local-includes/lib/systemd/system-shutdown/tails is not run " by the other instance of systemd-shutdown that's run (as /shutdown) after returning to the initramfs during shutdown". I don't really understand what that means though (I don't know anything about two instances of systemd-shutdown, and why deleting files on a filesystem in one of those instances wouldn't delete them for the other instance).

I see that https://tails.boum.org/contribute/design/memory_erasure/ is a bit lacking in this respect: it does not say what systemd itself does, hence your confusion, I think.

The missing info is in https://www.freedesktop.org/wiki/Software/systemd/InitrdInterface/, in the "If the executable /run/initramfs/shutdown exists systemd will use it to jump back into the initrd on shutdown" bullet point. For the details, when I (re-re-)implemented this feature, I had to read the systemd source code.

One important thing, IIRC, is that "jump back into the initrd" is done via the equivalent of chroot.

Anyway, since it did work in most cases with tails-remove-overlayfs-dirs.service, I'm now trying to use that again, but also locking it in memory via /etc/memlockd.cfg.

OK!

#45 Updated by segfault 29 days ago

Anyway, since it did work in most cases with tails-remove-overlayfs-dirs.service, I'm now trying to use that again, but also locking it in memory via /etc/memlockd.cfg.

OK!

The memory erasure tests passed on Jenkins, so it would be nice if you could test that branch on your local Jenkins.

#46 Updated by intrigeri 28 days ago

The memory erasure tests passed on Jenkins,

Good news :)

so it would be nice if you could test that branch on your local Jenkins.

Sure, I'm on it!

Nitpicking: my understanding is that using the + prefix for /lib/systemd/system/tails-remove-overlayfs-dirs.service in memlockd.cfg is not useful/necessary here; IMO it can be a little confusing, letting the reader believe that memlockd will be clever enough to lock all dependencies of the service in memory, which AIUI is not the case.

#47 Updated by segfault 28 days ago

Nitpicking: my understanding is that using the + prefix for /lib/systemd/system/tails-remove-overlayfs-dirs.service in memlockd.cfg is not useful/necessary here; IMO it can be a little confusing, letting the reader believe that memlockd will be clever enough to lock all dependencies of the service in memory, which AIUI is not the case.

Right. I pushed a fixup commit.

#48 Updated by intrigeri 28 days ago

so it would be nice if you could test that branch on your local Jenkins.

Sure, I'm on it!

I'll report more complete stress-testing results later, but I can tell you that I've already seen "Scenario: Tails erases memory on DVD boot medium removal: overlayfs read-write branch" fail here. I'm attaching the corresponding screenshot and video, in the hope it might help understand what's going on.

I now have a big doubt wrt. the tails-remove-overlayfs-dirs.service-based implementation: config/chroot_local-includes/usr/local/lib/udev-watchdog-wrapper runs systemctl --force poweroff, i.e. "shutdown of all running services is skipped". So, if I understand this correctly, we cannot count on the ExecStop= command of tails-remove-overlayfs-dirs.service being run during emergency shutdown. This would explain why, in some cases, this service does not do its job on shutdown. But this does not explain why the overlayfs read-write branch is cleaned up most of the time, so perhaps I'm totally confused.

Either way, I'm wondering: maybe it would be more robust to move the /bin/rm -rf /lib/live/mount/overlay/rw /lib/live/mount/overlay/work command from /lib/systemd/system/tails-remove-overlayfs-dirs.service to /lib/systemd/system-shutdown/tails? IIRC (without checking), executables under /lib/systemd/system-shutdown/ are run by systemd even when --force was passed. At the very least, this would simplify things and avoid the need to reason about the exact behavior of systemctl --force poweroff.
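
For reference, per systemd-shutdown(8), every executable in /lib/systemd/system-shutdown/ is run with a single argument naming the action ("poweroff", "reboot", "halt" or "kexec") right before the final power-off. A minimal sketch of what moving the rm there could look like (illustrative only; the actual config/chroot_local-includes/lib/systemd/system-shutdown/tails script does more than this):

    #!/bin/sh
    # Sketch of a /lib/systemd/system-shutdown/ hook; $1 is "poweroff",
    # "reboot", "halt" or "kexec".
    set -u
    # Empty the tmpfs backing the read-write branch of the root overlayfs,
    # so its content does not linger in RAM once the filesystems are gone.
    rm -rf /lib/live/mount/overlay/rw /lib/live/mount/overlay/work || true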

#49 Updated by intrigeri 28 days ago

I'll report more complete stress-testing results later

I've run features/emergency_shutdown.feature 5 times here and I saw the failure only once.

I see that when it fails, once we've gone back to the initramfs, this is still mounted:

udev on /oldroot/dev type devtmpfs […]
tmpfs on /oldroot/run type tmpfs […]
overlay on /oldroot type overlay (rw,noatime,lowerdir=//filesystem.squashfs/,upperdir=/live/overlay//rw,workdir=/live/overlay//work)

And then /oldroot cannot be unmounted: "Device or resource busy". I suspect that as long as the overlayfs is mounted:

  • The overlayfs kernel module has a copy of the upperdir/workdir data in memory, even if we rm -rf'd it on disk (I would not be surprised: AFAIK, manually fiddling with the on-disk upperdir/workdir data is not exactly supported :)
  • Or its backing upperdir and workdir are still around, even if /live/overlay was successfully (but perhaps lazily) umounted. In most cases this probably does not matter because we've rm -rf'ed their content, but when that fails — like here apparently — it does matter.

So I would see value in ensuring /oldroot is unmounted, either on top of, or instead of, trying to make /bin/rm -rf /lib/live/mount/overlay/rw /lib/live/mount/overlay/work more robust.

Could it be because of the /oldroot/dev and /oldroot/run mountpoints? The "Move /oldroot/* mountpoints out of the way" comment suggests that they might cause trouble. Note that sd-umount tried to do that itself earlier, but that fails with "Device or resource busy" as well. So I would suggest we mount --move them outside of /oldroot: if we can't unmount them, fine, but let's at least ensure that won't prevent us from unmounting /oldroot itself?
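
A sketch of the kind of change to initramfs-pre-shutdown-hook this suggests, reusing the same mount -o move idiom that appears in the live-boot commit message quoted above (the target directories and the error handling are assumptions):

    # Move the mountpoints that may keep /oldroot busy out of the way,
    # then try to unmount /oldroot itself.
    mkdir -p /old-dev /old-run
    mount -o move /oldroot/dev /old-dev || true
    mount -o move /oldroot/run /old-run || true
    umount /oldroot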

#50 Updated by segfault 28 days ago

intrigeri wrote:

so it would be nice if you could test that branch on your local Jenkins.

Sure, I'm on it!

I'll report more complete stress-testing results later, but I can tell you that I've already seen "Scenario: Tails erases memory on DVD boot medium removal: overlayfs read-write branch" fail here. I'm attaching the corresponding screenshot and video, in the hope it might help understand what's going on.

I now have a big doubt wrt. the tails-remove-overlayfs-dirs.service -based implementation: config/chroot_local-includes/usr/local/lib/udev-watchdog-wrapper runs systemctl --force poweroff, i.e. "shutdown of all running services is skipped". So, if I understand this correctly, we cannot count on the ExecStop= command of tails-remove-overlayfs-dirs.service being run during emergency shutdown. This would explain why, in some cases, this service does not do its job on shutdown. But this does not explain why the overlayfs read-write branch is cleaned up most of the time, so perhaps I'm totally confused.

Yes, that confuses me too, because without this service, erase_memory.feature fails every time (see https://jenkins.tails.boum.org/job/test_Tails_ISO_bugfix-15146-overlayfs-memory-erasure-no-ovlerlayfs-dirs-removal/). But if systemctl --force poweroff skips ExecStop commands, the service shouldn't do anything, so emergency_shutdown.feature should also fail every time.

Either way, I'm wondering: maybe it would be more robust to move the /bin/rm -rf /lib/live/mount/overlay/rw /lib/live/mount/overlay/work command from /lib/systemd/system/tails-remove-overlayfs-dirs.service to /lib/systemd/system-shutdown/tails? IIRC (without checking), executables under /lib/systemd/system-shutdown/ are run by systemd even when --force was passed. At the very least, this would simplify things and avoid the need to reason about the exact behavior of systemctl --force poweroff.

That is exactly what I tried in 7edfb08488cfc35c2a0493b6fce984a4914ec0f7, which caused the emergency shutdown test to fail every time, see #15146#note-42 and following.

I've run features/emergency_shutdown.feature 5 times here and I saw the failure only once.

I see that when it fails, once we've gone back to the initramfs, this is still mounted:

[...]

And then /oldroot cannot be unmounted: "Device or resource busy". I suspect that as long as the overlayfs is mounted:

  • The overlayfs kernel module has a copy of the upperdir/workdir data in memory, even if we rm -rf'd it on disk (I would not be surprised: AFAIK, manually fiddling with the on-disk upperdir/workdir data is not exactly supported :)
  • Or its backing upperdir and workdir are still around, even if /live/overlay was successfully (but perhaps lazily) umounted. In most cases this probably does not matter because we've rm -rf'ed their content, but when that fails — like here apparently — it does matter.

So I would see value in ensuring /oldroot is unmounted, either on top of, or instead of, trying to make /bin/rm -rf /lib/live/mount/overlay/rw /lib/live/mount/overlay/work more robust.

Could it be because of the /oldroot/dev and /oldroot/run mountpoints? The "Move /oldroot/* mountpoints out of the way" comment suggests that they might cause trouble. Note that sd-umount tried to do that itself earlier, but that fails with "Device or resource busy" as well. So I would suggest we mount --move them outside of /oldroot: if we can't unmount them, fine, but let's at least ensure that won't prevent us from unmounting /oldroot itself?

It's worth a try. I pushed a commit.

#51 Updated by intrigeri 26 days ago

Hi,

Could it be because of the /oldroot/dev and /oldroot/run mountpoints? The "Move /oldroot/* mountpoints out of the way" comment suggests that they might cause trouble. Note that sd-umount tried to do that itself earlier, but that fails with "Device or resource busy" as well. So I would suggest we mount --move them outside of /oldroot: if we can't unmount them, fine, but let's at least ensure that won't prevent us from unmounting /oldroot itself?

It's worth a try. I pushed a commit.

With this commit, I've run the relevant test 13 times on the affected machine and never saw it fail. For good measure, these 13 times include a few runs of the entire features/erase_memory.feature and features/emergency_shutdown.feature. During most of these runs, this was the only thing the machine was doing; for 2 runs I was running another test suite instance on another VM on the same hardware, to broaden the span of conditions I was exercising the thing in.

So either the race condition is still there but the odds of losing it are super low, or 450244d94129f445990390911d54ba94223decd8 did the trick. I'm personally satisfied to leave it at that.

If that passes on Jenkins too, I think we're good here (unless there was anything left, like design doc update? I don't remember).

#52 Updated by CyrilBrulebois 24 days ago

  • Target version changed from Tails_4.4 to Tails_4.5

#53 Updated by intrigeri 20 days ago

  • Assignee changed from segfault to intrigeri

I've:

  • made the branch build again on Jenkins, by merging feature/8415-overlayfs+force-all-tests into it
  • updated the design doc

If the relevant tests pass on Jenkins, I'll merge this into feature/8415-overlayfs+force-all-tests and will close this issue as resolved. Then, this can be reviewed as part of the #8415 + #6560 batch.

#54 Updated by intrigeri 19 days ago

intrigeri wrote:

If the relevant tests pass on Jenkins […]

Unfortunately, in 3 runs on Jenkins, "Scenario Tails erases memory on DVD boot medium removal: overlayfs read-write branch" failed once: https://jenkins.tails.boum.org/view/RM/job/test_Tails_ISO_bugfix-15146-overlayfs-memory-erasure-force-all-tests/5/cucumber-html-reports/report-feature_8_3124862301.html.
I'm attaching the relevant build artifacts, in the hope it helps debugging. I see that /oldroot could not be unmounted.

At this point I'm not sure we should block on this for 4.5~rc1, or even for 4.5. @segfault, what do you think?

#55 Updated by intrigeri 18 days ago

intrigeri wrote:

At this point I'm not sure we should block on this for 4.5~rc1, or even for 4.5. @segfault, what do you think?

I've thought a little bit about this. I'm quite torn because:

  • Our "erase memory on shutdown" feature has historically been pretty fragile. AFAICT, the only robust version of it is the current one (with aufs), to which we switched in Tails 3.0. We've never been super clear with users about whether this was a "works 90+ % of the time" thing or a "works 100% of the time, guaranteed" thing. So one may argue that going back to a somewhat fragile status in 4.5 is basically going back to business as usual, from a long-term perspective (and possibly we could adjust a bit our claims about this feature, if we don't manage to fix the overlayfs-based implementation really soon).
  • By the nature of this security feature, depending on whether it's 100% reliable or not, informed users may make different security decisions. Given it's been reliable for almost 3 years, one may argue that any regression will put some users at risk.
  • I'm not aware of any real-world situation in which this feature has made a practical difference. Granted, people who benefited from it may not want to tell us about it, but still, we've had this feature for many years now.

I'm now leaning towards "accept the regression in 4.5~rc1, don't block on this to merge #8415 and friends, then treat it as high priority, and set a deadline in N months when we'll adjust our doc if it's not fixed yet".

#56 Updated by intrigeri 17 days ago

  • Status changed from In Progress to Needs Validation
  • Assignee changed from intrigeri to anonym

@anonym, I'd like your opinion too on the last 2 comments.

In passing, note that systemd v245 improves a bit the "unmount stuff on shutdown" code, so there's a little bit of hope that upgrading systemd will help.

#57 Updated by intrigeri 17 days ago

  • Feature Branch changed from bugfix/15146-overlayfs-memory-erasure+force-all-tests to feature/8415-overlayfs+force-all-tests

I've merged the current implementation into feature/8415-overlayfs+force-all-tests and in turn into feature/6560-secure-boot+force-all-tests, because that's the best we have at the moment.

#58 Updated by intrigeri 16 days ago

  • Feature Branch changed from feature/8415-overlayfs+force-all-tests to feature/6560-secure-boot+force-all-tests, https://salsa.debian.org/tails-team/tails/-/merge_requests/44/

#59 Updated by anonym 16 days ago

intrigeri wrote:

intrigeri wrote:

If the relevant tests pass on Jenkins […]

Unfortunately, in 3 runs on Jenkins, "Scenario Tails erases memory on DVD boot medium removal: overlayfs read-write branch" failed once: https://jenkins.tails.boum.org/view/RM/job/test_Tails_ISO_bugfix-15146-overlayfs-memory-erasure-force-all-tests/5/cucumber-html-reports/report-feature_8_3124862301.html.

Since the branch was merged, these jobs have now been deleted. Oh well, seems the screenshot has everything needed.

I'm attaching the relevant build artifacts, in the hope it helps debugging. I see that /oldroot could not be unmounted.

Oof, this looks like an ugly one! Some ideas:

  • Normally when mount fails you get the useful info in the journal, so perhaps you can add its tail to the debug logging to get more info about the problem?
  • I'm not at all sure about this one, but IIRC pivot_root can be a more effective way to change the root when it comes to unmounting problematic mountpoints.
  • Since we are in the initramfs, I guess we are using busybox's mount? The error message seems to confirm this (I can find the string "mounting %s on %s failed" in busybox, but not in util-linux). Perhaps it is buggy, as (in my experience) busybox alternatives sometimes are. What if we would use the real mount?
  • What about filling the tmpfs with zeros if we detect this error? Could be a cheap (?) workaround until we find something better.

intrigeri wrote:

At this point I'm not sure we should block on this for 4.5~rc1, or even for 4.5. segfault, what do you think?

I've thought a little bit about this. I'm quite torn because:

  • Our "erase memory on shutdown" feature has historically been pretty fragile. AFAICT, the only robust version of it is the current one (with aufs), to which we switched in Tails 3.0. We've never been super clear with users about whether this was a "works 90+ % of the time" thing or a "works 100% of the time, guaranteed" thing. So one may argue that going back to a somewhat fragile status in 4.5 is basically going back to business as usual, from a long-term perspective (and possibly we could adjust a bit our claims about this feature, if we don't manage to fix the overlayfs-based implementation really soon).
  • By the nature of this security feature, depending on whether it's 100% reliable or not, informed users may make different security decisions. Given it's been reliable for almost 3 years, one may argue that any regression will put some users at risk.
  • I'm not aware of any real-world situation in which this feature has made a practical difference. Granted, people who benefited from it may not want to tell us about it, but still, we've had this feature for many years now.

I'm now leaning towards "accept the regression in 4.5~rc1, don't block on this to merge #8415 and friends, then treat it as high priority, and set a deadline in N months when we'll adjust our doc if it's not fixed yet".

I 100% agree with this. Also, this problem looks solvable to me, so I still have pretty good feelings about this sub-optimal move.

#60 Updated by anonym 16 days ago

  • Status changed from Needs Validation to In Progress
  • Assignee changed from anonym to intrigeri

#61 Updated by intrigeri 16 days ago

Hi,

anonym wrote:

Oof, this looks like an ugly one! Some ideas:

  • Normally when mount fails you get the useful info in the journal, so perhaps you can add its tail to the debug logging to get more info about the problem?

I'm not sure if, nor how, it could work, because:

  • At this stage of the emergency shutdown process, I doubt our test suite will manage to talk to the remote shell so the best we can hope for is to convince our shutdown scripts/hooks to do so.
  • At this late stage of the shutdown process, IIRC systemd has stopped journald and has redirected all output to the console.
  • AFAIK the extra info (apart from stdout/stderr that we already have) we can hope for is kernel errors, but we have them already in the video.

I'd love to be wrong.

IIRC, adding debug to the kernel command line makes the whole "return to the initramfs and shutdown" systemd thing much more verbose and IIRC the extra info is visible on the console. That would probably be my next step to debug this further.

  • I'm not at all sure about this one, but IIRC pivot_root can be a more effective way to change the root when it comes to unmounting problematic mountpoints.

Interesting! I think current systemd's switch_root() tries to do that when returning to the initrd, but falls back to overmounting root. I did not check what the version from Buster does. It could be worth looking deeper into this.

  • Since we are in the initramfs, I guess we are using busybox's mount? The error message seems to confirm this (I can find the string "mounting %s on %s failed" in busybox, but not in util-linux). Perhaps it is buggy, as (in my experience) busybox alternatives sometimes are. What if we would use the real mount?

Worth trying :)

  • What about filling the tmpfs with zeros if we detect this error? Could be a cheap (?) workaround until we find something better.

Do we have access to the relevant tmpfs at this stage? It seems to me that we don't.

Finally, my own hunch wrt. this occasional failure is: it's a race related to how the kernel lazily unmounts stuff vs. when/how systemd switches to the initramfs. Probably incorrect but oh well.

I'm now leaning towards "accept the regression in 4.5~rc1, don't block on this to merge #8415 and friends, then treat it as high priority, and set a deadline in N months when we'll adjust our doc if it's not fixed yet".

I 100% agree with this. Also, this problem looks solvable to me, so I still have pretty good feelings about this sub-optimal move.

OK, I'll unparent this ticket!

#62 Updated by intrigeri 16 days ago

  • Assignee deleted (intrigeri)
  • Parent task deleted (#8415)

#63 Updated by intrigeri 16 days ago

#64 Updated by intrigeri 10 days ago

Hi anonym & segfault,

anonym wrote:

intrigeri wrote:

I'm now leaning towards "accept the regression in 4.5~rc1, don't block on this to merge #8415 and friends, then treat it as high priority, and set a deadline in N months when we'll adjust our doc if it's not fixed yet".

I 100% agree with this. Also, this problem looks solvable to me, so I still have pretty good feelings about this sub-optimal move.

Does one of you have the spoons + availability to work on this in time for 4.5? I think that means having something merged by April 5.

#65 Updated by segfault 7 days ago

  • Feature Branch changed from feature/6560-secure-boot+force-all-tests, https://salsa.debian.org/tails-team/tails/-/merge_requests/44/ to bugfix/15146-overlayfs-memory-erasure+force-all-tests

I recreated the old feature branch and pushed two commits to test whether they fix the problem:

  • Added a sync at the beginning of the script, in the hope that it waits until lazy unmounts are finished
  • Used /bin/mount as proposed by anonym

Now waiting for Jenkins test results

#66 Updated by segfault 2 days ago

  • Status changed from In Progress to Needs Validation
  • Assignee set to intrigeri

The relevant tests haven't failed in the 8 test runs we have so far. That doesn't mean there is no race anymore, but maybe we should just merge it in time for 4.5?

#67 Updated by intrigeri 1 day ago

Hi,

The relevant tests haven't failed in the 8 test runs we have until now. Doesn't mean that there is no race anymore but maybe we should just merge it in time for 4.5?

Thanks!
Looking at the code, I tend to agree: I can't imagine how these changes could make things worse.
I'll stress-test this a bit today on my local Jenkins, which exposed the problem more readily, and I'll merge unless I spot a regression here!

#68 Updated by intrigeri 1 day ago

Note: I've set the base branch to testing and merged current testing into this branch, to ensure we're actually testing what would land into 4.5 if we merge this branch.

#69 Updated by intrigeri 1 day ago

As our baseline: among 13 runs on the testing branch, this failure happened twice.
FWIW, it has not failed since March 28.

#70 Updated by intrigeri 1 day ago

  • Status changed from Needs Validation to In Progress
  • Assignee changed from intrigeri to segfault

I've run the relevant scenario 10-15 times on each of lizard and my local Jenkins, and unfortunately, it failed 10-15% of the time on both systems ⇒ no measurable improvement AFAICT.

BTW, in the initramfs environment, are /bin/{mount,umount} really the "real" ones? I'm wondering if they could be symlinks to the busybox executable, in which case 228469faca4d2d1847623fa6d2dc6a6f9dada4cd would be a no-op.

Finally, about that same commit: why the addition to memlockd.cfg? The /bin/{mount,umount} used in the script are those from the initramfs, not those from the root filesystem, so I don't understand how it's related to memlockd.
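
One way to check whether these are busybox applets is to look at the initramfs content, for example by unpacking the initrd with unmkinitramfs from initramfs-tools (the initrd path below is just a placeholder):

    unmkinitramfs /path/to/initrd.img initrd-content
    find initrd-content \( -name mount -o -name umount \) -exec ls -l {} +
    # Symlinks pointing at busybox would confirm the suspicion above.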
