Bug #10733

Run our initramfs memory erasure hook earlier

Added by intrigeri over 3 years ago. Updated almost 3 years ago.

Status: Resolved
Priority: Normal
Assignee: -
Category: -
Target version:
Start date: 12/09/2015
Due date:
% Done: 100%
Feature Branch: bugfix/10733-run-memory-erasure-hook-earlier
Type of work: Research
Blueprint:
Starter:
Affected tool:

Description

We currently run it in init-premount, which IIRC is after e.g. udev is started (init-top). We could perhaps run it earlier, which might have several advantages:

  • save a few seconds on shutdown (it might matter especially for the emergency one)
  • work in a less heavily multitasking / event-driven environment, for more robust operation

Related issues

Related to Tails - Bug #10487: Improve VM and OOM settings for erasing memory (Resolved, 11/05/2015)
Related to Tails - Bug #11786: System often crashes during/after memory wipe since Linux 4.6 (Duplicate, 09/08/2016)
Blocks Tails - Bug #11588: Sometimes fails to boot from USB on Jenkins with I/O errors (Resolved, 07/22/2016)
Blocks Tails - Bug #9707: Jessie: System sometimes does not poweroff after memory erasure (Rejected, 07/08/2015)

Associated revisions

Revision 185f5387 (diff)
Added by intrigeri about 3 years ago

Run our initramfs memory erasure hook earlier.

The goal here is to:

  • save a few seconds on shutdown (it might matter especially for the
    emergency one);
  • work in a less heavily multitasking / event-driven environment, for
    more robust operation.

refs: #10733

Revision 71ac15e8 (diff)
Added by intrigeri about 3 years ago

Add memory_wipe to the prereqs of all init-top initramfs scripts that are not ours.

This is the only way to have guarantees as to the order in which those
scripts are executed. The practical effect is that when erasing memory,
our memory_wipe script will be run first, and since it shuts down the
system in the end, none of the other init-top scripts will be executed.

refs: #10733

Revision c6eb7f6e (diff)
Added by intrigeri about 3 years ago

Update design doc.

refs: #10733

Revision bd961bd8
Added by anonym almost 3 years ago

Merge remote-tracking branch 'origin/feature/from-intrigeri-for-2.6' into devel

Fix-committed: #5650, #6729, #6850, #8485, #10190, #10298, #10733, #10733, #11281, #11588, #11582, #11590

Revision 7f88af1d (diff)
Added by anonym almost 3 years ago

Revert "Run our initramfs memory erasure hook earlier."

This reverts commits 185f53877ba90b621dbade4aebc4903e07e6ea82 and
71ac15e8a9d84bbb66260530bbdf177fa8addffc.

Since we introduced this in Tails 2.6~rc1 we see a lot more issues with
memory wiping both in our automated tests (locally, and on Jenkins) and
among actual users.

Refs: #10733, #11786

History

#1 Updated by intrigeri over 3 years ago

  • Related to Bug #9707: Jessie: System sometimes does not poweroff after memory erasure added

#2 Updated by intrigeri over 3 years ago

  • Related to Bug #10487: Improve VM and OOM settings for erasing memory added

#3 Updated by intrigeri about 3 years ago

  • Target version changed from Tails_2.4 to Tails_2.5

#4 Updated by intrigeri about 3 years ago

  • Target version changed from Tails_2.5 to Tails_2.6

(Not suitable for a point-release.)

#5 Updated by intrigeri about 3 years ago

  • Subject changed from Condider running our initramfs memory erasure hook earlier to Consider running our initramfs memory erasure hook earlier

#6 Updated by intrigeri about 3 years ago

So, indeed the right place to do that would be init-top, and then we would need to add our memory_wipe script to prereqs in all other init-top scripts we don't want to run, since that's the only way to have a guarantee as to the order in which those scripts are executed.
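The ordering argument above can be sketched as a small dependency model. This is only an illustration in Python, not how initramfs-tools itself orders scripts: the script names other than memory_wipe and the helper names (`add_prereq`, `run_order`) are hypothetical, and the standard library's `graphlib` stands in for the real prereqs resolution.

```python
from graphlib import TopologicalSorter


def add_prereq(prereqs, first):
    """Return a copy of the prereqs map where every script except `first`
    lists `first` among its prerequisites."""
    return {
        name: set(deps) if name == first else set(deps) | {first}
        for name, deps in prereqs.items()
    }


def run_order(prereqs):
    """Return one execution order that respects all declared prereqs."""
    # TopologicalSorter expects {node: set_of_nodes_that_must_come_first}.
    return list(TopologicalSorter(prereqs).static_order())


# Hypothetical init-top scripts with no declared prereqs: in this model
# nothing constrains where memory_wipe lands relative to the others.
init_top = {"udev": set(), "keymap": set(), "memory_wipe": set()}

# Once memory_wipe is a prereq of every other script, it is the only
# script without prerequisites, so any valid order must start with it.
constrained = add_prereq(init_top, "memory_wipe")
order = run_order(constrained)
print(order[0])  # prints "memory_wipe"
```

In this model the relative order of the remaining scripts is still unconstrained, which does not matter here since memory_wipe shuts the system down before they would run.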

#7 Updated by intrigeri about 3 years ago

  • Tracker changed from Feature to Bug

#8 Updated by intrigeri about 3 years ago

  • Status changed from Confirmed to In Progress
  • % Done changed from 0 to 10
  • Feature Branch set to bugfix/10733-run-memory-erasure-hook-earlier

#9 Updated by intrigeri about 3 years ago

Tested locally, at least it doesn't make things any worse: memory erasure works fine.

I've merged the branch for #10776 (that drops the fragile tag on the corresponding tests) into this one, so that Jenkins gives us some data points.

#10 Updated by intrigeri almost 3 years ago

  • Related to Bug #10776: Step "I shutdown and wait for Tails to finish wiping the memory" fails when memory wiping causes a freeze added

#11 Updated by intrigeri almost 3 years ago

  • Assignee changed from intrigeri to anonym
  • % Done changed from 10 to 50
  • QA Check set to Ready for QA

Automated tests pass on Jenkins and locally, so please review & merge. I guess we'll want to revert fc194b0dcbdf5a1f35763db22fcd5e78d773b6b5 after merging, since we have reasons (at least #10776) to believe that these scenarios will still be fragile on Jenkins, even with this branch merged.

#12 Updated by intrigeri almost 3 years ago

Good news! A virtual hardware change introduced in test/11588-usb-on-jenkins (499c630) makes memory erasure very fragile on Jenkins, and merging the branch for #10733 into it fixes that (https://jenkins.tails.boum.org/view/Tails_ISO/job/test_Tails_ISO_test-11588-usb-on-jenkins-10733/ starting from build 4). So indeed, now we're sure that this branch is not merely guesswork: there are cases in which it really helps :)

#13 Updated by intrigeri almost 3 years ago

  • Blocks Bug #11588: Sometimes fails to boot from USB on Jenkins with I/O errors added

#14 Updated by intrigeri almost 3 years ago

  • Subject changed from Consider running our initramfs memory erasure hook earlier to Run our initramfs memory erasure hook earlier

#15 Updated by intrigeri almost 3 years ago

  • Related to deleted (Bug #9707: Jessie: System sometimes does not poweroff after memory erasure)

#16 Updated by intrigeri almost 3 years ago

  • Blocks Bug #9707: Jessie: System sometimes does not poweroff after memory erasure added

#17 Updated by intrigeri almost 3 years ago

  • Related to deleted (Bug #10776: Step "I shutdown and wait for Tails to finish wiping the memory" fails when memory wiping causes a freeze)

#18 Updated by intrigeri almost 3 years ago

  • Blocks Bug #10776: Step "I shutdown and wait for Tails to finish wiping the memory" fails when memory wiping causes a freeze added

#19 Updated by intrigeri almost 3 years ago

I'd like to ease reviewing for the 2.6 RM, and to get automated tests running on the combination of all these changes ASAP in the 2.6 dev cycle. So, I've merged this work, along with the other major branches I'm proposing for 2.6, into the feature/from-intrigeri-for-2.6 integration branch (Jenkins builds and tests).

#20 Updated by anonym almost 3 years ago

  • Status changed from In Progress to Fix committed
  • Assignee deleted (anonym)
  • % Done changed from 50 to 100
  • QA Check changed from Ready for QA to Pass

#21 Updated by anonym almost 3 years ago

  • Assignee set to intrigeri
  • % Done changed from 100 to 80
  • QA Check changed from Pass to Dev Needed

Reopening because I was hit by #11730 several times when I ran the full automated test suite for Tails 2.6~rc1.

Intrigeri, do you have any hope of debugging and fixing this in time for the final 2.6 release? Otherwise, let's revert.

#22 Updated by intrigeri almost 3 years ago

  • Status changed from Fix committed to In Progress
  • Assignee changed from intrigeri to anonym

anonym wrote:

Intrigeri, do you have any hope of debugging and fixing this in time for the final 2.6 release? Otherwise, let's revert.

I doubt I'll have the time to debug & fix it before the release, so please reassign to me for 2.8 if you decide to revert this change.

At this point, I don't know if this branch fixes things in more cases than it breaks them: we have seen cases in which it made things measurably better, and we have seen cases in which it may introduce regressions (although it's unclear if this branch introduced the regression; e.g. the kernel update would be another plausible cause). All of this is about virtual machines, and we have no data about bare metal. So I personally find it hard to decide whether reverting is the way to go, but I totally trust you to check that it indeed improves things => your call, of course!

#23 Updated by intrigeri almost 3 years ago

  • Related to Bug #11786: System often crashes during/after memory wipe since Linux 4.6 added

#24 Updated by intrigeri almost 3 years ago

Also see #11786.

#25 Updated by sajolida almost 3 years ago

The release notes snippet I wrote for this (which might get reverted) is:

   - Set up the trigger for RAM erasure on shutdown earlier in the boot
     process. This should speed up shutdown and make RAM erasure more robust.

(I'll try to reuse it when this gets merged.)

#26 Updated by anonym almost 3 years ago

So I built an image where I reverted 185f53877ba90b621dbade4aebc4903e07e6ea82 and 71ac15e8a9d84bbb66260530bbdf177fa8addffc (i.e. all commits introduced for this feature), and on the very first shutdown I tried, I was dropped to the busybox shell. So while this branch might have an effect on this, it certainly isn't the only cause.

So I guess one of the many branches merged in bd961bd8ca6304375e750cf9bb29e4167134d93f is the culprit. I wonder, could it be the branch bumping the number of virtual CPUs? I'm gonna do some tests... but it seems quite clear that reverting this branch is not gonna solve anything.

#27 Updated by anonym almost 3 years ago

  • Assignee changed from anonym to intrigeri
  • QA Check changed from Dev Needed to Info Needed

anonym wrote:

I wonder, could it be the branch bumping the number of virtual CPUs? I'm gonna do some tests...

So out of 50 attempts at shutting Tails 2.6~rc1 down, all succeeded with only one virtual CPU, but with two virtual CPUs, 5 (10%) failed. And for the record, with the Tails ISO from my previous comments, the failure rate was also ~10%. Conclusion: the problems we see in the automated test suite are caused by bumping the number of virtual CPUs. Do you agree?

It remains to be seen whether these initramfs changes are the cause of #11786, which I'm gonna try to deal with on that ticket.
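As a rough sanity check on the figures above (0 failures out of 50 with one virtual CPU, 5 out of 50 with two), here is a minimal Python sketch. It assumes, and this is not stated in the ticket, that shutdown attempts are independent and share a single failure probability; the helper name `prob_at_most` is made up for this illustration.

```python
from math import comb


def prob_at_most(k, n, p):
    """Probability of at most k failures in n independent attempts,
    each failing with probability p (binomial model)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


# Failure rate observed with two virtual CPUs: 5 failures in 50 attempts.
observed_rate = 5 / 50

# If the single-CPU setup had the same underlying failure rate, how likely
# would a clean run of 50 shutdowns be? Under this model it is 0.9 ** 50.
p_zero = prob_at_most(0, 50, observed_rate)
print(f"P(0 failures in 50 at a 10% rate) = {p_zero:.4f}")
```

Under those assumptions the clean single-CPU run would be quite unlikely (roughly half a percent) if both setups shared the 10% failure rate, which is consistent with the conclusion that the number of virtual CPUs matters.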

#28 Updated by anonym almost 3 years ago

So, I'm tempted to ship this in Tails 2.6; the detailed reports on #11786 tell me that the memory wipe regression happens when we kexec, before the point where this branch makes any difference, so my guess is that the new Linux version is indeed the culprit. I'm gonna test a bit on all hardware I have available...

#29 Updated by intrigeri almost 3 years ago

anonym wrote:

Conclusion: the problems we see in the automated test suite are caused by bumping the number of virtual CPUs. Do you agree?

Yes.

#30 Updated by intrigeri almost 3 years ago

  • Assignee changed from intrigeri to anonym

anonym wrote:

So, I'm tempted to ship this in Tails 2.6; the detailed reports on #11786 tell me that the memory wipe regression happens when we kexec, before the point where this branch makes any difference, so my guess is that the new Linux version is indeed the culprit.

I'm confused: #11786#note-7 (with #10733 reverted) shows a failure after kexec'ing. But indeed, even reverting #10733 does not fix the problem for sonicsnail, so let's ship this in 2.6.

#31 Updated by anonym almost 3 years ago

  • Status changed from In Progress to Fix committed
  • Assignee deleted (anonym)
  • % Done changed from 80 to 100
  • QA Check changed from Info Needed to Pass

intrigeri wrote:

So, I'm tempted to ship this in Tails 2.6; the detailed reports on #11786 tell me that the memory wipe regression happens when we kexec, before the point where this branch makes any difference, so my guess is that the new Linux version is indeed the culprit.

I'm confused: #11786#note-7 (with #10733 reverted) shows a failure after kexec'ing.

Oops, I failed to see the last comment.

But indeed, even reverting #10733 does not fix the problem for sonicsnail,

Exactly, and I see similar issues on some of my own hardware.

so let's ship this in 2.6.

=> closing this ticket (again).

#32 Updated by anonym almost 3 years ago

  • Status changed from Fix committed to Resolved

#33 Updated by anonym over 2 years ago

  • Blocks deleted (Bug #10776: Step "I shutdown and wait for Tails to finish wiping the memory" fails when memory wiping causes a freeze)
