Bug #12067

Garbled QXL display after kexec breaks the "Memory erasure" scenario on Stretch

Added by intrigeri almost 3 years ago. Updated almost 3 years ago.

Status: Resolved
Priority: Normal
Assignee:
Category: Test suite
Target version:
Start date: 12/23/2016
Due date:
% Done: 100%
Feature Branch:
Type of work: Code
Blueprint:
Starter:
Affected tool:

Description

I see that both on Jenkins (Jessie host) and locally (sid host).

What I've tried on a sid host:

  • dropping --reset-vga from the kexec call: no visible impact
  • giving the QXL video adapter more memory (ram='524288' vram='262144' vgamem='262144'): improves a little bit (I see the stars-based progress bar) but not enough for the "Happy dumping!" message to be correctly displayed
  • replacing the QXL video adapter with a virtio one: even the kexec message is not displayed
  • replacing the QXL video adapter with a virtio one + dropping --reset-vga from the kexec call: even the kexec message is not displayed
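For reference, the video adapter variants tried above correspond to the `<video>` element of the libvirt domain XML. The following is only a sketch of how such fragments could be generated; `video_model_xml` is a hypothetical helper, not something from the test suite, and the memory values are the ones quoted in the list (libvirt expresses them in KiB).

```ruby
# Hypothetical helper (not part of the Tails test suite): builds the
# <video> fragment of a libvirt domain XML for a given adapter model.
def video_model_xml(type, **attrs)
  # Render each extra attribute as name='value', in the order given.
  attr_str = attrs.map { |name, value| " #{name}='#{value}'" }.join
  "<video>\n  <model type='#{type}'#{attr_str}/>\n</video>"
end

# QXL adapter with the enlarged memory settings tried above.
puts video_model_xml("qxl", ram: 524_288, vram: 262_144, vgamem: 262_144)

# virtio adapter with default settings.
puts video_model_xml("virtio")
```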

I don't know what else to try. A (temporary?) workaround could be to have the "I shutdown and wait for Tails to finish wiping the memory" step (which is only used in this scenario) not fail when it can't find MemoryWipeCompleted.png on the screen, but instead print a warning like "Cannot tell if memory wipe completed, but the timeout was reached, so let's see how well it has worked so far". At least this would allow us to check whether memory wiping works on Stretch, which is top-priority to me wrt. releasing 3.0~beta1.
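A minimal sketch of that workaround idea, assuming nothing about the real step implementation: `wait_or_warn` is a hypothetical helper that polls a caller-supplied check until a deadline and, when the deadline passes, prints the warning instead of failing. The `clock`, `sleeper` and `warner` parameters are also assumptions, added only so the behaviour can be exercised without real waiting.

```ruby
# Hypothetical helper: poll the given block until it returns true or the
# timeout is reached. On timeout, warn and return false instead of raising.
def wait_or_warn(timeout:, interval: 1,
                 clock: -> { Process.clock_gettime(Process::CLOCK_MONOTONIC) },
                 sleeper: ->(seconds) { sleep(seconds) },
                 warner: ->(message) { warn(message) })
  deadline = clock.call + timeout
  loop do
    return true if yield              # completion indicator was seen
    break if clock.call >= deadline   # out of time: stop polling
    sleeper.call(interval)
  end
  warner.call("Cannot tell if memory wipe completed, but the timeout " \
              "was reached, so let's see how well it has worked so far")
  false
end

# Usage idea (screen_shows? is a made-up stand-in for the image lookup):
#   wait_or_warn(timeout: 240) { screen_shows?("MemoryWipeCompleted.png") }
```

Returning a boolean rather than raising lets the scenario go on to dump and analyze the memory either way, which is the point of the workaround.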

Associated revisions

Revision edd8dcd7 (diff)
Added by intrigeri almost 3 years ago

Test suite: check memory wipe efficiency even when we can't tell whether it has successfully completed (refs: #12067).

Revision 2f0609e4 (diff)
Added by anonym almost 3 years ago

Make the memory wipe wait reliable again.

I've seen it take longer than the 10m sleep imposed by `debug=wipemem`
=> the domain is not running when we try to dump the memory and a
Libvirt::Error is thrown.

Refs: #12067
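To illustrate the failure mode described in this commit message (not what the commit itself does), here is a hedged sketch of a guard that checks whether the domain is still running before attempting a dump. `dump_memory` and `DomainNotRunning` are hypothetical names; the sketch only assumes the domain object responds to `active?` and `core_dump`, as ruby-libvirt's `Libvirt::Domain` is expected to.

```ruby
# Hypothetical error class used to report the condition explicitly.
class DomainNotRunning < StandardError; end

# Hypothetical guard: refuse to dump memory from a domain that has already
# been powered off, instead of letting the dump call fail later.
def dump_memory(domain, path)
  unless domain.active?
    raise DomainNotRunning,
          "domain is not running, so its memory cannot be dumped to #{path}"
  end
  domain.core_dump(path)
  path
end
```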

Revision 07dfb622 (diff)
Added by intrigeri almost 3 years ago

Wait longer after wiping memory in debug=wipemem mode (refs: #12067).

Our retry loop that works around #12067 can still take more than 10 minutes,
and then the VM is powered off before we have a chance to dump and analyze
its memory.

History

#1 Updated by intrigeri almost 3 years ago

Meta: I'll deal with the part that's top-priority to me (getting memory erasure test results on feature/stretch) and then will send this to anonym's plate.

#2 Updated by intrigeri almost 3 years ago

  • Assignee changed from intrigeri to anonym
  • Priority changed from High to Normal

Workaround added. I'll let anonym triage this (is it worth trying to find a better solution?).

#3 Updated by intrigeri almost 3 years ago

  • Status changed from Confirmed to In Progress
  • % Done changed from 0 to 10

#4 Updated by anonym almost 3 years ago

  • Assignee changed from anonym to intrigeri

I think your fix is buggy: see https://jenkins.tails.boum.org/job/test_Tails_ISO_feature-stretch/78/artifact/build-artifacts/00:21:48_Memory_erasure.mkv

and the place it is reported in:
https://jenkins.tails.boum.org/job/test_Tails_ISO_feature-stretch/78/artifact/build-artifacts/debug.log

At around 14:45 the VM is killed (and virt-manager displays "Waiting for guest domain to re-start") so I guess the 10 minute sleep of debug=wipemem is triggered before our attempt hits the 240 second timeout. I pushed these two untested commits which I hope will implement what you wanted robustly:

925ea0d924 Add sanity check.
2f0609e410 Make the memory wipe wait reliable again.

What do you think?

Oh, and yes, I do agree that this is an acceptable change. Let's just be careful to take this into account next time the amount of unwiped memory increases (the warning will help us!).

#5 Updated by intrigeri almost 3 years ago

> What do you think?

Thanks! But for some reason this doesn't work either (does try_for take into account how much the looped over code takes to run?), so I've added 07dfb622f1119698612d17d89bd823c4c083090f on top.

#6 Updated by anonym almost 3 years ago

intrigeri wrote:

> Thanks! But for some reason this doesn't work either

Because of a syntax error, fixed in a5028f0f8219dde923b191c0219a58569425db1e.

> (does try_for take into account how much the looped over code takes to run?)

It does [when the syntax is correct! :)]

> so I've added 07dfb622f1119698612d17d89bd823c4c083090f on top.

This still makes sense, IMHO, although it invalidates the sanity check I added, so I have reverted it.
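On the question of whether a retry helper accounts for the time the looped-over code takes: the sketch below shows one way such accounting can work, by comparing against a fixed deadline rather than counting iterations. This is not the test suite's actual `try_for`; `retry_until_deadline` and `DeadlineReached` are hypothetical names, and the injectable `clock` and `sleeper` are assumptions made for testability.

```ruby
# Hypothetical error raised when the deadline passes without success.
class DeadlineReached < StandardError; end

# Hypothetical retry loop. The deadline is fixed up front, so time spent
# inside the block counts against the timeout just like the sleeps do.
def retry_until_deadline(timeout, delay: 1,
                         clock: -> { Process.clock_gettime(Process::CLOCK_MONOTONIC) },
                         sleeper: ->(seconds) { sleep(seconds) })
  deadline = clock.call + timeout
  loop do
    return true if yield
    if clock.call >= deadline
      raise DeadlineReached, "gave up after #{timeout} seconds"
    end
    sleeper.call(delay)
  end
end
```

With a 12 second timeout, a 1 second delay and a block that takes 5 seconds per call, this loop gives up after the third attempt, whereas a loop that only counted sleeps would keep going well past the timeout.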

#7 Updated by intrigeri almost 3 years ago

  • Status changed from In Progress to Resolved
  • % Done changed from 10 to 100

The test now works, and anonym agreed it was an acceptable solution, so I'm closing this. I guess we won't be looking for a better solution.
