Bug #16868

Upgrade Vagrant box to Buster

Added by intrigeri 5 months ago. Updated 28 days ago.

Status: Resolved
Priority: High
Assignee: -
Category: Build system
Target version:
Start date:
Due date:
% Done: 100%
Feature Branch: feature/16868-buster-vagrant-box+force-all-tests
Type of work: Code
Blueprint:
Starter:
Affected tool:

Related issues

Related to Tails - Bug #17005: Upgrade to po4a 0.55 in Tails itself, in the Vagrant build box, on www.lizard, and on RM's systems (Confirmed)
Blocks Tails - Feature #16209: Core work: Foundations Team (Confirmed)
Blocks Tails - Feature #17165: Stop taking time-based APT snapshots of Stretch (Resolved)

Associated revisions

Revision 3282ddbd (diff)
Added by intrigeri 4 months ago

Vagrant: use APT snapshots that still exist (refs: #16868)

And this time, I've bumped their expiration date.

Revision da2f0d0d (diff)
Added by intrigeri 4 months ago

Adjust to Buster's debootstrap (refs: #16868).

Revision 2dcbbca5 (diff)
Added by intrigeri about 2 months ago

Adjust to Buster's debootstrap (refs: #16868).

Revision 23c771be (diff)
Added by intrigeri about 2 months ago

Vagrant: ensure the chroot has a /proc filesystem while running postinstall.sh (refs: #16868)

Otherwise, when udisks2.postinst runs "udevadm trigger", udevadm will fail to
detect that it's running in a chroot, and then it'll try to do work that can't
work in a chroot, which in turn breaks installation of udisks2.

Revision 79b987e9 (diff)
Added by intrigeri about 2 months ago

build-tails: wait for NTP to be disabled before setting the desired date (refs: #16868)

"timedatectl set-ntp false" merely queues up a job: it does not wait until NTP
is actually disabled. Until now we did not notice that this was racing against
us setting a custom date, but with systemd 241 it becomes obvious: systemd no
longer allows another operation while a previous one has not finished.

Disabling NTP services can take a bit of time, so with a Buster Vagrant box,
when $TAILS_DATE_OFFSET is set (which we do for our
reproducibly_build_Tails_ISO_* Jenkins jobs), the build would fail like this:

+ as_root_do timedatectl set-ntp false
+ sudo http_proxy=http://192.168.122.10:3142 APT_SNAPSHOTS_SERIALS={"torproject":"2019100904","debian":"2019100904","debian-security":"2019101901"} TAILS_MERGE_BASE_BRANCH=1 GIT_COMMIT=844ef7072f5856be50635aba45d6ebd820e70205 GIT_REF=feature/16868-buster-vagrant-box+force-all-tests BASE_BRANCH_GIT_COMMIT=b2559c50ba6c3f839da89d475964160889f8fbad timedatectl set-ntp false
+ date --utc --date=+8 days +%F %T
Setting system time to 2019-10-27 09:52:54
+ DESIRED_DATE=2019-10-27 09:52:54
+ echo Setting system time to 2019-10-27 09:52:54
+ as_root_do timedatectl set-time 2019-10-27 09:52:54
+ sudo http_proxy=http://192.168.122.10:3142 APT_SNAPSHOTS_SERIALS={"torproject":"2019100904","debian":"2019100904","debian-security":"2019101901"} TAILS_MERGE_BASE_BRANCH=1 GIT_COMMIT=844ef7072f5856be50635aba45d6ebd820e70205 GIT_REF=feature/16868-buster-vagrant-box+force-all-tests BASE_BRANCH_GIT_COMMIT=b2559c50ba6c3f839da89d475964160889f8fbad timedatectl set-time 2019-10-27 09:52:54
Failed to set time: Previous request is not finished, refusing.

So let's wait for the NTP service to be disabled before using timedatectl to
configure stuff again.
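
A minimal sketch of such a wait, not necessarily what this commit does: it polls a condition until it holds or a timeout expires. The helper names `wait_until` and `ntp_disabled` and the 60-second timeout are assumptions made for illustration. `timedatectl show` needs systemd 239 or newer, and whether polling its NTP property is sufficient on a given systemd version is also an assumption.

```shell
#!/bin/sh
# wait_until TIMEOUT CMD [ARGS...]: retry CMD once per second until it
# succeeds; return 1 if it still fails after TIMEOUT seconds.
wait_until() {
    timeout="$1"
    shift
    elapsed=0
    until "$@"; do
        if [ "$elapsed" -ge "$timeout" ]; then
            return 1
        fi
        sleep 1
        elapsed=$((elapsed + 1))
    done
    return 0
}

# Hypothetical check: succeeds once timedatectl reports NTP as disabled.
ntp_disabled() {
    [ "$(timedatectl show --property=NTP --value)" = "no" ]
}

# Intended use in a build script (commented out: needs root and systemd):
#   sudo timedatectl set-ntp false
#   wait_until 60 ntp_disabled || exit 1
#   sudo timedatectl set-time "$DESIRED_DATE"
```

Bounding the wait with a timeout makes a stuck NTP service fail the build explicitly instead of hanging it.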

Revision 75c1f297 (diff)
Added by intrigeri about 2 months ago

Vagrant: install po4a from Stretch in the basebox (refs: #16868)

We install it from our builder-jessie (sic) APT suite. Given we use time-based
APT snapshots even for our own overlay APT suites, this requires bumping the
corresponding snapshot to a version that includes the desired version of po4a.

Revision 82a68b7b
Added by anonym about 2 months ago

Merge remote-tracking branch 'origin/feature/16868-buster-vagrant-box+force-all-tests' into stable

Fix-committed: #16868

History

#1 Updated by intrigeri 5 months ago

#2 Updated by intrigeri 5 months ago

  • Feature Branch set to feature/16868-buster-vagrant-box

#3 Updated by intrigeri 4 months ago

  • Category set to Build system
  • Status changed from In Progress to Confirmed
  • Assignee deleted (intrigeri)
  • Target version deleted (Tails_4.0)

The build aborts after debootstrap, which seems to be successful (judging from the output) but probably returned a non-zero exit code; or, for some other reason, the next step that live-build runs fails.

I don't recall exactly why I started working on this. IIRC I was hoping this would solve another problem, such as the version mismatch issues we had with create-usb-image-from-iso. Anyway, I see no immediate need to finish this work ATM. Let's come back to it whenever we have a good reason to.

#4 Updated by intrigeri 4 months ago

  • Feature Branch changed from feature/16868-buster-vagrant-box to wip/feature/16868-buster-vagrant-box

#5 Updated by intrigeri about 2 months ago

  • Blocks Feature #17165: Stop taking time-based APT snapshots of Stretch added

#6 Updated by intrigeri about 2 months ago

intrigeri wrote:

Anyway, I see no immediate need to finish this work ATM. Let's come back to it whenever we have a good reason to.

#17165 gives us a good reason to finish this work: our time-based APT snapshots have grown to an unprecedented size and it would be sweet if we could remove the last user of the Stretch ones, that is: our Vagrant box.

#7 Updated by intrigeri about 2 months ago

  • Status changed from Confirmed to In Progress

#8 Updated by intrigeri about 2 months ago

  • Status changed from In Progress to Needs Validation
  • Assignee set to intrigeri
  • Target version set to Tails_4.0
  • Feature Branch changed from wip/feature/16868-buster-vagrant-box to feature/16868-buster-vagrant-box+force-all-tests

Good news! I understood and fixed the problem that blocked me last time.

Next step: run diffoscope on images built with the Buster vs. Stretch Vagrant boxes. If they're identical or differ only in expected ways, this should be a no-brainer and we could merge this in time for zen and me to do #17165 on Oct 24.

#9 Updated by intrigeri about 2 months ago

  • Status changed from Needs Validation to In Progress

#10 Updated by intrigeri about 2 months ago

  • Status changed from In Progress to Needs Validation

(I need to test reproducible builds.)

#11 Updated by intrigeri about 2 months ago

  • Related to Bug #17005: Upgrade to po4a 0.55 in Tails itself, in the Vagrant build box, on www.lizard, and on RM's systems added

#12 Updated by intrigeri about 2 months ago

diffoscope'ing the ISO, here's the list of not-entirely-expected differences in the SquashFS (diffoscope spotted no differences outside of it):

  • /dev/console was added to the SquashFS → seems harmless
  • some tails.mo files differ; msgunfmt says that it's because POT-Creation-Date got removed → OK, fine
  • most translated pages in the included version of our website differ due to different versions of po4a → I've got a fix locally and need to wait for the next time-based snapshot of our custom APT repo to be taken, in a few hours, before I can test it
  • order of <script> stuff in the generated website differs, probably due to ikiwiki being updated → looks harmless

I've tried diffoscope'ing the USB images but "interestingly", diffoscope (v113 and v126) does not manage to do any better than a byte-for-byte comparison with xxd, whose output is not terribly useful. That's weird: IIRC it used to do better than this. Anyway, the only difference between our ISO and USB image should be the MBR, partition table, and FAT32 partition, which I can compare manually, so I'll do that.
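
A manual comparison of that kind could start with the first sector, which holds the MBR boot code and the legacy partition table. The sketch below is an assumption about how one might do it, not the procedure actually used here: `same_prefix` is a hypothetical helper and the image file names are placeholders.

```shell
#!/bin/sh
# same_prefix FILE_A FILE_B NBYTES: succeed if the first NBYTES bytes of
# both files are identical.
same_prefix() {
    tmp=$(mktemp) || return 2
    head -c "$3" "$1" > "$tmp"
    # Compare the saved prefix of FILE_A with the prefix of FILE_B (stdin).
    head -c "$3" "$2" | cmp -s "$tmp" -
    status=$?
    rm -f "$tmp"
    return "$status"
}

# Example (placeholder file names):
#   same_prefix stretch-box.img buster-box.img 512 || echo "first sector differs"
```

The same helper can be pointed at larger prefixes, but comparing the FAT32 partition itself would need the partition offsets, which are not given here.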

#13 Updated by intrigeri about 2 months ago

  • Priority changed from Normal to Elevated
  • Target version changed from Tails_4.0 to Tails_4.1

I see some syslinux-related differences between the USB images. I suspect that the changes we did in #16748 were incomplete: for example, even if we run chroot/usr/bin/syslinux, that binary will probably look for its data files (MBR and such) in / and not in chroot/, so there's a chance that we were still mixing up bits of Stretch (from the Vagrant box) with bits from Buster (from the chroot). If my hunch is right, then:

  • Short-term, upgrading the Vagrant box to Buster will fix this discrepancy, which is good and might fix some boot issues; OTOH it's probably too late to merge this into 4.0, so ideally we would merge this immediately after the 4.0 release, to unblock sysadmins on #17165 ⇒ adjusting target version & priority accordingly.
  • Longer-term, instead of chroot/usr/bin/syslinux, we should probably run chroot chroot /usr/bin/syslinux, to ensure this binary only uses files that are in the chroot, with a matching version; this may require bind-mounting some stuff in the chroot first.

@segfault, does this make sense to you? If it does, I'll file a ticket for that chroot/syslinux thing, unless you beat me to it.
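
To make the two invocations concrete, here is a hedged sketch. `run_in_chroot`, the `RUN` dry-run prefix, and the choice of /proc and /dev as bind mounts are all assumptions for illustration; which paths actually need bind-mounting is not established above.

```shell
#!/bin/sh
# Command prefix for privileged steps. The default "echo" only prints the
# commands (dry run); set RUN=sudo to actually execute them.
RUN="${RUN:-echo}"

# run_in_chroot DIR CMD [ARGS...]: run CMD inside DIR via chroot(8), so that
# it resolves its data files relative to DIR rather than the host's /.
run_in_chroot() {
    chroot_dir="$1"
    shift
    # Hypothetical bind mounts; the real list is an open question.
    for fs in proc dev; do
        $RUN mount --bind "/$fs" "$chroot_dir/$fs"
    done
    $RUN chroot "$chroot_dir" "$@"
    status=$?
    for fs in dev proc; do
        $RUN umount "$chroot_dir/$fs"
    done
    return "$status"
}

# Current approach: the chroot's binary runs against the host's filesystem:
#   chroot/usr/bin/syslinux ...
# Proposed approach: binary and data files both come from the chroot:
#   run_in_chroot chroot /usr/bin/syslinux ...
```

In dry-run mode the function prints the mount, chroot, and umount commands in order, which makes it easy to review what would run before doing it as root.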

Next (and hopefully last) things to do before I ask for a review:

  • once po4a has reached our time-based APT snapshots: bump the corresponding snapshot used by the Vagrant build box, make it expire in a long while, and test my commit that installs that version of po4a: it should get rid of the differences in the translated pages of the included website, between testing and this branch → it does
  • test images on various bare metal machines → boots fine on ThinkPad X200 (legacy BIOS) and HP EliteBook 840G1 (UEFI)
  • check test suite results (the build reproducibility check already passes) → I've seen a full test suite run pass locally

#14 Updated by intrigeri about 2 months ago

  • Status changed from Needs Validation to In Progress

#15 Updated by intrigeri about 2 months ago

  • Status changed from In Progress to Needs Validation
  • Priority changed from Elevated to High

(Due to the disk space situation.)

#16 Updated by intrigeri about 2 months ago

  • Assignee deleted (intrigeri)

#17 Updated by anonym about 2 months ago

  • Assignee set to anonym

#18 Updated by anonym about 2 months ago

  • Status changed from Needs Validation to 11
  • % Done changed from 0 to 100

#19 Updated by anonym about 2 months ago

  • Assignee deleted (anonym)

Code looks great! Jenkins' automated tests as well as my manual tests too! Merged!

intrigeri wrote:

I see some syslinux-related differences between the USB images. I suspect that the changes we did in #16748 were incomplete: for example, even if we run chroot/usr/bin/syslinux, that binary will probably look for its data files (MBR and such) in / and not in chroot/, so there's a chance that we were still mixing up bits of Stretch (from the Vagrant box) with bits from Buster (from the chroot)

I initially wanted the chroot approach so I'm totally on board with this!

If my hunch is right, then:

To verify this I guess we could just run the same commands under strace or similar to see which files are accessed.
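
As a rough sketch of that check: record the file-related system calls with strace, then list every opened path that does not live under the chroot directory. The helper name `paths_outside_chroot` and the trace file name are assumptions; the filter only looks at `open`/`openat` lines and at the first quoted string on each.

```shell
#!/bin/sh
# paths_outside_chroot DIR: read strace output on stdin and print the opened
# paths that do not start with "DIR/", skipping lookups that failed with
# ENOENT.
paths_outside_chroot() {
    awk -F'"' -v prefix="$1/" '
        /open(at)?\(/ && !/ENOENT/ {
            if (index($2, prefix) != 1) print $2
        }
    ' | sort -u
}

# Possible use (commented out: needs strace and a built chroot):
#   strace -f -e trace=file -o syslinux.trace chroot/usr/bin/syslinux ...
#   paths_outside_chroot chroot < syslinux.trace
```

DIR has to be spelled the way it appears in the trace (relative or absolute). Shared libraries loaded from the host will also show up, so the output needs reading by hand to spot the data files of interest.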

  • Short-term, upgrading the Vagrant box to Buster will fix this discrepancy, which is good and might fix some boot issues; OTOH it's probably too late to merge this into 4.0, so ideally we would merge this immediately after the 4.0 release, to unblock sysadmins on #17165 ⇒ adjusting target version & priority accordingly.

This is now done!

  • Longer-term, instead of chroot/usr/bin/syslinux, we should probably run chroot chroot /usr/bin/syslinux, to ensure this binary only uses files that are in the chroot, with a matching version; this may require bind-mounting some stuff in the chroot first.

segfault, does this make sense to you? If it does, I'll file a ticket for that chroot/syslinux thing, unless you beat me to it.

Filed as #17179.

#20 Updated by intrigeri 28 days ago

  • Status changed from 11 to Resolved
