Migrate away from vmdebootstrap (and possibly from Vagrant)
We use vmdebootstrap to build the VM image used for building Tails ISO images with Vagrant (source:vagrant/definitions/tails-builder/generate-tails-builder-box.sh). vmdebootstrap barely made it into Buster and might not be in Bullseye.
Sources of inspiration:
And wrt. replacing Vagrant altogether, see:
#7 Updated by intrigeri over 1 year ago
Lars Wirzenius tentatively agreed to keep the package in sid a bit longer, e.g. the end of 2019-02 and best/worst case the Buster release
… but later changed his mind: vmdebootstrap going away in September, switch now. I've checked with Lars privately and he confirmed that he really does not want to see vmdebootstrap in sid after September.
Let's check the consequences:
- time scope: between the time when vmdebootstrap is removed from testing/sid (September) and the time when our build system has switched to something else
- who? people who satisfy all these criteria:
  - They want to build Tails ISO images.
  - They run testing/sid on their Tails development machine.
  - They have not installed vmdebootstrap before it gets removed from testing/sid.
- impact: can't build a Tails ISO
I doubt we want to migrate in a bugfix release and it's too late to do this in time for 3.9. Most likely, the following major release will be 3.12, scheduled for 2019-01-29, with a freeze date around mid-January, so the deadline to have this work well tested and ready for QA would be sometime at the end of 2018. The impact described above would affect people for ~4 months, during which we can document workarounds (e.g. installing vmdebootstrap from Stretch). This was basically what I had in mind when I negotiated a delay with Lars earlier this year and it still seems acceptable to me.
If we miss the end of 2018 deadline, then most likely the earliest we can do the switch is Tails 4.0 (~mid-2019) and then the problem will have affected people for ~9 months. I think that's too long, so let's try to avoid that if we can.
#9 Updated by lamby over 1 year ago
- Updated/stable-ish link for README in upstream sources: https://sources.debian.org/src/vmdebootstrap/latest/debian/NEWS/
- Do we have any idea which of the replacements cited on https://lists.debian.org/debian-devel-announce/2018/07/msg00002.html we would want to move to?
- (In addition to suggesting installing from stretch, we also have https://snapshot.debian.org/.)
#26 Updated by lamby almost 1 year ago
I am writing this from the FT sprint. The consensus here is that we should pause and change direction, or otherwise widen the scope of this ticket for now. To elaborate, this is for three main reasons:
Firstly, the quasi-urgency to replace vmdebootstrap to create our base images was because vmdebootstrap had a Release Critical bug filed against it (in https://bugs.debian.org/910201) during September 2018 by the original maintainer, Lars Wirzenius. Entitled "vmdebootstrap should not be in buster", here he opines that whilst it "has a bad architecture that makes the software inflexible, difficult to modify, hard to test, and it is not suitable for large number of use cases" he did admit that "a number of people have managed to make use of it". Thus, I was tasked with finding a replacement, as a buster or sid lacking vmdebootstrap would make building technically inconvenient and awkward, as well as "politically" strange given that we depend so much on Debian.
However, Niels Thykier and Jonathan Carter followed up on the bug in March 2019, reporting that the removal of vmdebootstrap would affect builds of live images (see https://bugs.debian.org/922826) and, due to this, Niels Thykier added a "buster-ignore" tag in early April 2019, meaning that the bug is no longer Release Critical with respect to the release of buster. vmdebootstrap will now presumably be part of the upcoming Debian release.
Secondly, it also appears that vmdebootstrap has gained a new -- albeit likely temporary -- maintainer which has actually resulted in a new upload to Debian unstable (version 1.11-2, see https://tracker.debian.org/news/1036916/accepted-vmdebootstrap-111-2-source-into-unstable/). There is therefore even less exigency to find a replacement project and there is even a potential route to report and fix bugs via this new "upstream" and, possibly, even to propose enhancements.
The last reason is that, given the haste is no longer required, we can spend time later looking at larger solutions to building Tails that do not involve Vagrant. This is because the weak-ish consensus is that nobody is truly happy with using Vagrant, especially as it's the cause of most build problems (it certainly is in my personal experience). A number of (entirely un-evaluated) options were mooted IRL including using lxc or Docker (which was actually proof-of-concepted some time ago, but it may not be in buster due to $reasons). intrigeri elaborated that Vagrant was chosen because at the time no other virtualisation options supported building in a reasonably secure manner; the other solutions at the time required highly-privileged access to the point where the "build system could even power down the host."
As part of the work on this ticket, I evaluated a number of replacement options, possibly still extant on this ticket's corresponding feature branch, but I have not re-checked at this time.
Firstly, I spent a few hours looking at debos (https://github.com/go-debos/debos). However, it did not support one of the filesystem image options that we needed, so I hit a bit of a dead end in getting an actually-working image. Whilst it seemed like it would be essentially trivial to add (my impression was that the Go code is quite straightforward to hack on and it appears to continue to receive good/regular upstream attention), I concluded that requiring changes that are not in the version in buster itself would essentially leave us in the same place as vmdebootstrap. In other words, it would require that we specify using package [versions] from unstable (or, worse, upstream git!) to build.
I then spent even longer looking at vmdb2, assuming that it would be more-or-less a drop-in replacement for vmdebootstrap. However, I ironically hit almost identical issues regarding a single, missing filesystem configuration option, which was curious given its aforementioned intention to be a drop-in replacement.
So, after hitting similar issues in both potential replacement projects, I then evaluated not using any helper or build tool whatsoever! To do this, I quickly hacked my local copy of vmdebootstrap to log which shell commands it was actually running and then duplicated them in the build script. I did this on the hypothesis that we aren't actually doing anything that strange or weird when building; essentially "just" calling debootstrap and copying that into a large enough disk image that then gets consumed by Vagrant. This appeared to work to some degree (although I was missing something at the very end...) and was indeed fairly easy to demonstrate. However, after looking at the resulting diff etc. I was not convinced that this is sustainable in the long term, as it essentially replicates one of the above tools without real error checking or cleanup. For example, the loopback devices created by losetup(8) need some management that is difficult to accomplish in a shell script. Thus we would likely just be adding to the flaky nature of the build system (see also, Vagrant!), rather than making it any simpler or at least keeping it at the same level of flakiness.
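To illustrate the kind of loop-device management meant here, the following is a minimal sketch of a cleanup stack in POSIX sh. It is not taken from the feature branch or from vmdebootstrap: `push_cleanup`, `run_cleanups` and `build_image_sketch` are hypothetical names, and the single-partition layout in `build_image_sketch` is an assumption. It only shows the general pattern of registering undo steps as resources are acquired and running them in reverse order on exit.

```shell
#!/bin/sh
# Hypothetical helpers, for illustration only.
# Cleanup commands are stored one per line, most recent first.
CLEANUPS=""

# push_cleanup COMMAND: register COMMAND (a single-line string) to run at cleanup time.
push_cleanup() {
    CLEANUPS="$*
$CLEANUPS"
}

# run_cleanups: run registered commands in reverse order of registration.
run_cleanups() {
    pending=$CLEANUPS
    CLEANUPS=""    # clear first, so that a second call is a no-op
    while IFS= read -r cmd; do
        # Skip blank lines; report a failing cleanup but keep going.
        [ -z "$cmd" ] || eval "$cmd" </dev/null || echo "cleanup failed: $cmd" >&2
    done <<EOF
$pending
EOF
}

# Sketch of how a build step might use the stack. Needs root, and assumes
# an image whose first partition holds the root filesystem.
build_image_sketch() {
    image=$1
    trap run_cleanups EXIT
    loopdev=$(losetup --find --show --partscan "$image") || return 1
    push_cleanup "losetup --detach $loopdev"
    target=$(mktemp -d) || return 1
    push_cleanup "rmdir $target"
    mount "${loopdev}p1" "$target" || return 1
    push_cleanup "umount $target"
    debootstrap stable "$target"
}
```

Because the stack is last-in, first-out, the filesystem is unmounted before the mount point is removed and the loop device is detached. Registered commands are plain strings passed to `eval`, so paths containing spaces would need extra quoting; that is one of the sharp edges that makes this approach fragile in shell.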
I did not get around to evaluating FAI (https://fai-project.org/) as I had the preconception (and very slight experience) that it is a little more complex, so I intended to try it last. Ironically this apparent "complexity" may indicate that it would support all of our requirements, although with a little more of a learning curve.
I did try and research some other things but they are probably not relevant to the current status of this issue. Anyway, I hope this gives an overview and update of this ticket and, unless there are any objections and given the now lack of priority, I will remove myself as an assignee of this ticket. It may be worth renaming this ticket (or perhaps creating a separate one with some kind of blocking relation?) regarding questions about replacing Vagrant itself. After all, replacing this virtualisation solution might imply or even require a specific build tool anyway.
- Assignee deleted (
Unassigning as per new Target version policy (re. https://tails.boum.org/contribute/working_together/roles/foundations_team/#tasks-management)
Sure, @intrigeri. So I can't actually see somewhere that I've pushed it to (no matching remote…) but this is a fairly new installation and it might have been blown away when I removed my GitHub mirror. Oh, no, wait.. I can find a random WIP commit that appears to be using vmdb2...? Attaching now without looking at it whatsoever just so it doesn't get lost.
- Priority changed from Elevated to Normal
- Target version changed from Tails_3.14 to Tails_5.0
- Feature Branch deleted (
@lamby, if you can remember or have kept notes: it would be sweet to know which filesystem configuration options are missing in debos and vmdb2, so that if for 5.0 we end up sticking to Vagrant with the need to migrate away from vmdebootstrap again, we don't need to do the same work from scratch. If you haven't kept track of this info, forget it.
Setting target version = 5.0 to ensure this is on our radar for Bullseye. Lowering priority for now; we can raise it again when it's clearer what's going to happen 1. with vmdebootstrap vs. Debian Live; 2. with our own build system vs. Vagrant.
it would be sweet to know which filesystem configuration options are missing in debos and vmdb2,
Unfortunately not. :( However, on the positive side I do remember hitting them "fairly" quickly, so it is not something that would require in-depth investigation to rediscover. I think it was about having lots of free space in the image, so it might not be something that's insurmountable with a hybrid $TOOL-but-resize-afterwards approach... but don't quote me too strongly on that.
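As a rough sketch of what "resize afterwards" could look like, assuming the tool produces a raw disk image and GNU coreutils is available: `grow_image` is a hypothetical helper, not something from the Tails build scripts, and whether this actually covers the missing option has not been verified.

```shell
#!/bin/sh
# grow_image IMAGE SIZE: extend a raw disk image to at least SIZE.
# SIZE uses truncate(1) syntax, e.g. 20G. The '>' prefix (GNU truncate)
# means "at least", so an image that is already larger is left untouched.
# The added space is sparse and does not consume disk until written.
grow_image() {
    truncate -s ">$2" "$1"
}

# After growing the file, the partition and the filesystem inside it still
# have to be enlarged to use the new space. Which tools apply depends on
# the partition table and filesystem in use, which this ticket does not
# state; for an ext4 root on the first partition it might be something like
# growing partition 1 and then running resize2fs on it (as root).
```

For example, `grow_image tails-builder.img 20G` would extend a (hypothetically named) image file to 20 GiB if it is currently smaller.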