Bug #15685

Test manually creating a disk image as a backup technique

Added by sajolida over 1 year ago. Updated over 1 year ago.

Status:
Resolved
Priority:
Normal
Assignee:
-
Category:
-
Target version:
Start date:
06/25/2018
Due date:
% Done:

0%

Feature Branch:
Type of work:
Test
Blueprint:
Starter:
Affected tool:

Description

Some Tails contributor told me they were using a full disk image to back up their Tails. The benefit is that you have everything backed up at once, no problems with permissions, and a fully ready USB stick to replace yours if you lose it or break it.

free space.png View (28.1 KB) sajolida, 07/26/2018 07:57 PM

gnome disks.png View (64.7 KB) sajolida, 07/29/2018 12:59 PM

persistence configuration.png View (67.8 KB) sajolida, 07/29/2018 12:59 PM

removed partition.png View (59.9 KB) sajolida, 07/29/2018 12:59 PM

partclone.png View (134 KB) sajolida, 07/29/2018 01:28 PM

partimag.png View (105 KB) sajolida, 07/29/2018 01:28 PM


Related issues

Related to Tails - Feature #14605: Improve documentation on "Manually copying your persistent data to a new USB stick" Resolved 09/05/2017
Related to Tails - Feature #12214: Document a way to manually backup persistent data Duplicate 02/06/2017

History

#1 Updated by sajolida over 1 year ago

  • Related to Feature #14605: Improve documentation on "Manually copying your persistent data to a new USB stick" added

#2 Updated by intrigeri over 1 year ago

FWIW that's what I do (only the persistent volume though).

#3 Updated by sajolida over 1 year ago

  • File free space.png View added
  • Assignee changed from sajolida to intrigeri
  • QA Check set to Info Needed

Copying only the persistent volume is interesting because:

  • You can create the image from the same Tails USB stick (you have to start without unlocking the persistence).
  • You save the time of creating and restoring the system partition.

On the downside, you have to create a dummy persistent volume so that you have a TailsData partition at the right place before doing the restoration. It's not complicated but it's weird and might lead to people wondering what this dummy persistent volume is about, confusing passwords, persistent features, etc.

intrigeri: I see on one of my Tails that there's a tiny free space between the system partition and the persistent storage. Is this on purpose? If we didn't have this space people could create an empty partition instead of having to create a dummy partition so that it ends up at the right place (if this tiny free space is important for some reason).

See screenshot in attachment.

#4 Updated by intrigeri over 1 year ago

  • Description updated (diff)

#5 Updated by intrigeri over 1 year ago

  • Assignee changed from intrigeri to sajolida

intrigeri: I see on one of my Tails that there's a tiny free space between the system partition and the persistent storage. Is this on purpose?

IIRC yes: udisks will align partitions it creates on physical blocks, which greatly improves performance (particularly on flash media storage).

If we didn't have this space people could create an empty partition instead of having to create a dummy partition so that it ends up at the right place (if this tiny free space is important for some reason).

As long as they create the 2nd partition using udisks (e.g. via GNOME Disks), it should be properly aligned, which means that in some cases there will be a bit of free space between the system partition and the 2nd one.

#6 Updated by blue9 over 1 year ago

I have been using a Clonezilla (https://clonezilla.org/) LiveUSB to back up my Tails disk. I store the backups in a separate volume on the Clonezilla LiveUSB. Clonezilla is oldschool but seems to be well maintained and frequently updated. It works for the persistence volume alone or for a full disk image. (I've been using the latter technique to simplify the restoration process.) Backing up the persistence volume when it is "locked" means that I don't have to worry quite as much about the trustworthiness of the backup utility or the strength of its crypto. (Which is not to imply that I see a problem with either; they both look fine to me.)

FYI, after backing up with Clonezilla's GUI, it displays the 'partclone' commandline that it uses behind the curtain. That might be useful if someone wants to integrate a backup GUI into Tails.

Clonezilla is in Debian stable, so an alternative workflow might look something like:

- Boot without persistence but with an admin password,
- Install clonezilla with apt,
- Mount an external storage device, and
- Use clonezilla (or the corresponding 'partclone' commandline) to make a backup of the (locked) persistence volume.

The above would be more work to restore, unless someone builds a Tails GUI for it, but would obviate the need for users to plug their Tails USB sticks into their devices while those devices are running from a non-Tails LiveUSB.

Why Clonezilla rather than Gnome Disks?

Because at least one of the following things is true:

1. I do not understand how LUKS works and have observed a supernatural phenomenon; and/or
2. Encrypted LUKS volumes expose information about used vs unused space, and 'partclone' is smart enough to back up only the used space.

In other words (assuming the latter is true), Clonezilla can make "sparse" disk image backups of encrypted volumes. Say, for example, I have a 64GB USB stick with an 8GB Tails volume and a 56GB persistence volume. And suppose that persistence volume contains 10 GB of content. With Clonezilla, a full-disk backup image requires at most 18 GB, whereas (in my limited experience) 'dd' and Gnome Disks backups require 64 GB. (Maybe Gnome Disks can do this, as well, and I just don't know how? Assuming of course it is theoretically possible and I'm not just confused.) Because I am backing up my images to a second volume on my (256 GB) Clonezilla LiveUSB, this feature is quite valuable for me.

FWIW, a colleague and I have both tested this, and the restored images appear to work fine.

How I created my Clonezilla LiveUSB on Tails

Warning: requires the installation of 'libc6-i386'

1. Install dependencies: 
```
sudo apt-get install libc6-i386
```
2. [Download the .zip archive](https://clonezilla.org/downloads/download.php?branch=stable)
    - File type: zip
    - Repository: Sourceforge
    - Click *Problems downloading?*
    - Click the *direct link*
3. [Set up the LiveUSB](https://clonezilla.org/liveusb.php) on a 256GB USB stick
    - Use the *Disks* application to create two partitions:
        1. *clonezilla*: A 250MB FAT partition for Clonezilla (the instructions say 200MB, but that is too small)
        2. *backups*: Use the remaining space for a second (EXT4) partition to hold backups
    - Mount the *clonezilla* partition using *Disks*
    - Extract the `.zip` and copy contents to the root of the *clonezilla* partition. 
        - *Extract to* in the file browser would create an extra folder
        - The archive includes a hidden `.disk` folder that must be included
    - Use *Disks* to find the address of the *clonezilla* partition. (We use `/dev/<partition1>` below.)
    - Make it bootable:
    ```
    cd /media/amnesia/clonezilla/utils/linux
    sudo bash makeboot.sh /dev/<partition1>
    ```
    - Answer `y` to all prompts
    - Unmount the *clonezilla* volume
    - Remove the USB stick

#7 Updated by sajolida over 1 year ago

Replying to intrigeri:

IIRC yes: udisks will align partitions it creates on physical blocks, which greatly improves performance (particularly on flash media storage).

Ok.

As long as they create the 2nd partition using udisks (e.g. via GNOME Disks), it should be properly aligned, which means that in some cases there will be a bit of free space between the system partition and the 2nd one.

I tried again and my results disagree. See screenshots in attachment:

  1. persistence configuration.png is what I get when I go through the
    persistence configuration.
  2. removed partition.png is what I get after removing the partition
    created using the persistence configuration.
  3. gnome disks.png is what I get when creating a partition on the free
    space using GNOME Disks.

How could I investigate the low-level operations behind the two a bit more?

#8 Updated by sajolida over 1 year ago

Hey blue9! Good to see you here!!!

Backing up the persistence volume when it is "locked" means that I don't have to worry quite as much about the trustworthiness of the backup utility or the strength of its crypto.

I also like that. The downsides are:

  • You have to boot specifically with the persistence locked and cannot
    work with it meanwhile.
  • You cannot get fancy backup features like increments (to save time),
    compression (to save disk space), etc.

So I don't want this as a long term goal but it might be useful for now.

FYI, after backing up with Clonezilla's GUI, it displays the 'partclone' commandline that it uses behind the curtain.

I tested Clonezilla and partclone. See my results below.

Clonezilla is in Debian stable, so an alternative workflow might look something like:

- Boot without persistence but with an admin password,
- Install clonezilla with apt,
- Mount an external storage device, and
- Use clonezilla (or the corresponding 'partclone' commandline) to make a backup of the (locked) persistence volume.

Right. It's the same workflow as if you use GNOME Disks.

In other words (assuming the latter is true), Clonezilla can make "sparse" disk image backups of encrypted volumes. Say, for example, I have a 64GB USB stick with an 8GB Tails volume and a 56GB persistence volume. And suppose that persistence volume contains 10 GB of content. With Clonezilla, a full-disk backup image requires at most 18 GB, whereas (in my limited experience) 'dd' and Gnome Disks backups require 64 GB. (Maybe Gnome Disks can do this, as well, and I just don't know how? Assuming of course it is theoretically possible and I'm not just confused.) Because I am backing up my images to a second volume on my (256 GB) Clonezilla LiveUSB, this feature is quite valuable for me.

I tested backing up a 3.5 GB persistence.

In terms of execution time, partclone.dd, dd, and GNOME Disks are equivalent and the checksum of the resulting image is the same.
Clonezilla is a bit slower:

  • Disks: 111 s
  • Clonezilla: 138 s
    • partclone.dd -z 10485760 -N -L /var/log/partclone.log -s /dev/sdc2 --output - | pigz -c --fast -b 1024 -p 16 --rsyncable | split -a 2 -b 4096MB - /home/partimag/2018-07-29-12-img/sdc2.dd-img. 2> /tmp/split_error.rc2Gz1
  • partclone.dd: 112 s
    • time partclone.dd --source /dev/sdc2 --output /media/amnesia/backups/partclone.img
  • dd: 114 s
    • time dd if=/dev/sdc2 of=/media/amnesia/backups/dd.img bs=16M

I understand that Clonezilla is slower because it pipes partclone.dd into the pigz compression program (funny name!). That's also probably why you are saying that the resulting images are smaller (in size) but I bet that the process is not faster and that it's copying everything in the partition.
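
The checksum comparison can be reproduced without a spare USB stick. This is only a minimal sketch: a scratch file stands in for /dev/sdc2, and the file names and the 8 MiB size are illustrative assumptions, not the setup used in the tests above.

```shell
#!/bin/sh
# Work in a throwaway directory that is removed when the script exits.
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT

# Hypothetical stand-in for the partition; a real run would read /dev/sdc2.
source_dev="$workdir/fake-partition.bin"
head -c 8388608 /dev/urandom > "$source_dev"

# Image it twice: once with dd, once with a plain byte-for-byte copy.
dd if="$source_dev" of="$workdir/dd.img" bs=1M 2>/dev/null
cat "$source_dev" > "$workdir/cat.img"

# Equal checksums mean the two images are bit-for-bit identical.
sum_dd=$(sha256sum "$workdir/dd.img" | cut -d' ' -f1)
sum_cat=$(sha256sum "$workdir/cat.img" | cut -d' ' -f1)

if [ "$sum_dd" = "$sum_cat" ]; then
    echo "images are identical"
else
    echo "images differ"
fi
```

Any tool that copies every sector without transformation should give the same checksum, which is consistent with dd, partclone.dd, and GNOME Disks producing the same image.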

Do you mind having a closer look at the partclone backend used in your Clonezilla command? Is it using something other than partclone.dd and doing smart things to avoid copying the whole partition in the first place? Or is the resulting image only smaller because of compression?

Also, I found the interface and interactions of Clonezilla completely obscure. It took me ~30 min to get it running, I really didn't feel confident (I thought it might be erasing my backup disk as it was flashing for no reason!), and I still don't get why it requires all these weird disk operations...

I'm attaching two screenshots of this intense experience:

  • partimag.png: Clonezilla scans for all your disks and offers to mount one of the partitions itself as /home/partimag. If you plug in and mount your external hard disk yourself before running Clonezilla, it won't appear in the list :)
  • partclone.png: after spending ~20 min figuring out how to do my copy, I had to decipher command-line output to understand that the first run crashed because partclone was not installed (it's only a Recommends in Debian). Clonezilla bounces you several times between ncurses and the command line throughout the process.

So, as I understand it, the only pro of Clonezilla is that it compresses the disk image. But it's slower, not included in Tails, and has a completely obscure UX.

In our scenario, I think that technical knowledge and time are much scarcer resources than storage space (after all, you could live with only one backup of the same size as your original persistence).

Summary: I'm still in favor of documenting GNOME Disks.

#9 Updated by blue9 over 1 year ago

Thanks Sajolida!

I found the interface and interactions of Clonezilla completely obscure. It took me ~30 min

Fair point. The ncurses interface is quite awful. It doesn't bother me as a special-purpose LiveOS, but it's a bit shocking when you launch it as an application. I apologise. You will never get those 30 minutes of your life back, and I blame myself. :)

For a quick-and-easy graphical walkthrough (that doesn't require developing your own "Tails Backup" GUI), Gnome Disks is probably the right answer. But I still think the commandline has potential, so...

I tested backing up a 3.5 GB persistence.

How much data was in it? And did you see any difference in the size of the resulting backup? The difference is obviously more prominent with larger (and mostly empty) persistence volumes. Just curious if you saw any difference at all.

Do you mind having a closer look at the partclone backend used in your Clonezilla command?

I will do this and send in the exact commandline later on today.

Is it using something other than partclone.dd and doing smart things to avoid copying the
whole partition in the first place? Or is the resulting image only smaller because of
compression?

I do not think this is just compression. Or, if it is, then it's compression that takes advantage of Clonezilla's ability to "do...smart things." In other words, pigz might be responsible for truncating a bunch of zeros, but partclone is responsible for giving zeros to pigz rather than giving it a bunch of random-looking, incompressible, encrypted zeros.

I say this because:

  1. Encrypted data cannot be compressed; and
  2. The difference is not just a few GB (a lightly used 64GB Tails disk will end up being, like, 4 GB; with Gnome Disks, I assume it will be 64 GB).

After a year of using Tails to store data sets, some large git repos, lots of screenshots, a few .ISOs and other assorted files, my 50GB persistence folder is filling up quickly. This means my backups are getting bigger, but it was nice, in the early days, that I could store many small, sub-10GB backups rather than filling up my entire backup USB with a single image.

#10 Updated by segfault over 1 year ago

  • Related to Feature #12214: Document a way to manually backup persistent data added

#11 Updated by segfault over 1 year ago

I just want to make sure that you are aware that we have a documentation draft for manually backing up persistent data (it has been waiting for review for 4 months, see #12214): https://labs.riseup.net/code/attachments/2096/Backup%20Documentation.png

It does not document creating a disk image, but copying the persistent data to a LUKS encrypted device, since we already have documentation for creating and using LUKS encrypted devices: https://tails.boum.org/doc/encryption_and_privacy/encrypted_volumes/index.en.html

#12 Updated by intrigeri over 1 year ago

  • Assignee changed from intrigeri to sajolida

As long as they create the 2nd partition using udisks (e.g. via GNOME Disks), it should be properly aligned, which means that in some cases there will be a bit of free space between the system partition and the 2nd one.

I tried again and my results disagree.

Indeed, I was misremembering! I don't know if/how udisks2 aligns stuff and am too lazy to check now. I suspect the size of your system partition is a multiple of the unit udisks2 aligns on, which is why there's no free space. But I don't think we have any guarantee that all system partitions our tools/doc have ever created satisfy this so there's probably Tails sticks with some free space between the 2 partitions in the wild.

We align on 2 MiB blocks ourselves: https://git-tails.immerda.ch/persistence-setup/tree/lib/Tails/Persistence/Setup.pm#n254. I think it's a good thing to do on our side; I went for 2 MiB because that's the largest physical block size I'm aware of on Flash media; for most USB sticks the alignment unit could be smaller but I preferred erring on the safe side. The good news is that it means we're creating a slightly smaller persistent volume than we could, which increases the odds that it can be restored on another USB stick that in theory has the same size but can very well be a bit smaller.
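
To make the alignment arithmetic concrete, here is a minimal sketch of rounding a byte offset up to the next 2 MiB boundary. It is not the code from persistence-setup; the `align_up` helper and the example offset are hypothetical and only illustrate where the small gap between the two partitions can come from.

```shell
#!/bin/sh
# Round a byte offset up to the next multiple of the alignment unit.
align_up() {
    offset=$1
    unit=$((2 * 1024 * 1024))   # 2 MiB, the alignment unit mentioned above
    echo $(( (offset + unit - 1) / unit * unit ))
}

# Hypothetical example: a system partition that ends at byte 8,000,000,000.
system_end=8000000000
persistent_start=$(align_up "$system_end")
gap=$((persistent_start - system_end))

echo "persistent volume would start at byte $persistent_start"
echo "free space left before it: $gap bytes"
```

When the end of the system partition already falls on a 2 MiB boundary the gap is zero, which would explain why some sticks show no free space between the two partitions.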

Note that I don't understand "If we didn't have this space people could create an empty partition instead of having to create a dummy partition so that it ends up at the right place". What's "the right place"? As long as the partition is created with udisks2 (e.g. GNOME Disks), which will align stuff correctly I think, and is large enough so that the user can dump their backup on it, what's the problem?

#13 Updated by sajolida over 1 year ago

You will never get those 30 minutes of your life back, and I blame myself. :)

He he, don't worry. I also learn from bad designs :)

I tested backing up a 3.5 GB persistence.

How much data was in it?

2.7 GB.

And did you see any difference in the size of the resulting backup?

Seeing that dd and the partclone.dd command under Clonezilla produce
exactly the same binary (same checksum), I conclude that the content of
the persistence doesn't impact the size of the uncompressed disk image.

Clonezilla does some compression and I guess that's where you gain in
size on partitions that are not full. See below.

Do you mind having a closer look at the partclone backend used in your Clonezilla command?

I will do this and send in the exact commandline later on today.

Ok.

I do not think this is just compression. Or, if it is, then it's compression that takes advantage of Clonezilla's ability to "do...smart things." In other words, pigz might be responsible for truncating a bunch of zeros, but partclone is responsible for giving zeros to pigz rather than giving it a bunch of random-looking, incompressible, encrypted zeros.

Remember that I got the exact same binary from dd and partclone.dd,
so at least my partclone is not doing smart things. But maybe your
partclone command is different.

I say this because:

  1. Encrypted data cannot be compressed; and
  2. The difference is not just a few GB (a lightly used 64GB Tails disk will end up being, like, 4 GB; with Gnome Disks, I assume it will be 64 GB).

My guess is that the difference comes from zeroes that remain at the end
of the partition on the space that has never been used yet. I also guess
that once all the space has been used and freed, it doesn't have
zeroes anymore and won't compress well.
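
This guess is easy to check on scratch data. In the minimal sketch below, random bytes stand in for encrypted (or previously used) space and zeros stand in for space that was never written; the sample size and file names are arbitrary assumptions, and gzip is used in place of pigz since both produce gzip-format output.

```shell
#!/bin/sh
# Work in a throwaway directory that is removed when the script exits.
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT

size=4194304   # 4 MiB per sample

# Random bytes approximate encrypted data; zeros approximate untouched space.
head -c "$size" /dev/urandom > "$workdir/random.bin"
head -c "$size" /dev/zero    > "$workdir/zeros.bin"

# Compress each sample at the fastest level and count the output bytes.
random_gz=$(gzip -c -1 "$workdir/random.bin" | wc -c)
zeros_gz=$(gzip -c -1 "$workdir/zeros.bin" | wc -c)

echo "random: $size -> $random_gz bytes"
echo "zeros:  $size -> $zeros_gz bytes"
```

The random sample should come out at roughly its original size while the zeros shrink to a tiny fraction, so a compressed image is only smaller by about the amount of never-written space left on the stick.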

After a year of using Tails to store data sets, some large git repos, lots of screenshots, a few .ISOs and other assorted files, my 50GB persistence folder is filling up quickly. This means my backups are getting bigger, but it was nice, in the early days, that I could store many small, sub-10GB backups rather than filling up my entire backup USB with a single image.

I can definitely understand that!

I guess that would be the challenge for "Tails Backup" GUI: do backups
and recovery faster, save storage space, allow inspection of increments,
etc. But I don't think we can get that with disk images.

#14 Updated by sajolida over 1 year ago

I just want to make sure that you are aware that we have a documentation draft for manually backing up persistent data (it has been waiting for review for 4 months, see #12214): https://labs.riseup.net/code/attachments/2096/Backup%20Documentation.png

I'm aware of that. Sorry for the huge delay. The Technical Writing team
hasn't been super productive since April but it's on our radar.

#15 Updated by sajolida over 1 year ago

But I don't think we have any guarantee that all system partitions our tools/doc have ever created satisfy this so there's probably Tails sticks with some free space between the 2 partitions in the wild.

Ok.

[...] The good news is that it means we're creating a slightly smaller persistent volume than we could, which increases the odds that it can be restored on another USB stick that in theory has the same size but can very well be a bit smaller.

This matches what I've experienced.

Note that I don't understand "If we didn't have this space people could create an empty partition instead of having to create a dummy partition so that it ends up at the right place". What's "the right place"? As long as the partition is created with udisks2 (e.g. GNOME Disks), which will align stuff correctly I think, and is large enough so that the user can dump their backup on it, what's the problem?

D'oh! I don't know why I thought that not having the restored partition at the exact same place as the original one would be a problem. Since the user will first create a dummy partition, the partition table will know where to look for it. Thanks for enlightening me!

I think that now the only thing stopping us from documenting this is to check whether the Clonezilla of blue9 is also using partclone.dd or something else that does smarter things.

So I'm reassigning to him.

#16 Updated by sajolida over 1 year ago

  • Assignee changed from sajolida to blue9

blue9: I'm giving you the "Contributor" role on Redmine so you can assign tickets to yourself (and we can assign tickets to you).

#17 Updated by blue9 over 1 year ago

Do you mind having a closer look at the partclone backend used in your Clonezilla command?

I will do this and send in the exact commandline later on today.

Sorry for the delay. Travel... Here is the command that Clonezilla stitches together (and displays while waiting for confirmation to start the backup):

/usr/sbin/ocs-sr -q2 -c -j2 -z1p -i 4096 -sfsck -enc -p choose savedisk 2018-08-03-img-t-3.9-testing sdc

Breaking that down...

  • /usr/sbin/ocs-sr, Commandline tool installed along with Clonezilla (seems to rely on partclone, partimage, dd and perhaps other utilities)
  • -q2, Use partclone to save partition(s) (i.e. partclone > partimage > dd).
  • -c, Wait for confirmation before saving or restoring.
  • -j2, Use dd to clone the image of the data between MBR (1st sector, i.e. 512 bytes) and 1st partition, which might be useful for some recovery tool.
  • -z1p, Compress using parallel gzip program (pigz) when saving: fast and small image file, good for multi-core or multi-CPU machine.
  • -i 4096, Set the size in MB to split the partition image file into multiple volumes files.
  • -sfsck, Presumably: do not run fsck before backing up the disk.
  • -enc, Encrypt the image.
  • -p choose, When save/restoration finishes, choose action in the client, poweroff, reboot (default), in command prompt or run CMD
  • savedisk, Backup the full disk rather than (a) specific partition(s).
  • 2018-08-03-img-t-3.9-testing, Destination file.
  • sdc, Source disk.

(The ocs-sr manpage is broken, but you can see all options for saving and restoring with sudo /usr/sbin/ocs-sr --help.)

If I understand correctly, the command you tested was:

  • partclone.dd -z 10485760 -N -L /var/log/partclone.log -s /dev/sdc2 --output - | pigz -c --fast -b 1024 -p 16 --rsyncable | split -a 2 -b 4096MB - /home/partimag/2018-07-29-12-img/sdc2.dd-img. 2> /tmp/split_error.rc2Gz1

Did you extract this commandline from Clonezilla? (I thought it displayed the underlying partclone command, but now I'm only seeing the ocs-sr line above.) partclone.dd is Clonezilla's fallback option if it can't figure out what filesystem to use:

...you can clone GNU/Linux, MS windows, Intel-based Mac OS, FreeBSD, NetBSD, OpenBSD, Minix, VMWare ESX and Chrome OS/Chromium OS, no matter it's 32-bit (x86) or 64-bit (x86-64) OS. For these file systems, only used blocks in partition are saved and restored by Partclone. For unsupported file system, sector-to-sector copy is done by dd in Clonezilla.

I thought perhaps it was able to use something smarter for LUKS volumes, but if you got that commandline from Clonezilla, then I guess not. Maybe the difference was the fact that you only had 800MB free, so no empty "chunks" were sent to pigz?

My guess is that the difference comes from zeroes that remain at the end of the partition on the space that has never been used yet. I also guess that once all the space has been used and freed, it doesn't have zeroes anymore and won't compress well.

Seems likely. I've been meaning to test a backup of a restored image. Given that restoring an empty persistence volume seems to take just as long as restoring a full one, I'm guessing it fills in the (formerly) unused space on restore. Which would make subsequent backups full-sized.

In other words, this is kind of a best-case optimisation. But, like I say, it served me well for a year, so it may be a relatively common best-case...

#18 Updated by sajolida over 1 year ago

  • Status changed from Confirmed to Resolved
  • Assignee deleted (blue9)
  • Target version set to Tails_3.9
  • QA Check deleted (Info Needed)

Sorry for the delay.

Don't worry, things at Tails tend to go slowly but surely.

/usr/sbin/ocs-sr -q2 -c -j2 -z1p -i 4096 -sfsck -enc -p choose savedisk 2018-08-03-img-t-3.9-testing sdc

Thanks a lot for doing more tests!

If I understand correctly, the command you tested was:

  • partclone.dd -z 10485760 -N -L /var/log/partclone.log -s /dev/sdc2 --output - | pigz -c --fast -b 1024 -p 16 --rsyncable | split -a 2 -b 4096MB - /home/partimag/2018-07-29-12-img/sdc2.dd-img. 2> /tmp/split_error.rc2Gz1

Did you extract this commandline from Clonezilla?

I got it from the output before it switches to the ncurses progress bars. See partclone.png in attachment.

I thought perhaps it was able to use something smarter for LUKS volumes, but if you got that commandline from Clonezilla, then I guess not.

I ran your command, removed '-enc' and '-sfsck', which were not working for some reason, and got a partclone.dd command like mine:

partclone.dd -z 10485760 -N -L /var/log/partclone.log -s /dev/sdc2 --output - | pigz -c --fast -b 1024 -p 16 --rsyncable | split -a 2 -b 4096MB - /home/partimag/2018-08-03-img-t-3.9-testing/sdc2.dd-img. 2> /tmp/split_error.zJ70Aq

Maybe the difference was the fact that you only had 800MB free, so no empty "chunks" were sent to pigz?

I think the difference comes from the fact that my USB stick probably had no zeros on its surface as it had been entirely filled up and then emptied.

Given that restoring an empty persistence volume seems to take just as long as restoring a full one, I'm guessing it fills in the (formerly) unused space on restore. Which would make subsequent backups full-sized.

The restored partition would be exactly the same as the original partition. So if the original partition had (compressed) zeros at the end, the restored partition will have them too.
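
A round trip on scratch data illustrates this. The sketch below mimics the shape of the Clonezilla pipeline (compress, then split into chunks) with gzip and split, then restores by concatenating and decompressing. The file names, sizes, and chunk size are assumptions for illustration; it does not touch a real partition.

```shell
#!/bin/sh
# Work in a throwaway directory that is removed when the script exits.
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT

# Fake partition: 2 MiB of random data followed by 2 MiB of zeros.
original="$workdir/partition.bin"
{ head -c 2097152 /dev/urandom; head -c 2097152 /dev/zero; } > "$original"

# Back up: compress the stream and split it into 1 MiB chunks (.aa, .ab, ...).
gzip -c -1 "$original" | split -a 2 -b 1048576 - "$workdir/backup.img."

# Restore: concatenate the chunks in name order and decompress.
cat "$workdir"/backup.img.* | gzip -dc > "$workdir/restored.bin"

# Compare the restored image with the original byte for byte.
if cmp -s "$original" "$workdir/restored.bin"; then
    roundtrip=identical
else
    roundtrip=different
fi
echo "restored image is $roundtrip"
```

Because the restored bytes match the original, trailing zeros survive the round trip, and a later backup of the restored partition should compress just as well as the first one did.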

In other words, this is kind of a best-case optimisation. But, like I say, it served me well for a year, so it may be a relatively common best-case...

Understood. But given that we have to choose between simplicity and saving some disk space sometimes, I really prefer choosing simplicity.

Also, Clonezilla live might not behave exactly like Tails on some hardware, BIOS, UEFI, etc. configurations. So relying on Tails will also avoid hardware compatibility issues.

So I'm done with this ticket: GNOME Disks works great and Clonezilla is not better for us.
