Feature #11788

Enable TRIM support on all SSD-backed storage

Added by intrigeri over 3 years ago. Updated 27 days ago.

Status:
In Progress
Priority:
Normal
Assignee:
-
Category:
Infrastructure
Target version:
-
Start date:
09/10/2016
Due date:
% Done:

20%

Feature Branch:
Type of work:
Sysadmin
Blueprint:
Starter:
Affected tool:

Description

We should enable TRIM (a.k.a. discard) on our servers, to avoid performance issues popping up sooner or later.

lsblk --discard allows checking which block devices pass through TRIM commands.

History

#1 Updated by intrigeri almost 3 years ago

  • Status changed from Confirmed to In Progress
  • % Done changed from 0 to 10

Puppet and other tools support for discard config & usage:

  • MD RAID: unclear whether MD forwards discard requests, RHEL 6.5 doc says it does but that might be thanks to their custom kernel
  • lvm.conf: augeas 1.3+ (in Stretch) supports it (/files/etc/lvm/lvm.conf/devices/dict/issue_discards/)
  • fstab: trivially supported by Puppet (unless we decide to go for fstrim instead)
  • crypttab: augeas seems to support it

#2 Updated by intrigeri over 2 years ago

  • Description updated (diff)

#3 Updated by intrigeri over 2 years ago

intrigeri wrote:

  • MD RAID: unclear whether MD forwards discard requests, RHEL 6.5 doc says it does but that might be thanks to their custom kernel

We'll see.

  • lvm.conf: augeas 1.3+ (in Stretch) supports it (/files/etc/lvm/lvm.conf/devices/dict/issue_discards/)

Enabled on all virtualization hosts.

  • fstab: trivially supported by Puppet (unless we decide to go for fstrim instead)

I think I'll go with /usr/share/doc/util-linux/examples/fstrim.{service,timer} instead of fiddling with tons of fstab entries.

  • crypttab: augeas seems to support it

Enabled on all LUKS volumes on all our virtualization hosts.

We'll see if that's enough to allow manual fstrim once these boxes have rebooted. If it works, then I'll enable the systemd service + timer.
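To double-check that no LUKS volume was missed, something like the following sketch could be run against /etc/crypttab. It assumes the usual four-column layout (name, device, key file, comma-separated options) and treats `discard` as the option to look for. The function name and the sample entries are made up for illustration.

```python
def crypttab_entries_without_discard(text):
    """Return the names of crypttab entries whose options lack `discard`."""
    missing = []
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and comments.
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        # The options column is optional; treat a missing one as empty.
        options = fields[3].split(",") if len(fields) > 3 else []
        if "discard" not in options:
            missing.append(fields[0])
    return missing


# Hypothetical sample content, not taken from any real host.
SAMPLE = """\
# <name> <device> <keyfile> <options>
md1_crypt /dev/md1 none luks,discard
md2_crypt /dev/md2 none luks
"""
print(crypttab_entries_without_discard(SAMPLE))
```

On a real system, the text would come from reading /etc/crypttab instead of the sample string.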

#4 Updated by intrigeri over 2 years ago

  • % Done changed from 10 to 20

#5 Updated by intrigeri over 2 years ago

Added discard='unmap' to all libvirt guest definitions.

#6 Updated by intrigeri over 2 years ago

Actually virtio-blk does not support discard yet (https://wiki.qemu.org/ToDo/Block#virtio-blk_discard_support_.5BPeter_Lieven.5D) but some patches were discussed on LKML a few months ago.

#7 Updated by intrigeri 11 months ago

intrigeri wrote:

Actually virtio-blk does not support discard yet

After reading https://mpolednik.github.io/2017/01/23/virtio-blk-vs-virtio-scsi/, it seems that the best thing to do (not only wrt. discard) is to switch to virtio-scsi. Unfortunately, this will rename all drives in VMs (e.g. vda → sda), so it will be quite a disruptive operation that requires carefully coordinated Puppet changes, initramfs & bootloader updates, and downtime. I say let's skip that until we notice an actual performance decrease caused by the lack of discard support.

Next step: look at long-term I/O performance trends in 6-12 months to see if SSD performance decreases.

#8 Updated by intrigeri 10 months ago

  • Assignee deleted (intrigeri)

#9 Updated by intrigeri 9 months ago

intrigeri wrote:

Actually virtio-blk does not support discard yet

It does in QEMU 4.0, which we won't have in Buster yet, so that's for whenever we upgrade our virtualization host to Bullseye, i.e. probably in the 2nd half of 2021.

#10 Updated by intrigeri about 1 month ago

On my sid system, I have a version of QEMU that supposedly supports TRIM for virtio-blk, but lsblk --discard in the guest disagrees.

I've switched a VM to virtio-scsi and TRIM support does work. One minor caveat is that iothreads can only be assigned per-SCSI-controller, and not per disk, so to give each disk its own iothread, one needs 1 virtio-scsi controller per disk, and to assign each disk to its own controller.
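The one-controller-per-disk layout can be sketched as generated libvirt domain XML fragments. This is only an illustration, not the configuration actually deployed: it assumes block-device-backed raw disks, reuses the discard='unmap' driver attribute mentioned earlier, and names targets sda, sdb, and so on (so it only covers up to 26 disks). The function name and device paths are hypothetical.

```python
import xml.etree.ElementTree as ET


def scsi_disk_elements(block_devices):
    """Build libvirt XML elements giving each disk its own controller and iothread.

    Returns (iothreads, controllers, disks). iothread IDs start at 1,
    controller indexes start at 0.
    """
    # <iothreads>N</iothreads> declares how many iothreads the domain has.
    iothreads = ET.Element("iothreads")
    iothreads.text = str(len(block_devices))

    controllers, disks = [], []
    for index, dev in enumerate(block_devices):
        # One virtio-scsi controller per disk, pinned to its own iothread.
        ctrl = ET.Element("controller", type="scsi",
                          index=str(index), model="virtio-scsi")
        ET.SubElement(ctrl, "driver", iothread=str(index + 1))
        controllers.append(ctrl)

        # The disk is attached to that controller via its <address>.
        disk = ET.Element("disk", type="block", device="disk")
        ET.SubElement(disk, "driver", name="qemu", type="raw", discard="unmap")
        ET.SubElement(disk, "source", dev=dev)
        ET.SubElement(disk, "target", dev="sd" + chr(ord("a") + index), bus="scsi")
        ET.SubElement(disk, "address", type="drive", controller=str(index),
                      bus="0", target="0", unit="0")
        disks.append(disk)
    return iothreads, controllers, disks


# Hypothetical device paths, for illustration only.
_, ctrls, dsks = scsi_disk_elements(["/dev/vg0/root", "/dev/vg0/data"])
for element in ctrls + dsks:
    print(ET.tostring(element, encoding="unicode"))
```

The printed fragments would still need to be merged into the `<domain>` and `<devices>` sections of each guest definition by hand or by Puppet.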

#11 Updated by intrigeri 27 days ago

intrigeri wrote:

  • fstab: trivially supported by Puppet (unless we decide to go for fstrim instead)

I think I'll go with /usr/share/doc/util-linux/examples/fstrim.{service,timer} instead of fiddling with tons of fstab entries.

The recommended way these days is indeed fstrim.{service,timer} and not the discard mount option. I've enabled fstrim.timer on all our systems: https://git.tails.boum.org/puppet-tails/commit/?id=28220dd0c78ceb115098e928aad3ea00e98df1a3

So the only remaining blocker here is the libvirt/QEMU storage layer.
