Bug #17154

Improve entropy gathering

Added by segfault about 1 month ago. Updated 18 days ago.

Status:
Confirmed
Priority:
Normal
Assignee:
-
Category:
-
Target version:
-
Start date:
Due date:
% Done:

0%

Feature Branch:
Type of work:
Code
Blueprint:
Starter:
Affected tool:

Description

The Linux kernel uses multiple sources of randomness to initialize its cryptographically secure pseudo-random number generator (CSPRNG). This includes various sources with dubious quality wrt. randomness: the kernel command-line, serial numbers, MAC addresses, timing information...

This is totally fine, because most of these sources are not credited as good/reliable entropy, which means that the values are mixed into the entropy pool, but they do not increase the entropy counter. (By default, the kernel currently only credits inter-interrupt timings and inter-keyboard timings).

Once the entropy counter reaches a certain threshold (currently 512 bits, although reducing that to 256 bits is being discussed on the kernel mailing list), the entropy pool is marked as initialized.
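To make the difference between "mixed in" and "credited" concrete, here is a toy model. It is only a sketch: the class name, the SHA-256 state, and the "1 bit per timing event" credit are assumptions made for illustration, not how the kernel implements its pool.

```python
import hashlib


class ToyEntropyPool:
    """Toy model of mixing vs. crediting. NOT the kernel's algorithm."""

    def __init__(self, threshold_bits: int = 512) -> None:
        self._state = hashlib.sha256()
        self.credited_bits = 0
        self.threshold_bits = threshold_bits

    def mix(self, data: bytes, credit_bits: int = 0) -> None:
        # Every input changes the pool state...
        self._state.update(data)
        # ...but only credited inputs move the entropy counter.
        self.credited_bits += credit_bits

    @property
    def initialized(self) -> bool:
        return self.credited_bits >= self.threshold_bits


pool = ToyEntropyPool()
pool.mix(b"00:11:22:33:44:55")          # MAC address: mixed in, 0 bits credited
pool.mix(b"BOOT_IMAGE=/vmlinuz quiet")  # command line: mixed in, 0 bits credited
print(pool.credited_bits, pool.initialized)  # 0 False

for i in range(512):
    # Stand-in for credited timing events (1 bit each is an assumption).
    pool.mix(i.to_bytes(4, "little"), credit_bits=1)
print(pool.credited_bits, pool.initialized)  # 512 True
```

The uncredited inputs still influence the state, so they cannot hurt, but they never bring the pool closer to being marked as initialized.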

Until the entropy pool is marked as initialized, reads from /dev/random and calls to the getrandom syscall block, and reads from /dev/urandom return random numbers that are not cryptographically secure.
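A quick way to check from userspace whether the pool has been marked as initialized is to call getrandom in non-blocking mode. The following is a minimal sketch using Python's os.getrandom wrapper; the function name rng_initialized is made up for this example.

```python
import os
from typing import Optional


def rng_initialized() -> Optional[bool]:
    """True/False if the kernel can tell us, None where getrandom() is unavailable."""
    if not hasattr(os, "getrandom"):
        return None  # not Linux, or a Python older than 3.6
    try:
        os.getrandom(1, os.GRND_NONBLOCK)
    except BlockingIOError:
        return False  # the call would have blocked: pool not initialized yet
    except OSError:
        return None   # e.g. a kernel without the getrandom syscall
    return True


print(rng_initialized())
```

On a system that has been running for a while this should report True; the interesting case is running it very early during boot.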

If the entropy pool is seeded with predictable inputs that are nevertheless credited as entropy, all of /dev/random, /dev/urandom, and getrandom return random numbers that are not cryptographically secure.

Both Debian and Tails currently add additional sources which do increase the entropy counter. I would like to re-evaluate the use of those sources.


Related issues

Related to Tails - Feature #7102: Evaluate how safe haveged is in a virtualized environment Confirmed 04/17/2014
Related to Tails - Feature #5650: rngd Resolved
Related to Tails - Bug #17124: Install Linux 5.3 from sid Resolved

History

#1 Updated by segfault about 1 month ago

  • Description updated (diff)

#2 Updated by segfault about 1 month ago

In Tails we currently add two services which fill the entropy pool: haveged and rngd.

haveged implements the HAVEGE algorithm to gather randomness from CPU timings. It runs as a userspace service, fills the entropy pool immediately when it is started, and keeps refilling it whenever reads from /dev/random cause the kernel's entropy count to fall low¹.

¹ It doesn't really make sense that reading /dev/random reduces the entropy count. Once the CSPRNG is initialized with a good random seed, it can produce a lot of cryptographically secure random numbers. That is why kernel devs now deeply regret this behavior of /dev/random:
https://lore.kernel.org/linux-ext4/20190916170028.GA15263@mit.edu/

There are multiple issues with haveged:

  • It tries to use timing information from CPU instructions while running in userspace, where it is subject to the kernel's scheduler, which could impact the randomness of the timings [1]
  • The CPU instruction it uses (RDTSC) returns predictable results in some virtualized environments [2]
  • No one seems to know whether haveged actually provides any good randomness. AFAIK, it was never thoroughly analyzed by experts. The haveged tests which are supposed to evaluate the produced randomness also pass if haveged is fed with a constant input instead of the CPU timings [3].

[1] https://twitter.com/mjg59/status/1181426468519383041
[2] https://tls.mbed.org/tech-updates/security-advisories/polarssl-security-advisory-2011-02
[3] http://jakob.engbloms.se/archives/1374
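The last point reflects a general limitation of statistical tests: they measure how the output looks, not how much entropy went in. The sketch below is not haveged's test suite. It hashes a counter with a fixed, publicly known seed (so the stream contains no entropy at all) and applies a simple monobit frequency check; the function names and the 0.49 to 0.51 acceptance window are assumptions chosen for illustration.

```python
import hashlib


def deterministic_stream(n_blocks: int, seed: bytes = b"constant") -> bytes:
    """SHA-256 over seed||counter with a fixed seed: fully predictable output."""
    return b"".join(
        hashlib.sha256(seed + i.to_bytes(8, "big")).digest()
        for i in range(n_blocks)
    )


def ones_fraction(data: bytes) -> float:
    """Monobit statistic: fraction of bits that are set to 1."""
    ones = sum(bin(byte).count("1") for byte in data)
    return ones / (8 * len(data))


stream = deterministic_stream(1000)  # 256,000 bits
frac = ones_fraction(stream)
print(f"fraction of ones: {frac:.4f}, looks random: {0.49 < frac < 0.51}")
```

Anyone who knows the seed can reproduce this stream exactly, yet it passes the frequency check. Passing such tests is therefore no evidence that the input was unpredictable.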

rngd uses the output from a hardware random number generator (hwrng), if any, to fill the entropy pool. There are also issues with rngd:
  • First, it's pretty much obsolete. By far the most common hwrngs are the ones built into modern x86 processors. Those can be accessed via the RDRAND instruction. Since 4.19, the Linux kernel already supports seeding the entropy pool via that instruction, either by compiling it with CONFIG_RANDOM_TRUST_CPU=y or by starting it with the random.trust_cpu=on command-line option [4]. Since Stretch, Debian compiles the kernel with CONFIG_RANDOM_TRUST_CPU=y [5], so currently, the kernel already credits entropy from RDRAND in Tails. Granted, it's still possible that Tails is run on a system with a different hwrng, which is supported by rngd but not by the kernel.
  • We probably don't want to use RDRAND to seed the CSPRNG. It can't be independently audited, which means that you have to trust that Intel (or a three-letter agency) did not install a backdoor [6][7]. That means that, from a security point of view, the best option would be to remove rngd and add random.trust_cpu=off to the kernel command-line, to prevent the output of RDRAND from being credited to the entropy pool. Note that in that case the kernel still mixes output from RDRAND into the entropy pool; it just doesn't credit it anymore, so we don't weaken our entropy with random.trust_cpu=off.

[4] https://outflux.net/blog/archives/2018/10/22/security-things-in-linux-v4-19/
[5] https://lists.debian.org/debian-devel/2019/02/msg00170.html
[6] https://lkml.org/lkml/2018/7/17/1279
[7] https://gist.github.com/mimoo/5957603f5aa5f0cded33e55f930644cb
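To verify which random.trust_cpu setting a running system was booted with, one could inspect /proc/cmdline. This is a rough sketch: the helper names are made up, the accepted spellings besides on/off are an assumption about the kernel's boolean parsing, and so is the rule that the last occurrence wins. A result of None only means the option is absent, in which case the compiled-in CONFIG_RANDOM_TRUST_CPU default applies.

```python
from typing import Optional

# Spellings other than "on"/"off" are an assumption, not taken from the ticket.
_TRUE = {"on", "1", "y"}
_FALSE = {"off", "0", "n"}


def trust_cpu_setting(cmdline: str) -> Optional[bool]:
    """Return the last random.trust_cpu= value found, or None if absent."""
    result = None
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        if key == "random.trust_cpu" and sep:
            value = value.lower()
            if value in _TRUE:
                result = True
            elif value in _FALSE:
                result = False
    return result


def running_kernel_setting(path: str = "/proc/cmdline") -> Optional[bool]:
    """Same check against the command line of the running kernel."""
    try:
        with open(path, encoding="ascii", errors="replace") as fh:
            return trust_cpu_setting(fh.read())
    except OSError:
        return None  # no procfs available


print(trust_cpu_setting("BOOT_IMAGE=/vmlinuz quiet random.trust_cpu=off"))  # False
print(running_kernel_setting())
```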

#3 Updated by segfault about 1 month ago

(I started drafting a discussion of the UX impact of removing both rngd and haveged but won't finish that now)

#4 Updated by intrigeri about 1 month ago

  • Related to Feature #7102: Evaluate how safe haveged is in a virtualized environment added

#5 Updated by intrigeri about 1 month ago

#6 Updated by segfault about 1 month ago

  • Description updated (diff)

#7 Updated by segfault about 1 month ago

In Linux 5.4, the kernel will try to gather entropy itself via CPU timing noise (jitter), similar to what haveged does in userspace [1]. The quality of the randomness produced that way is still debated (although the concerns are more about simpler CPU architectures, not x86) [2], and to me it seems like a pretty rushed decision, made by Linus himself. Anyway, if that patch gets released, whether to use jitter entropy or not won't be our decision to make anymore.

[1] https://github.com/torvalds/linux/commit/50ee7529ec4500c88f8664560770a7a1b65db72b
[2] https://lore.kernel.org/lkml/20190930033706.GD4994@mit.edu/

I expect that on systems supported by Tails (64-bit x86), the jitter entropy generator will work quite well, so that even if we remove haveged and rngd, applications won't have to wait for a long time for the RNG to be initialized. We should test that once we can upgrade to Linux 5.4.

#8 Updated by cypherpunks 19 days ago

Oh dear, that patch for Linux looks horrible. I mean, more horrible than usual. There's already jitterentropy which can be used to inject randomness and is based on a detailed study about certain nondeterministic aspects of CPU behavior (although even it, like HAVEGE, is not great). Why would they not use that? Why would they try to create their own naive jitter entropy collector?

Anyway, I've mentioned this in past tickets, but I dislike the use of haveged the way it is used now. A better solution would be to have it in a cron job to periodically write to /dev/urandom, so that it doesn't issue the IOCTL that increases the entropy estimate with potentially dubious entropy (this only matters during early boot or after a state compromise, of course). I recall that the main reason why that idea was shot down was that GnuPG (foolishly) uses /dev/random instead of /dev/urandom and eats many kilobytes of data, which makes generating keys take a very long time. The cryptographic necessity of this can be trivially disproven with a simple look at the complexity of GNFS. I think libotr in Pidgin does that too?
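For reference, the two ways of feeding data to the kernel differ roughly as sketched below. A plain write to /dev/urandom mixes the data in without changing the entropy estimate, while the RNDADDENTROPY ioctl takes a struct rand_pool_info that carries an explicit credit in bits and requires root. The helper names are made up for this example, and the ioctl number is the x86-64 value, stated here as an assumption for other architectures. The ioctl itself is deliberately not executed.

```python
import struct

RNDADDENTROPY = 0x40085203  # _IOW('R', 0x03, int[2]); value as on x86-64 Linux


def mix_without_credit(data: bytes, path: str = "/dev/urandom") -> int:
    """Write data to the RNG device; a plain write does not credit any entropy."""
    with open(path, "wb", buffering=0) as dev:
        return dev.write(data)


def rand_pool_info(data: bytes, credit_bits: int) -> bytes:
    """Pack struct rand_pool_info { int entropy_count; int buf_size; __u32 buf[]; }."""
    return struct.pack("ii", credit_bits, len(data)) + data


# What a periodic (e.g. cron-driven) job could do with output from some generator:
#   mix_without_credit(generator_output)
#
# What a crediting daemon does instead (needs root, not executed here):
#   fcntl.ioctl(fd, RNDADDENTROPY, rand_pool_info(generator_output, credit_bits))
```

With the first variant, dubious input can only improve the pool state and never inflates the estimate; the price is that it does not help anything that blocks on /dev/random.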

Oh, and it's all moot anyway with kernel.random.read_wakeup_threshold=64 by default, which breaks catastrophic reseeding after state compromise. Perhaps I should open a new ticket here to change that default to 128? I digress.
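The current values of these tunables can be read from procfs. A small sketch, with a made-up helper name; it returns None where a file does not exist, which may be the case for read_wakeup_threshold on kernels that no longer expose it.

```python
from pathlib import Path
from typing import Optional


def random_sysctl(name: str, base: str = "/proc/sys/kernel/random") -> Optional[int]:
    """Read an integer tunable such as entropy_avail or read_wakeup_threshold."""
    try:
        return int((Path(base) / name).read_text().strip())
    except (OSError, ValueError):
        return None  # file missing, unreadable, or not an integer


for name in ("entropy_avail", "poolsize", "read_wakeup_threshold"):
    print(name, random_sysctl(name))
```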

#9 Updated by cypherpunks 19 days ago

segfault wrote:

The Linux kernel uses multiple sources of randomness to initialize its cryptographically secure pseudo-random number generator (CSPRNG). This includes various sources with dubious quality wrt. randomness: the kernel command-line, serial numbers, MAC addresses, timing information...

This is incorrect. The kernel uses add_device_randomness() for data on the kernel command line, serial numbers, MAC addresses, etc. This does not credit entropy. In fact, its purpose is not even to be unpredictable, merely to ensure that a worst-case scenario where there is no natural entropy will not result in a hundred embedded devices choosing the same UUIDs. The entropy pool is initialized only after sufficient interrupts occur (see source code for details). The predictable device randomness does not credit entropy bits at all.

As for timing information being of dubious quality, that's untrue as well. It is a major part of the BCP 106 recommendation for entropy collection. Timing information is taken for interrupts with add_interrupt_randomness(), and for other unpredictable events with add_timer_randomness() which itself is called in e.g. add_input_randomness(). These are completely unpredictable as long as input to the system is unpredictable, as with keystrokes and non-deterministic (due to air turbulence) behavior wrt hard drive actuator movements. The BSI paper on the Linux RNG gives more rationale.

tl;dr Those dubious sources you list aren't a problem because they aren't used to initialize the RNG state, and timing information is not a bad source of entropy as used by the Linux kernel.

#10 Updated by segfault 18 days ago

cypherpunks wrote:

segfault wrote:

The Linux kernel uses multiple sources of randomness to initialize its cryptographically secure pseudo-random number generator (CSPRNG). This includes various sources with dubious quality wrt. randomness: the kernel command-line, serial numbers, MAC addresses, timing information...

This is incorrect. The kernel uses add_device_randomness() for data on the kernel command line, serial numbers, MAC addresses, etc. This does not credit entropy. In fact, its purpose is not even to be unpredictable, merely to ensure that a worst-case scenario where there is no natural entropy will not result in a hundred embedded devices choosing the same UUIDs.

Did you read the sentence after the one you quoted? I said there that these sources don't get credited, and explain later that the entropy pool is only marked as initialized after enough entropy was credited.

The entropy pool is initialized only after sufficient interrupts occur (see source code for details). The predictable device randomness does not credit entropy bits at all.

That's exactly what I wrote in the description.

As for timing information being of dubious quality, that's untrue as well. It is a major part of the BCP 106 recommendation for entropy collection. Timing information is taken for interrupts with add_interrupt_randomness(), and for other unpredictable events with add_timer_randomness() which itself is called in e.g. add_input_randomness(). These are completely unpredictable as long as input to the system is unpredictable, as with keystrokes and non-deterministic (due to air turbulence) behavior wrt hard drive actuator movements. The BSI paper on the Linux RNG gives more rationale.

The concern is that it's not clear whether there was enough (or even any) unpredictable input at the time when the timing information is used. And I don't consider myself enough of an expert in this area to raise this concern, I'm just citing people I trust to have more expertise and whose arguments I find convincing.

#11 Updated by intrigeri 5 days ago

  • Related to Bug #17124: Install Linux 5.3 from sid added
