Bug #12689

gpg --recv-key often hangs due to unreliable keyserver

Added by dachary over 2 years ago. Updated 4 days ago.

Status: Fix committed
Priority: Normal
Assignee: -
Category: -
Target version:
Start date: 06/13/2017
Due date:
% Done: 100%
Feature Branch: bugfix/12689-more-reliable-OpenPGP-keyserver
Type of work: Code
Blueprint:
Starter:
Affected tool:

Description

    gpg --recv-key "2224 5C81 E3BA EB41 38B3 6061 310F 5612 00F4 AD77" 

This command hangs and then times out.

Steps to Reproduce

  • Install https://tails-dl.urown.net/tails/stable/tails-amd64-3.0/tails-amd64-3.0.iso with persistence and boot with an admin password.
  • Connect to internet (via wifi, no firewall)
  • Verify connectivity with sudo apt-get update
  • Open a terminal
  • $ cat /home/amnesia/.gnupg/dirmngr.conf
    use-tor
    keyserver hkp://jirk5u4osbsr34t5.onion
  • gpg --debug-all --recv-key "2224 5C81 E3BA EB41 38B3 6061 310F 5612 00F4 AD77"

Expected Behavior

The key is imported.

Actual Behavior


gpg: reading options from '/home/amnesia/.gnupg/gpg.conf'
gpg: enabled debug flags: packet mpi crypto filter iobuf memory cache memstat trust hashing ipc clock lookup extprog
gpg: DBG: [not enabled in the source] start
gpg: DBG: chan_3 <- # Home: /home/amnesia/.gnupg
gpg: DBG: chan_3 <- # Config: /home/amnesia/.gnupg/dirmngr.conf
gpg: DBG: chan_3 <- OK Dirmngr 2.1.18 at your service
gpg: DBG: connection to the dirmngr established
gpg: DBG: chan_3 -> GETINFO version
gpg: DBG: chan_3 <- D 2.1.18
gpg: DBG: chan_3 <- OK
gpg: DBG: chan_3 -> KS_GET -- 0x22245C81E3BAEB4138B36061310F561200F4AD77
gpg: keyserver receive failed: Connection timed out

Related issues

Related to Tails - Feature #12210: Deal with automated tests of onion services vs Chutney Confirmed 02/03/2017
Related to Tails - Bug #17169: Seahorse can't sync keys with keyservers: Request Entity Too Large Confirmed
Duplicated by Tails - Feature #16575: Use a more reliable OpenPGP key server by default Duplicate 03/19/2019
Duplicated by Tails - Bug #17090: Use keys.openpgp.org as the default key server Duplicate
Blocks Tails - Bug #14770: "Fetching OpenPGP keys" scenarios are fragile: communication failure with keyserver Fix committed 10/04/2017
Blocks Tails - Feature #16209: Core work: Foundations Team Confirmed

Associated revisions

Revision dbfbfa7b
Added by intrigeri 28 days ago

Use keys.openpgp.org's Onion service as the default keyserver (refs: #12689, #14770)

For background, see #12689 and its various duplicates. The short version is:

- Unfortunately, hkp://jirk5u4osbsr34t5.onion is way too unreliable.
- Most non-tech-savvy OpenPGP users don't use keyservers at all,
  so this change should not affect them much.
- Tech-savvy OpenPGP users who want to use the Web-of-Trust (which
  keys.openpgp.org's design essentially kills) should be able
  to switch to a keyserver of their choosing that serves
  non-self certifications.

Let's use the Onion service instead of hkps://keys.openpgp.org/, so that we
don't lose end-to-end encryption and authentication of the keyserver in
Seahorse, which doesn't support hkps://. Alternatively, we could use
hkps://keys.openpgp.org/ everywhere else, but it feels simpler to use the same
keyserver everywhere.

At this point, the only Tails systems that are affected by this change are those
run without GnuPG persistence, and newly created persistent GnuPG configuration.
Pre-existing persistent GnuPG configuration is not updated (yet).

On the test suite front:

- This commit keeps the Chutney-based redirector setup as-is, except that it
  now proxies requests to keys.openpgp.org instead of pool.sks-keyservers.net.
  This should work as long as keys.openpgp.org supports cleartext
  communication on port 11371.
- In theory, our long-term plan is to replace this with a local mock keyserver
  Onion service. We'll see if that's still worth the effort once we redirect
  requests to a more reliable upstream keyserver.
- I'm removing the @fragile tag for torified_gnupg.feature. There might
  be other reasons why these scenarios are fragile; let's learn about them.

Revision 1e21ac60
Added by intrigeri 27 days ago

Test suite: use current dkg's key (refs: #12689)

Fetching the old key from keys.openpgp.org fails with:

gpg: key 0xCCD2ED94D21739E9: new key but contains no user ID - skipped
gpg: Total number processed: 1
gpg: w/o user IDs: 1

Revision d3ca0a38
Added by intrigeri 27 days ago

Test suite: start adjusting for keys.openpgp.org (refs: #12689)

Non-self certifications are not served by this new keyserver, so we can't test
Seahorse's keys synchronization features via the number of signatures on a key
anymore, especially since GnuPG offers no way to delete self-certifications in
batch mode.

So let's instead test this feature via the number of subkeys.
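
A minimal sketch of such a subkey-based check, assuming the key listing comes from gpg's machine-readable --with-colons output, where each subkey is one record whose first field is "sub". The helper name count_subkeys is made up for illustration and is not taken from the Tails test suite.

```shell
#!/bin/sh
# Count subkeys in a GnuPG --with-colons key listing read from stdin.
# Each subkey appears as one record whose first colon-separated field is "sub".
count_subkeys() {
    awk -F: '$1 == "sub" { n++ } END { print n + 0 }'
}

# Hypothetical usage (FINGERPRINT is a placeholder set by the caller):
#   before=$(gpg --batch --with-colons --list-keys "$FINGERPRINT" | count_subkeys)
#   ... sync the key with the keyserver ...
#   after=$(gpg --batch --with-colons --list-keys "$FINGERPRINT" | count_subkeys)
#   [ "$after" -gt "$before" ] && echo "key was refreshed"
```

Comparing the count before and after a sync only detects a refresh when the local copy starts out with fewer subkeys than the keyserver's copy.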

Revision 6e146b1a
Added by intrigeri 27 days ago

Test suite: ensure dirmngr uses IPv4 since our CI runs on an IPv4-only infrastructure (refs: #12689)
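
The commit message does not say which mechanism is used. One option, sketched here as an assumption, is dirmngr's disable-ipv6 configuration option. The function name force_dirmngr_ipv4 is hypothetical.

```shell
#!/bin/sh
# Add "disable-ipv6" to a dirmngr configuration file unless it is already there.
# Assumption: this option is what keeps dirmngr on IPv4; the actual commit may differ.
force_dirmngr_ipv4() {
    conf=${1:-$HOME/.gnupg/dirmngr.conf}
    # Nothing to do if the option is already present on a line of its own.
    if grep -qx 'disable-ipv6' "$conf" 2>/dev/null; then
        return 0
    fi
    # If the file does not end with a newline, add one so the option
    # is not glued to the last existing line.
    if [ -s "$conf" ] && [ -n "$(tail -c 1 "$conf")" ]; then
        echo >> "$conf"
    fi
    echo 'disable-ipv6' >> "$conf"
}

# Hypothetical usage, followed by a dirmngr restart so the change is picked up:
#   force_dirmngr_ipv4 ~/.gnupg/dirmngr.conf && gpgconf --kill dirmngr
```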

Revision 206d1d7e
Added by intrigeri 27 days ago

Test suite: switch backend keyservers (refs: #12689, #14770)

First, we do have to do something here, as long as we use an Onion service as
our default keyserver: our Chutney is not able to connect to Onion
services (#12210).

1. dirmngr: use keys.openpgp.org's Onion service directly
=========================================================

That is, without going through a Chutney-based Onion service.

When dirmngr connects to the Onion service run by Chutney, the isotester
redirects the connection to keys.openpgp.org:11371; so far, so good. But then,
keys.openpgp.org redirects us to https://keys.openpgp.org, and the key retrieval
fails for some reason I don't fully understand:

dirmngr[10130]: connection from process 10145 (1000:1000)
dirmngr[10130]: DBG: chan_5 <- GETINFO version
dirmngr[10130]: DBG: chan_5 -> D 2.2.12
dirmngr[10130]: DBG: chan_5 -> OK
dirmngr[10130]: DBG: chan_5 <- KS_GET -- 0xC4BC2DDB38CCE96485EBE9C2F20691179038E5C6
dirmngr[10130]: DBG: gnutls:L3: ASSERT: ../../../lib/x509/common.c[_gnutls_x509_get_raw_field2]:1570
dirmngr[10130]: DBG: gnutls:L3: ASSERT: ../../../lib/x509/x509.c[gnutls_x509_crt_get_subject_unique_id]:3897
dirmngr[10130]: DBG: gnutls:L3: ASSERT: ../../../lib/x509/x509.c[gnutls_x509_crt_get_issuer_unique_id]:3947
dirmngr[10130]: DBG: gnutls:L3: ASSERT: ../../../lib/x509/dn.c[_gnutls_x509_compare_raw_dn]:990
dirmngr[10130]: number of system provided CAs: 128
dirmngr[10130]: DBG: gnutls:L5: REC[0x78432c2d1e10]: Allocating epoch #0
dirmngr[10130]: DBG: gnutls:L2: added 6 protocols, 29 ciphersuites, 18 sig algos and 9 groups into priority list
dirmngr[10130]: URL 'http://nl5vtjfpfz2llza7.onion:5858/pks/lookup?op=get&options=mr&search=0xC4BC2DDB38CCE96485EBE9C2F20691179038E5C6' redirected to 'https://keys.openpgp.org/pks/lookup?op=get&options=mr&search=0xC4BC2DDB38CCE96485EBE9C2F20691179038E5C6' (301)
dirmngr[10130]: DBG: gnutls:L5: REC[0x78432c2d1e10]: Start of epoch cleanup
dirmngr[10130]: DBG: gnutls:L5: REC[0x78432c2d1e10]: End of epoch cleanup
dirmngr[10130]: DBG: gnutls:L5: REC[0x78432c2d1e10]: Epoch #0 freed
dirmngr[10130]: command 'KS_GET' failed: Forbidden <Unspecified source>
dirmngr[10130]: DBG: chan_5 -> ERR 251 Forbidden <Unspecified source>
dirmngr[10130]: DBG: chan_5 <- BYE
dirmngr[10130]: DBG: chan_5 -> OK closing connection

dirmngr has a rather complex history wrt. TLS certificate validation vs.
HTTP redirections vs. DNS resolution, so I'm not surprised that this fails.
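
To spot this kind of redirect quickly in a long dirmngr debug log, something like the following sketch may help. It relies only on the "URL '...' redirected to '...' (301)" line format shown in the log above; the function name find_redirects is made up for illustration.

```shell
#!/bin/sh
# Print every HTTP redirect recorded in a dirmngr log read from stdin,
# as "<status> <original URL> -> <target URL>".
find_redirects() {
    sed -n "s/.*URL '\([^']*\)' redirected to '\([^']*\)' (\([0-9]*\)).*/\3 \1 -> \2/p"
}

# Hypothetical usage, assuming the log was saved to dirmngr.log:
#   find_redirects < dirmngr.log
```

A target URL starting with https:// on a request that began as plain HTTP is the situation described in this commit message.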

The code I'm disabling here was introduced for #12211. As I understand it, the
main goal there was to adjust our OpenPGP test cases to the fact we had switched
Tails to a .onion keyserver. It's not 100% clear to me why we added the layer of
indirection I'm removing here, instead of doing something similar to this
commit: I see no explanation on the ticket nor in the corresponding
commit messages.

So let's drop the extra complexity of going through a proxy Onion service on
Chutney, and instead reconfigure dirmngr to directly connect to
keys.openpgp.org. AFAICT, given that we were not previously testing the dirmngr
configuration we ship to our users either, the only test coverage we lose
here is testing that dirmngr can connect to an Onion service.

2. Seahorse: use a keyserver that meets our (so far implicit) requirements
==========================================================================

At least one member of pool.sks-keyservers.net does not satisfy the implicit
assumption that the code added for #12211 relied upon. This commit fixes that
by always using a backend, upstream keyserver that satisfies our assumptions.

Other than that, we stick to the previous setup here. This seems to be the only
vaguely viable option given we can't configure Seahorse to use keys.openpgp.org:
that keyserver redirects to HTTPS, which Seahorse does not support.

I'm (re-)tagging the Seahorse scenarios @fragile: in the end, it's not clear
whether this branch will improve things for those scenarios.

Revision b53b55c5
Added by intrigeri 27 days ago

Drop broken Torbirdy pref to tweak the Enigmail keyserver (refs: #12689)

This pref is overridden by Torbirdy. Given it's not clear for how long we'll be
able to ship Torbirdy, I won't bother investigating, so for now Enigmail will
still use the old-school keyserver Onion service pool.

Partially reverts dbfbfa7b11857c0adba9b3c6cd38da3a26e2228b.

Revision c73634f6
Added by Sandro Knauß 4 days ago

Merge branch 'bugfix/12689-more-reliable-OpenPGP-keyserver' (Fix-committed: #12689)

History

#1 Updated by redshiftzero over 2 years ago

I also see this behavior, it looks like the issue is with the hkp server jirk5u4osbsr34t5.onion, e.g.:

gpg --keyserver hkp://pool.sks-keyservers.net --recv-key "2224 5C81 E3BA EB41 38B3 6061 310F 5612 00F4 AD77"

works.

#2 Updated by goupille over 2 years ago

  • Assignee set to goupille
  • Type of work changed from Debian to Research

I couldn't reproduce this issue with Tails 3.0. Could you try again yourself?

#3 Updated by dachary over 2 years ago

Today it works for me as well. I guess the server was down for a day or so?

#4 Updated by intrigeri over 2 years ago

  • Status changed from New to Rejected
  • Assignee deleted (goupille)

Our automated test suite has also identified some on-and-off issues with that keyserver pool. We can keep track of this problem this way, and if it comes back more often we should come back here.

#5 Updated by dachary over 2 years ago

Is there a workaround for when the default server is down? I suppose it is included by default because it is trusted. Are there other trusted keyservers to use when this one is not available?

#6 Updated by intrigeri over 2 years ago

https://sks-keyservers.net/overview-of-pools.php

(But no, there's not really any "trusting" involved.)
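
As a rough sketch of such a workaround: try a list of keyservers in order, with a time limit on each attempt so that a hanging server does not block the others. This uses the same gpg --keyserver ... --recv-key invocation as comment #1. The function name and the GPG and RECV_TIMEOUT variables are made up for this example, and which keyservers to list is left to the user.

```shell
#!/bin/sh
# Try each keyserver in turn until one delivers the key.
# GPG (path to the gpg binary) and RECV_TIMEOUT (seconds per attempt)
# are hypothetical knobs introduced for this sketch.
recv_key_with_fallback() {
    fingerprint=$1
    shift
    for server in "$@"; do
        # timeout(1) from coreutils kills gpg if the keyserver hangs.
        if timeout "${RECV_TIMEOUT:-60}" "${GPG:-gpg}" \
            --keyserver "$server" --recv-key "$fingerprint"; then
            echo "fetched from $server"
            return 0
        fi
        echo "failed: $server" >&2
    done
    return 1
}

# Hypothetical usage:
#   recv_key_with_fallback "2224 5C81 E3BA EB41 38B3 6061 310F 5612 00F4 AD77" \
#       hkp://jirk5u4osbsr34t5.onion hkp://pool.sks-keyservers.net
```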

#7 Updated by fowlslegs over 2 years ago

gpg --keyserver hkp://pool.sks-keyservers.net --recv-key "2224 5C81 E3BA EB41 38B3 6061 310F 5612 00F4 AD77"

fails on Tails 3 beta 4 with error message

gpg: keyserver receive failed: No keyserver available.

So at least for me, @redshiftzero's workaround does not seem to work. That said, the default keyserver is described as an "experimental Tor OnionBalance hidden service is running as hkp://jirk5u4osbsr34t5.onion consisting of the servers marked with Tor support in the status list as backend," which IMO Tails should reconsider using by default, precisely because of its experimental nature and the fact that it doesn't seem highly reliable.

As @intrigeri correctly points out, and as discussed further in https://github.com/freedomofpress/securedrop/pull/1804, there is not much "trusting" of keyservers. Would it not be better to simply go back to using the HKPS SKS keyservers, which seem to have greater uptime/reliability? I'll leave that question up to the Tails team, but if I were part of it I think I would be arguing yes. As much as I like to use and support onion services, using them for keyservers and keyserver pools combined with OnionBalance does in fact seem too experimental for inclusion in the default Tails GPG configuration.

#8 Updated by eloquence about 1 year ago

I still encounter the same hangs with the default configuration, even in Tails 3.9.1. This issue was rejected a year ago in favor of monitoring long term trends; would it be possible to get an update on what the automated test suite shows in terms of reliability of the default configuration? Apologies if this is already covered in another issue I missed in search.

#9 Updated by intrigeri about 1 year ago

  • Related to Bug #14770: "Fetching OpenPGP keys" scenarios are fragile: communication failure with keyserver added

#10 Updated by intrigeri about 1 year ago

  • Related to deleted (Bug #14770: "Fetching OpenPGP keys" scenarios are fragile: communication failure with keyserver)

#11 Updated by intrigeri about 1 year ago

  • Status changed from Rejected to Confirmed
  • Assignee set to emmapeel
  • QA Check set to Info Needed

> I still encounter the same hangs with the default configuration,
> even in Tails 3.9.1.

> This issue was rejected a year ago in favor of monitoring long term trends; would it be possible to get an update on what the automated test suite shows in terms of reliability of the default configuration?

Sadly, we're not going to get the data I expected this way: we don't use this keyserver in our test suite anymore since we started using Chutney (#12211).
So to monitor long term trends, we'll need to rely on feedback from our users, such as your report ⇒ dear emmapeel, how frequently do users report issues with fetching OpenPGP keys from keyservers?
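
For anyone who wants to gather such data themselves, here is a sketch of a probe that could be run periodically (for example from cron) and summarized later. It is not part of the Tails test suite; the function names, the log format, and the GPG and RECV_TIMEOUT variables are made up for illustration.

```shell
#!/bin/sh
# Record one fetch attempt against a keyserver as a line
# "<UTC timestamp> <server> <ok|fail>" in a log file.
# GPG and RECV_TIMEOUT are hypothetical knobs introduced for this sketch.
probe_keyserver() {
    server=$1
    fingerprint=$2
    log=${3:-keyserver-probe.log}
    if timeout "${RECV_TIMEOUT:-60}" "${GPG:-gpg}" \
        --keyserver "$server" --recv-key "$fingerprint" >/dev/null 2>&1; then
        result=ok
    else
        result=fail
    fi
    printf '%s %s %s\n' "$(date -u +%Y-%m-%dT%H:%M:%SZ)" "$server" "$result" >> "$log"
}

# Summarize a probe log as "<successful attempts>/<total attempts>".
success_rate() {
    awk '{ total++ } $3 == "ok" { ok++ } END { if (total) printf "%d/%d\n", ok, total }' "$1"
}
```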

#12 Updated by intrigeri 8 months ago

  • Assignee deleted (emmapeel)
  • QA Check deleted (Info Needed)

As a user of this keyserver myself, I confirm it's not up and working all the time.

#13 Updated by intrigeri 8 months ago

  • Duplicated by Feature #16575: Use a more reliable OpenPGP key server by default added

#14 Updated by intrigeri 8 months ago

#16575 has more info and ideas.

#15 Updated by intrigeri 8 months ago

The "keyserver network dying" thread on the monkeysphere mailing list (started on 2019-04-02) suggests that not only the Onion keyservers pool has problems these days.

#16 Updated by intrigeri 3 months ago

  • Subject changed from gpg --recv-key hangs to gpg --recv-key often hangs due to unreliable keyserver

This is one of the main causes of test suite failures these days (#14770).

#17 Updated by intrigeri 3 months ago

  • Blocks Bug #14770: "Fetching OpenPGP keys" scenarios are fragile: communication failure with keyserver added

#18 Updated by intrigeri 3 months ago

#19 Updated by intrigeri 3 months ago

Buster 10.1 will have gnupg2 2.2.12-1+deb10u1. This includes one change that's relevant here: "use keys.openpgp.org as the default keyserver". Note, however, that this change won't affect Tails as-is, because we configure our own keyserver (git grep -F jirk5u4osbsr34t5.onion; and IIRC Torbirdy does the same for Enigmail); these settings may have been copied to persistent ~/.gnupg/dirmngr.conf.
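
To check whether a given dirmngr.conf, persistent or not, still points at the old Onion pool, a sketch along these lines should do. The function names are made up for illustration. It assumes the "keyserver <URL>" line format shown in the bug description.

```shell
#!/bin/sh
# Print the keyserver URL(s) configured in a dirmngr.conf file.
configured_keyservers() {
    conf=${1:-$HOME/.gnupg/dirmngr.conf}
    [ -r "$conf" ] || return 1
    awk '$1 == "keyserver" { print $2 }' "$conf"
}

# Succeed if the file still references the old Onion keyserver pool.
uses_old_onion() {
    configured_keyservers "$1" | grep -qF 'jirk5u4osbsr34t5.onion'
}

# Hypothetical usage:
#   uses_old_onion ~/.gnupg/dirmngr.conf && echo "still using the old keyserver"
```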

#20 Updated by sajolida about 2 months ago

  • Related to Bug #17090: Use keys.openpgp.org as the default key server added

#21 Updated by intrigeri about 2 months ago

  • Duplicated by Bug #17090: Use keys.openpgp.org as the default key server added

#22 Updated by intrigeri about 2 months ago

  • Related to deleted (Bug #17090: Use keys.openpgp.org as the default key server)

#23 Updated by intrigeri 28 days ago

  • Status changed from Confirmed to In Progress
  • Assignee set to intrigeri

See bdcd73bfc5508dbcde552ade36f1db216d32f958, which removed the code that migrated keyservers in persistent GnuPG configuration. It could be useful if/when we migrate to a keyserver other than hkp://jirk5u4osbsr34t5.onion and want to forcibly migrate users' persistent config.

I'll give a quick try to switching to keys.openpgp.org's Onion service (which should be the cheapest way for us to use it, and preserves end-to-end encryption and authentication of the server in Seahorse, which doesn't support hkps://), and our test suite will tell us what kind of extra work is needed.

#24 Updated by intrigeri 28 days ago

  • Feature Branch set to bugfix/12689-more-reliable-OpenPGP-keyserver

#25 Updated by intrigeri 28 days ago

  • Related to Feature #12210: Deal with automated tests of onion services vs Chutney added

#26 Updated by intrigeri 27 days ago

  • Related to Bug #17169: Seahorse can't sync keys with keyservers: Request Entity Too Large added

#27 Updated by intrigeri 27 days ago

  • Status changed from In Progress to Needs Validation
  • Assignee deleted (intrigeri)
  • Target version set to Tails_4.0
  • Type of work changed from Research to Code

I've got something that I'm reasonably satisfied with:

  • Works when tested manually on the command line and in Seahorse.
  • Most affected test cases got more robust.

Caveats:

  • Enigmail still uses the old-school keyserver onion pool. See b53b55c599be81ccb8b1bd7b836c1c558caf3ac9 for the rationale.
  • I don't know if #17169 happens more often than it used to. This feature is disabled by default (and a bad idea in most cases) so I don't see it as a blocker.
  • I'm not updating persistent ~/.gnupg/dirmngr.conf automatically. Given keyservers are mostly used by technical users, IMO it'll be good enough to add a note about this in the release notes.
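
Since persistent configuration is not migrated automatically, users would have to edit the file themselves. A sketch of what such a manual update could look like, assuming GNU sed (as shipped in Debian) and a new keyserver URL supplied by the user; the function name is made up, and the URL in the usage line is a placeholder rather than the real keys.openpgp.org Onion address.

```shell
#!/bin/sh
# Replace the old Onion keyserver in a dirmngr.conf with a new URL,
# keeping a backup copy and leaving any other keyserver setting alone.
# Assumptions: GNU sed (-i), and a URL that contains neither "|" nor "&".
switch_old_keyserver() {
    conf=$1
    new=$2
    [ -f "$conf" ] || return 1
    cp -p "$conf" "$conf.bak" || return 1
    sed -i "s|^keyserver[[:space:]][[:space:]]*hkp://jirk5u4osbsr34t5\.onion.*|keyserver $new|" "$conf"
}

# Hypothetical usage (replace the placeholder with the keyserver you want):
#   switch_old_keyserver ~/.gnupg/dirmngr.conf hkp://NEW-KEYSERVER.onion
#   gpgconf --kill dirmngr   # restart dirmngr so it rereads its configuration
```

Only lines that still name the old pool are rewritten, so a keyserver the user chose deliberately is preserved.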

I realize it's very late in the 4.0 cycle to merge this so I'm happy to postpone to 4.1 if no reviewer has time to look at the branch, or if the reviewer prefers it this way :)

#28 Updated by intrigeri 27 days ago

  • Target version changed from Tails_4.0 to Tails_4.1

#29 Updated by hefee 4 days ago

  • Assignee set to hefee

#30 Updated by Anonymous 4 days ago

  • Status changed from Needs Validation to Fix committed
  • % Done changed from 0 to 100

#31 Updated by hefee 4 days ago

  • Assignee deleted (hefee)
