Bug #15166
Tails Installer crashes when reporting errors whose message is not valid ASCII
50%
Description
This got reported to me by a Debian user running Tails Installer 4.4.18 with French locales:
2017-08-12 14:08:26,435 [creator.py:747 (mount_device)] DEBUG: Mounting /org/freedesktop/UDisks2/block_devices/sdb1 2017-08-12 14:08:26,591 [creator.py:774 (mount_device)] DEBUG: Mounted /dev/sdb1 to /media/user/Tails 2017-08-12 14:08:28,594 [creator.py:440 (delete_liveos)] INFO: Suppression du système d'exploitation live existant 2017-08-12 14:08:28,598 [creator.py:444 (delete_liveos)] DEBUG: Considering /media/user/Tails/EFI Traceback (most recent call last): File "/usr/lib/python2.7/dist-packages/tails_installer/gui.py", line 411, in on_start_clicked self.begin() File "/usr/lib/python2.7/dist-packages/tails_installer/gui.py", line 676, in begin self.live.delete_liveos() File "/usr/lib/python2.7/dist-packages/tails_installer/creator.py", line 467, in delete_liveos % {'message': str(e)}) UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 15: ordinal not in range(128)
It looks like we assume ASCII encoding for the exception string representation, either in this string %-formatting operation or in gui.py
where we catch TailsInstallerError
and display its message.
For the record, in Tails Install 4.4.18 the corresponding code is:
self.log.debug("Considering " + path) if os.path.isfile(path): […] elif os.path.isdir(path): try: _set_liberal_perms_recursive(path) except OSError, e: self.log.debug(_("Unable to chmod %(file)s: %(message)s") % {'file': path, 'message': str(e)}) try: shutil.rmtree(path) except OSError, e: raise TailsInstallerError(_("Unable to remove directory from" " previous LiveOS: %(message)s") % {'message': str(e)})
… and line 467 is the last one I'm quoting above.
I see we have lots of other instances of very similar pieces of code. In most cases, this bug will not break Tails Installer functionnality, because the affected code is about reporting a critical error that already happened. But it will prevent users from sending us useful bug reports (by preventing the actual error from being part of the bug report) and will create poor UX (by crashing instead of displaying an explanatory error message).
Associated revisions
Enable the bugfix-15166-installer-crash APT overlay (refs: #15166).
Merge branch 'bugfix/15166-installer-crash' into stable (refs: #15166)
History
#1 Updated by intrigeri about 1 year ago
kurono, would you be interested in having a look?
#4 Updated by kurono 4 months ago
- Status changed from Confirmed to In Progress
- Assignee deleted (
kurono) - Target version set to Tails_3.11
- QA Check set to Ready for QA
- Feature Branch set to kurono/bug/15166-installer-crashes-no-ascii
I was able to reproduce this with a virtual machine and selecting the french language.
Then I manually raised an exception with
raise OSError(u'Suppression du système d\'exploitation live existant')
I introduced utf-8 encoding in the exceptions, and it seems to fix the problem.
It passed all the usb tests.
#6 Updated by intrigeri 4 months ago
I don't intend to block on the following concerns for this review'n'merge: the proposed code makes sense to me and I trust you that it does fix at least some instances of this bug.
Now, even if I merge this in a hurry for 3.10, I'll probably leave this ticket "In Progress" because I'm not sure the branch fixes all instances of the bug and I'm not convinced by the way the initially chosen approach to fix it.
I'm a bit confused wrt. what our logger should receive, i.e. a Python string or encoded bytes. E.g. I see self.log.debug(cmd_decoded)
which suggests a Python string is prefered. But your branch changes many (but not all) calls to log.{warning,error,debug,etc.}
so they're now passed encoded bytes, which seems consistent with the fact that our _
claims to return a "UTF-8 encoded bytestring". If our logger needs an encoded bytestring, fine, but then shouldn't self.live.log.exception(unicode(e))
be encoded to utf-8
too? Same for self.log.debug(cmd_decoded)
and — I suspect — a bunch of other similar calls.
More generally, I'm not sure that manually adding "turn into a Unicode string and then encode to utf-8
" everywhere is the best approach. I bet we'll always forget it occasionally. So I think we should automate this, probably by giving it a more clever stream handler or formatter in _setup_logger
, that will use something like the _to_unicode
and unicode_to_utf8
functions we already have, in order to support both all kinds of inputs.
What do you think?
#7 Updated by intrigeri 4 months ago
- % Done changed from 0 to 20
- Feature Branch changed from kurono/bug/15166-installer-crashes-no-ascii to bugfix/15166-installer-crash, kurono/liveusb-creator:bug/15166-installer-crashes-no-ascii
Released 5.0.12 with these changes, uploaded, enabled the corresponding APT overlay on a branch, asked Jenkins to test this.
#8 Updated by intrigeri 4 months ago
- Assignee changed from intrigeri to kurono
- Target version changed from Tails_3.10.1 to Tails_3.11
- QA Check deleted (
Ready for QA) - Feature Branch deleted (
bugfix/15166-installer-crash, kurono/liveusb-creator:bug/15166-installer-crashes-no-ascii)
Merged, I'm giving this ticket back to you for the next steps, as agreed on XMPP.
#9 Updated by kurono 3 months ago
- Assignee changed from kurono to intrigeri
- % Done changed from 20 to 30
- QA Check set to Ready for QA
- Feature Branch set to kurono/bug/15166-installer-crashes-no-ascii
intrigeri wrote:
I don't intend to block on the following concerns for this review'n'merge: the proposed code makes sense to me and I trust you that it does fix at least some instances of this bug.
Now, even if I merge this in a hurry for 3.10, I'll probably leave this ticket "In Progress" because I'm not sure the branch fixes all instances of the bug and I'm not convinced by the way the initially chosen approach to fix it.
I'm a bit confused wrt. what our logger should receive, i.e. a Python string or encoded bytes. E.g. I see
self.log.debug(cmd_decoded)
which suggests a Python string is prefered. But your branch changes many (but not all) calls tolog.{warning,error,debug,etc.}
so they're now passed encoded bytes, which seems consistent with the fact that our_
claims to return a "UTF-8 encoded bytestring". If our logger needs an encoded bytestring, fine, but then shouldn'tself.live.log.exception(unicode(e))
be encoded toutf-8
too? Same forself.log.debug(cmd_decoded)
and — I suspect — a bunch of other similar calls.More generally, I'm not sure that manually adding "turn into a Unicode string and then encode to
utf-8
" everywhere is the best approach. I bet we'll always forget it occasionally. So I think we should automate this, probably by giving it a more clever stream handler or formatter in_setup_logger
, that will use something like the_to_unicode
andunicode_to_utf8
functions we already have, in order to support both all kinds of inputs.What do you think?
Yes, as said before I agree it was not the best solution. It seems like the problem is not really with the logger, but with the encoding of the exception message, which can't be unicode at least one create a custom Exception that manages that. I have done it, and and have reverted the former changes, no the excetion are managed as str(e) again. It passed all the tests cases.
#10 Updated by intrigeri 3 months ago
It seems like the problem is not really with the logger, but with the encoding of the exception message, which can't be unicode at least one create a custom Exception that manages that.
I agree we have a problem with inconsistent decoded string vs. bytestring in exception handling and I'm glad you're on it! Your solution looks good to me. I'll test it in non-English locales in various situations where we raise an exception.
Wrt. the logger, I'm still a bit confused: could you please test & confirm that it works as expected regardless of whether we pass it a decoded string or a bytestring, in cases where it matters?
#12 Updated by kurono 3 months ago
intrigeri wrote:
It seems like the problem is not really with the logger, but with the encoding of the exception message, which can't be unicode at least one create a custom Exception that manages that.
I agree we have a problem with inconsistent decoded string vs. bytestring in exception handling and I'm glad you're on it! Your solution looks good to me. I'll test it in non-English locales in various situations where we raise an exception.
Wrt. the logger, I'm still a bit confused: could you please test & confirm that it works as expected regardless of whether we pass it a decoded string or a bytestring, in cases where it matters?
Ok, I think I understand now, our code works as intended with decoded strings and encoding applied to decoded unicode bytestring for instance the following works:
self.log.info(_(u"Suppression du système d\'exploitation live existant"))
self.log.info(_(u"Suppression du système d\'exploitation live existant").encode("utf-8"))
but the following does not:
s = "Suppression du système d\'exploitation live existant"
b = bytes(s, 'utf-8')
self.log.info(b)
and the normal string with unicode characters does not work:
self.log.info("Suppression du système d\'exploitation live existant")
I will take a look of all the instances when we call the logs to see if we are not doing it right.
#13 Updated by CyrilBrulebois 2 months ago
- Target version changed from Tails_3.11 to Tails_3.12
#14 Updated by kurono about 1 month ago
- Assignee changed from kurono to intrigeri
- QA Check changed from Info Needed to Ready for QA
and the normal string with unicode characters does not work:
[...]
Funny, I can't reproduce this again. So, I would say, as far as we don't use something like:
s = "Suppression du système d\'exploitation live existant"
b = bytes(s, 'utf-8')
self.log.info(b)
Our code is safe to manage log messages with Unicode characters.
#16 Updated by intrigeri about 7 hours ago
- Assignee changed from intrigeri to kurono
- % Done changed from 30 to 50
- QA Check changed from Ready for QA to Info Needed
Code review passes. Can you please send this to Jenkins so I can easily see test results? You'll need to export your Tails Installer changes to config/chroot_local-patches/
in tails.git.