Project

General

Profile

Bug #15166

Tails Installer crashes when reporting errors whose message is not valid ASCII

Added by intrigeri 11 months ago. Updated 6 days ago.

Status:
In Progress
Priority:
Normal
Assignee:
Category:
Installation
Target version:
Start date:
01/14/2018
Due date:
% Done:

30%

QA Check:
Info Needed
Feature Branch:
kurono/bug/15166-installer-crashes-no-ascii
Type of work:
Code
Blueprint:
Starter:
Affected tool:
Installer

Description

This got reported to me by a Debian user running Tails Installer 4.4.18 with French locales:

2017-08-12 14:08:26,435 [creator.py:747 (mount_device)] DEBUG: Mounting /org/freedesktop/UDisks2/block_devices/sdb1
2017-08-12 14:08:26,591 [creator.py:774 (mount_device)] DEBUG: Mounted /dev/sdb1 to /media/user/Tails 
2017-08-12 14:08:28,594 [creator.py:440 (delete_liveos)] INFO: Suppression du système d'exploitation live existant
2017-08-12 14:08:28,598 [creator.py:444 (delete_liveos)] DEBUG: Considering /media/user/Tails/EFI
Traceback (most recent call last):
  File "/usr/lib/python2.7/dist-packages/tails_installer/gui.py", line 411, in on_start_clicked
    self.begin()
  File "/usr/lib/python2.7/dist-packages/tails_installer/gui.py", line 676, in begin
    self.live.delete_liveos()
  File "/usr/lib/python2.7/dist-packages/tails_installer/creator.py", line 467, in delete_liveos
    % {'message': str(e)})
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 15: ordinal not in range(128)

It looks like we assume ASCII encoding for the exception string representation, either in this string %-formatting operation or in gui.py where we catch TailsInstallerError and display its message.

For the record, in Tails Install 4.4.18 the corresponding code is:

            self.log.debug("Considering " + path)
            if os.path.isfile(path):
[…]
            elif os.path.isdir(path):
                try:
                    _set_liberal_perms_recursive(path)
                except OSError, e:
                    self.log.debug(_("Unable to chmod %(file)s: %(message)s")
                                   % {'file': path, 'message': str(e)})
                try:
                    shutil.rmtree(path)
                except OSError, e:
                    raise TailsInstallerError(_("Unable to remove directory from" 
                                         " previous LiveOS: %(message)s")
                                         % {'message': str(e)})

… and line 467 is the last one I'm quoting above.

I see we have lots of other instances of very similar pieces of code. In most cases, this bug will not break Tails Installer functionnality, because the affected code is about reporting a critical error that already happened. But it will prevent users from sending us useful bug reports (by preventing the actual error from being part of the bug report) and will create poor UX (by crashing instead of displaying an explanatory error message).

Associated revisions

Revision 0e172164 (diff)
Added by intrigeri about 2 months ago

Enable the bugfix-15166-installer-crash APT overlay (refs: #15166).

Revision 0ce2f62a
Added by intrigeri about 2 months ago

Merge branch 'bugfix/15166-installer-crash' into stable (refs: #15166)

History

#1 Updated by intrigeri 11 months ago

kurono, would you be interested in having a look?

#2 Updated by u 4 months ago

Ping?

#3 Updated by kurono 3 months ago

  • Assignee set to kurono

#4 Updated by kurono about 2 months ago

  • Status changed from Confirmed to In Progress
  • Assignee deleted (kurono)
  • Target version set to Tails_3.11
  • QA Check set to Ready for QA
  • Feature Branch set to kurono/bug/15166-installer-crashes-no-ascii

I was able to reproduce this with a virtual machine and selecting the french language.
Then I manually raised an exception with

raise OSError(u'Suppression du système d\'exploitation live existant')

I introduced utf-8 encoding in the exceptions, and it seems to fix the problem.
It passed all the usb tests.

#5 Updated by intrigeri about 2 months ago

  • Assignee set to intrigeri
  • Target version changed from Tails_3.11 to Tails_3.10.1

This bug seems pretty bad so I'll try to review'n'merge this in time for 3.10. Not sure that'll work out but let's try!

#6 Updated by intrigeri about 2 months ago

I don't intend to block on the following concerns for this review'n'merge: the proposed code makes sense to me and I trust you that it does fix at least some instances of this bug.

Now, even if I merge this in a hurry for 3.10, I'll probably leave this ticket "In Progress" because I'm not sure the branch fixes all instances of the bug and I'm not convinced by the way the initially chosen approach to fix it.

I'm a bit confused wrt. what our logger should receive, i.e. a Python string or encoded bytes. E.g. I see self.log.debug(cmd_decoded) which suggests a Python string is prefered. But your branch changes many (but not all) calls to log.{warning,error,debug,etc.} so they're now passed encoded bytes, which seems consistent with the fact that our _ claims to return a "UTF-8 encoded bytestring". If our logger needs an encoded bytestring, fine, but then shouldn't self.live.log.exception(unicode(e)) be encoded to utf-8 too? Same for self.log.debug(cmd_decoded) and — I suspect — a bunch of other similar calls.

More generally, I'm not sure that manually adding "turn into a Unicode string and then encode to utf-8" everywhere is the best approach. I bet we'll always forget it occasionally. So I think we should automate this, probably by giving it a more clever stream handler or formatter in _setup_logger, that will use something like the _to_unicode and unicode_to_utf8 functions we already have, in order to support both all kinds of inputs.

What do you think?

#7 Updated by intrigeri about 2 months ago

  • % Done changed from 0 to 20
  • Feature Branch changed from kurono/bug/15166-installer-crashes-no-ascii to bugfix/15166-installer-crash, kurono/liveusb-creator:bug/15166-installer-crashes-no-ascii

Released 5.0.12 with these changes, uploaded, enabled the corresponding APT overlay on a branch, asked Jenkins to test this.

#8 Updated by intrigeri about 2 months ago

  • Assignee changed from intrigeri to kurono
  • Target version changed from Tails_3.10.1 to Tails_3.11
  • QA Check deleted (Ready for QA)
  • Feature Branch deleted (bugfix/15166-installer-crash, kurono/liveusb-creator:bug/15166-installer-crashes-no-ascii)

Merged, I'm giving this ticket back to you for the next steps, as agreed on XMPP.

#9 Updated by kurono about 1 month ago

  • Assignee changed from kurono to intrigeri
  • % Done changed from 20 to 30
  • QA Check set to Ready for QA
  • Feature Branch set to kurono/bug/15166-installer-crashes-no-ascii

intrigeri wrote:

I don't intend to block on the following concerns for this review'n'merge: the proposed code makes sense to me and I trust you that it does fix at least some instances of this bug.

Now, even if I merge this in a hurry for 3.10, I'll probably leave this ticket "In Progress" because I'm not sure the branch fixes all instances of the bug and I'm not convinced by the way the initially chosen approach to fix it.

I'm a bit confused wrt. what our logger should receive, i.e. a Python string or encoded bytes. E.g. I see self.log.debug(cmd_decoded) which suggests a Python string is prefered. But your branch changes many (but not all) calls to log.{warning,error,debug,etc.} so they're now passed encoded bytes, which seems consistent with the fact that our _ claims to return a "UTF-8 encoded bytestring". If our logger needs an encoded bytestring, fine, but then shouldn't self.live.log.exception(unicode(e)) be encoded to utf-8 too? Same for self.log.debug(cmd_decoded) and — I suspect — a bunch of other similar calls.

More generally, I'm not sure that manually adding "turn into a Unicode string and then encode to utf-8" everywhere is the best approach. I bet we'll always forget it occasionally. So I think we should automate this, probably by giving it a more clever stream handler or formatter in _setup_logger, that will use something like the _to_unicode and unicode_to_utf8 functions we already have, in order to support both all kinds of inputs.

What do you think?

Yes, as said before I agree it was not the best solution. It seems like the problem is not really with the logger, but with the encoding of the exception message, which can't be unicode at least one create a custom Exception that manages that. I have done it, and and have reverted the former changes, no the excetion are managed as str(e) again. It passed all the tests cases.

#10 Updated by intrigeri about 1 month ago

It seems like the problem is not really with the logger, but with the encoding of the exception message, which can't be unicode at least one create a custom Exception that manages that.

I agree we have a problem with inconsistent decoded string vs. bytestring in exception handling and I'm glad you're on it! Your solution looks good to me. I'll test it in non-English locales in various situations where we raise an exception.

Wrt. the logger, I'm still a bit confused: could you please test & confirm that it works as expected regardless of whether we pass it a decoded string or a bytestring, in cases where it matters?

#11 Updated by intrigeri 28 days ago

  • Assignee changed from intrigeri to kurono
  • QA Check changed from Ready for QA to Info Needed

#12 Updated by kurono 6 days ago

intrigeri wrote:

It seems like the problem is not really with the logger, but with the encoding of the exception message, which can't be unicode at least one create a custom Exception that manages that.

I agree we have a problem with inconsistent decoded string vs. bytestring in exception handling and I'm glad you're on it! Your solution looks good to me. I'll test it in non-English locales in various situations where we raise an exception.

Wrt. the logger, I'm still a bit confused: could you please test & confirm that it works as expected regardless of whether we pass it a decoded string or a bytestring, in cases where it matters?

Ok, I think I understand now, our code works as intended with decoded strings and encoding applied to decoded unicode bytestring for instance the following works:

self.log.info(_(u"Suppression du système d\'exploitation live existant"))
self.log.info(_(u"Suppression du système d\'exploitation live existant").encode("utf-8"))

but the following does not:

s = "Suppression du système d\'exploitation live existant" 
b = bytes(s, 'utf-8')
self.log.info(b)

and the normal string with unicode characters does not work:

self.log.info("Suppression du système d\'exploitation live existant")

I will take a look of all the instances when we call the logs to see if we are not doing it right.

Also available in: Atom PDF