Project

General

Profile

Feature #14596

Feature #14568: Additional Software Packages

Write automated tests for Additional Software GUI

Added by u over 1 year ago. Updated 2 months ago.

Status:
Resolved
Priority:
High
Assignee:
-
Category:
-
Target version:
Start date:
09/04/2017
Due date:
03/06/2018
% Done:

100%

QA Check:
Pass
Feature Branch:
test/14596-automated-tests-for-ASP-gui
Type of work:
Code
Blueprint:
Starter:
Affected tool:
Additional Software Packages

Description

This needs to happen before March 25th 2018 (B3)

We need:

  1. fixing the "all persistence configuration files have safe access rights" step: live-additional-software.conf now has 644 permissions (on purpose), which prevents any scenario that uses persistence to run, so it was temporarily disabled in 19c6e71f0ea99698536b2c3281ee2852ee538001
  2. fixing the Synaptic scenario, that's broken since b34e8a511733a9dbd8718f93f4f28b05e486cbad
  3. a scenario (described in #14572#note-34 and in the offline ASP blueprint) to test the case where an ASP online package upgrade failed, and how if next boot is offline ASP do cope with it; this will validate the work that's been merged already and shipped in Tails 3.6
  4. tests for the main scenarios described on https://tails.boum.org/blueprint/additional_software_packages/gui/

Related issues

Related to Tails - Bug #15799: "Additional software packages are installed even without network" test always fail in my environment Resolved 08/17/2018
Blocked by Tails - Feature #14576: Write automated tests for Additional Software GUI (Gherkin) Resolved 08/30/2017 02/28/2018

Associated revisions

Revision 19c6e71f (diff)
Added by intrigeri about 1 year ago

Test suite: temporarily disable the "all persistence configuration files have safe access rights" check.

This step will need to be adjusted: live-additional-software.conf now has 644
permissions (refs: #14596)

Revision f2407deb (diff)
Added by intrigeri about 1 year ago

Test suite: hopefully fix the temporarily disabling of the "all persistence configuration files have safe access rights" step (refs: #14596)

Revision d846d6ff (diff)
Added by bertagaz about 1 year ago

Test suite: add step definitions for ASP feature.

refs: #14596

Revision 6bb26dbb (diff)
Added by bertagaz about 1 year ago

Fatorize ASP services starting detection step.

This rely on the ASP upgrade service creating a state file once started,
which is not the case yet, and should be added to ASP code.

Refs: #14596

Revision 26f89d52 (diff)
Added by bertagaz about 1 year ago

Test suite: Temoorary fix bug in APT hooks introduced by ASP.

At the moment, APT waits for the DPKG post install hook notification to
be clicked on before returning, which is a bug that should be fixed with
ticket #15382. Meanwhile we have to run APT install in the background
and check when the package is installed to go on with the feature.

Refs: #14596, #15382

Revision 2c5a0ef8 (diff)
Added by bertagaz about 1 year ago

Test suite: Refine a ASP step. (refs: #14596)

Revision 997cab05 (diff)
Added by bertagaz about 1 year ago

Test suite: ASP notifications on install or upgrade are the same.

Refs: #14596

Revision aecc789e (diff)
Added by bertagaz about 1 year ago

Test suite: Add a step to the second scenario and fill some pending ones.

Refs: #14596

Revision af3f79ee (diff)
Added by bertagaz about 1 year ago

Make ASP upgrade service create a state file after bootup.

Same than the install service. Is usefull to have a robust way to test
if the service started, and to chain things after.

Refs: #14594, #14596

Revision c089f27c (diff)
Added by bertagaz about 1 year ago

Remove unicode character from ASP notification message.

It hits #12185.

Refs: #14594, #14596

Revision dcef9167 (diff)
Added by bertagaz about 1 year ago

Test suite: Add a step to the second scenario and fill some pending ones.

Refs: #14596

Revision 4c0e23c6 (diff)
Added by bertagaz about 1 year ago

Test suite: adapt a step shared by apt and ASP features.

When using ASP to create the persistence, it is first mounted in another
directory for initial configuration.

Refs: #14596

Revision c3ebc4e9 (diff)
Added by bertagaz about 1 year ago

Test suite: Fix name and expand ASP service startup step.

Refs: #14596

Revision 6948fd9c (diff)
Added by bertagaz about 1 year ago

Test suite: Add ASP service status step. (refs: #14596)

Revision 1f33a182 (diff)
Added by bertagaz about 1 year ago

Test suite: Write, factorize and move apt steps in the correct place.

Refs: #14596

Revision 6c5cddaa (diff)
Added by bertagaz about 1 year ago

Test suite: Add step to interact with ASP notifications for package addition/removal.

Refs: #14596

Revision ec702ef7 (diff)
Added by bertagaz about 1 year ago

Test suite: Deactivate ASP documentation step for now.

Refs: #14596

Revision e18aba9a (diff)
Added by bertagaz about 1 year ago

Test suite: Add step for ASP install/upgrade services. (refs: #14596)

Plug them into the Tails VM boot process. Will be usefull to test other
things, e.g. Tails server.

Revision 730af809 (diff)
Added by bertagaz about 1 year ago

Test suite: Let's set the VM non-onion APT sources early in the process.

This is required because of chutney, so let's move that there so that
the APT sources are correctly set everytime.

Refs: #14596

Revision 0ca3692a (diff)
Added by bertagaz about 1 year ago

Test suite: fix "persistent filesystems have safe access rights" step.

Refs: #14596

Revision 6273bbb6 (diff)
Added by bertagaz about 1 year ago

Test suite: Rework ASP notification handling.

The "Installing..." one is almost imposssible to catch in the test
suite, as it disappear fast. So for now its detection is disabled.

Refs: #14596

Revision 67dc4473 (diff)
Added by bertagaz about 1 year ago

Test suite: Drop obsolete ASP step. (refs: #14596)

Revision e3714b36 (diff)
Added by bertagaz about 1 year ago

Test suite: Add a scenario for ASP, refine package hanling in the feature.

Refs: #14596

Revision 3d588395 (diff)
Added by bertagaz about 1 year ago

Test suite: remove workaround for #15430.

Fixed in ASP.

Refs: #14596, #15430

Revision 64906a62 (diff)
Added by bertagaz about 1 year ago

Test suite: untag most ASP scenarios from fragile. (refs: #14596)

Revision 02af0301 (diff)
Added by bertagaz about 1 year ago

Test suite: Fix ASP scenario. (refs: #14596)

Revision e5522991 (diff)
Added by bertagaz about 1 year ago

Test suite: really test how packages using debconf behaves with ASP.

Refs: #14596

Revision 807fa8c0 (diff)
Added by bertagaz about 1 year ago

Test suite: Add missing ASP step. (refs: #14596)

Revision 0eaf4a3a (diff)
Added by bertagaz about 1 year ago

Test suite: fix bad paste leftovers. (refs: #14596)

Revision a1a369cd (diff)
Added by bertagaz about 1 year ago

Test suite: Fix APT feature vs. ASP. (refs: #14596)

  • Remove duplicated scenarios
  • Fix APT sources check scenario
  • Tag ASP scenarios with @check_tor_leaks where necessary

Revision daf6c668 (diff)
Added by bertagaz about 1 year ago

Test suite: Grow delay to wait for ASP notifications. (refs: #14596)

They seem a bit slower to show up when the test suite is run in Jenkins.

Revision 9c8f9e6d (diff)
Added by bertagaz about 1 year ago

Test suite: Fix APT scenario. (refs: #14596)

Revision 2105c305 (diff)
Added by bertagaz about 1 year ago

Test suite: Mark APT step as fragile for now. (refs: #14596)

Revision 41a16eea (diff)
Added by bertagaz about 1 year ago

Test suite: Untag ASP Synaptic step as fragile. (refs: #14596)

In fact we need it to install a package that will be removed in a
subsequent scenario. If it's really too fragile, we can still
s/synaptic/apt/ for a while.

Revision 95cbaa93 (diff)
Added by bertagaz about 1 year ago

Test suite: Move Chutney APT sources modification step.

Plug it after the chutney Tor configuration step (an not inside), so
that it is possible to disable it. This should fix the APT feature.

Refs: #14596

Revision c517e0fd (diff)
Added by bertagaz about 1 year ago

Test suite: Try re-enable ASP notifications step. (refs: #14596)

It was disabled with 6273bbb as lacking robutness. Let see now that the
ASP feature run smoothly how this step behaves.

Revision 768935b3 (diff)
Added by bertagaz about 1 year ago

Test suite: Fill missing ASP upgrade scenario steps. (refs: #14596)

Revision 9ac32e3e (diff)
Added by bertagaz about 1 year ago

Test suite: remove @fragile tag for ASP upgrade scenario. (refs: #14596)

Revision 18c624c1 (diff)
Added by bertagaz about 1 year ago

Test suite: Fill scenario removing package from ASP through its GUI.

Refs: #14596

Revision 076d52e2 (diff)
Added by bertagaz about 1 year ago

Test suite: Add missing documentation step in ASP feature. (refs: #14596)

Catching the doucmentation window in the browser did not work for some
reason, so it uses a screenshot to detect the right page has been
opened.

Revision 9f99bfbc (diff)
Added by bertagaz about 1 year ago

Test suite: Add scenario testing ASP when persistence is locked.

Refs: #14596

Revision 8fbecd68 (diff)
Added by bertagaz 12 months ago

Fix typo. (refs: #14596)

Revision 0fdec7a7 (diff)
Added by bertagaz 12 months ago

Fix bug introduced in commit 9f99bfb. (refs: #14596)

Revision 3761ed0a (diff)
Added by bertagaz 9 months ago

Test suite: take into account that apt(8) won't return when run in the remote shell with the ASP hooks enabled.

This cherry-picks 26f89d52ad5a1a9c937b0b0682116dd5f04c6cea aka. "Test
suite: Temoorary fix bug in APT hooks introduced by ASP" whose original commit
message was:

At the moment, APT waits for the DPKG post install hook notification to
be clicked on before returning, which is a bug that should be fixed with
ticket #15382. Meanwhile we have to run APT install in the background
and check when the package is installed to go on with the feature.

As bertagaz clarified later (https://labs.riseup.net/code/issues/15382#note-12),
for some reason that remains to be investigated, we still need this workaround
despite the ASP's apt-post hook now being non-blocking.

Refs: #14596

Revision 9e1be302 (diff)
Added by bertagaz 9 months ago

Test suite: Remove unecessary step for ASP and factor out a bit.

Refs: #14596

Revision f4463e6e (diff)
Added by bertagaz 9 months ago

Test suite: Fill last pending ASP scenario.

Refs: #14596

Revision 5c418670 (diff)
Added by bertagaz 8 months ago

Test suite: add step definitions for ASP feature.

refs: #14596

Revision 0d01241c (diff)
Added by bertagaz 8 months ago

Fatorize ASP services starting detection step.

This rely on the ASP upgrade service creating a state file once started,
which is not the case yet, and should be added to ASP code.

Refs: #14596

Revision 021eb311 (diff)
Added by bertagaz 8 months ago

Test suite: Refine a ASP step. (refs: #14596)

Revision 60cf315b (diff)
Added by bertagaz 8 months ago

Test suite: ASP notifications on install or upgrade are the same.

Refs: #14596

Revision 520d2a08 (diff)
Added by bertagaz 8 months ago

Test suite: Add a step to the second scenario and fill some pending ones.

Refs: #14596

Revision 24dbd382 (diff)
Added by bertagaz 8 months ago

Test suite: Add a step to the second scenario and fill some pending ones.

Refs: #14596

Revision c79f6585 (diff)
Added by bertagaz 8 months ago

Test suite: adapt a step shared by apt and ASP features.

When using ASP to create the persistence, it is first mounted in another
directory for initial configuration.

Refs: #14596

Revision 885a61a6 (diff)
Added by bertagaz 8 months ago

Test suite: Fix name and expand ASP service startup step.

Refs: #14596

Revision 82c56a94 (diff)
Added by bertagaz 8 months ago

Test suite: Add ASP service status step. (refs: #14596)

Revision 7db56022 (diff)
Added by bertagaz 8 months ago

Test suite: Write, factorize and move apt steps in the correct place.

Refs: #14596

Revision e3375f01 (diff)
Added by bertagaz 8 months ago

Test suite: Add step to interact with ASP notifications for package addition/removal.

Refs: #14596

Revision 2a785ea5 (diff)
Added by bertagaz 8 months ago

Test suite: Deactivate ASP documentation step for now.

Refs: #14596

Revision a4d94e90 (diff)
Added by bertagaz 8 months ago

Test suite: Add step for ASP install/upgrade services. (refs: #14596)

Plug them into the Tails VM boot process. Will be usefull to test other
things, e.g. Tails server.

Revision 49c3535a (diff)
Added by bertagaz 8 months ago

Test suite: Let's set the VM non-onion APT sources early in the process.

This is required because of chutney, so let's move that there so that
the APT sources are correctly set everytime.

Refs: #14596

Revision 98ea1504 (diff)
Added by bertagaz 8 months ago

Test suite: fix "persistent filesystems have safe access rights" step.

Refs: #14596

Revision 05a07555 (diff)
Added by bertagaz 8 months ago

Test suite: Rework ASP notification handling.

The "Installing..." one is almost imposssible to catch in the test
suite, as it disappear fast. So for now its detection is disabled.

Refs: #14596

Revision fe74fb98 (diff)
Added by bertagaz 8 months ago

Test suite: Drop obsolete ASP step. (refs: #14596)

Revision 880bbd07 (diff)
Added by bertagaz 8 months ago

Test suite: Add a scenario for ASP, refine package hanling in the feature.

Refs: #14596

Revision 4eb23799 (diff)
Added by bertagaz 8 months ago

Test suite: remove workaround for #15430.

Fixed in ASP.

Refs: #14596, #15430

Revision fd5deb96 (diff)
Added by bertagaz 8 months ago

Test suite: untag most ASP scenarios from fragile. (refs: #14596)

Revision 2eb8fc12 (diff)
Added by bertagaz 8 months ago

Test suite: Fix ASP scenario. (refs: #14596)

Revision cbb3c91d (diff)
Added by bertagaz 8 months ago

Test suite: really test how packages using debconf behaves with ASP.

Refs: #14596

Revision 9ccccffe (diff)
Added by bertagaz 8 months ago

Test suite: Add missing ASP step. (refs: #14596)

Revision 691aa9e5 (diff)
Added by bertagaz 8 months ago

Test suite: fix bad paste leftovers. (refs: #14596)

Revision 1021c2f1 (diff)
Added by bertagaz 8 months ago

Test suite: Fix APT feature vs. ASP. (refs: #14596)

  • Remove duplicated scenarios
  • Fix APT sources check scenario
  • Tag ASP scenarios with @check_tor_leaks where necessary

Revision e13a48f3 (diff)
Added by bertagaz 8 months ago

Test suite: Grow delay to wait for ASP notifications. (refs: #14596)

They seem a bit slower to show up when the test suite is run in Jenkins.

Revision d04c36b9 (diff)
Added by bertagaz 8 months ago

Test suite: Fix APT scenario. (refs: #14596)

Revision 32f996dd (diff)
Added by bertagaz 8 months ago

Test suite: Mark APT step as fragile for now. (refs: #14596)

Revision 8ec1fae8 (diff)
Added by bertagaz 8 months ago

Test suite: Untag ASP Synaptic step as fragile. (refs: #14596)

In fact we need it to install a package that will be removed in a
subsequent scenario. If it's really too fragile, we can still
s/synaptic/apt/ for a while.

Revision bc7b07f2 (diff)
Added by bertagaz 8 months ago

Test suite: Move Chutney APT sources modification step.

Plug it after the chutney Tor configuration step (an not inside), so
that it is possible to disable it. This should fix the APT feature.

Refs: #14596

Revision 7c3d08f4 (diff)
Added by bertagaz 8 months ago

Test suite: Fill missing ASP upgrade scenario steps. (refs: #14596)

Revision ee49ae96 (diff)
Added by bertagaz 8 months ago

Test suite: remove @fragile tag for ASP upgrade scenario. (refs: #14596)

Revision a5c4bef3 (diff)
Added by bertagaz 8 months ago

Test suite: Fill scenario removing package from ASP through its GUI.

Refs: #14596

Revision d2a67e7f (diff)
Added by bertagaz 8 months ago

Test suite: Add missing documentation step in ASP feature. (refs: #14596)

Catching the doucmentation window in the browser did not work for some
reason, so it uses a screenshot to detect the right page has been
opened.

Revision 408e6d96 (diff)
Added by bertagaz 8 months ago

Test suite: Add scenario testing ASP when persistence is locked.

Refs: #14596

Revision ad863afe (diff)
Added by bertagaz 8 months ago

Fix typo. (refs: #14596)

Revision 004f3952 (diff)
Added by bertagaz 8 months ago

Fix bug introduced in commit 9f99bfb. (refs: #14596)

Revision 12fc3185 (diff)
Added by bertagaz 8 months ago

Test suite: Remove unecessary step for ASP and factor out a bit.

Refs: #14596

Revision a627c1eb (diff)
Added by bertagaz 8 months ago

Test suite: Fill last pending ASP scenario.

Refs: #14596

Revision f5e0a3bb (diff)
Added by bertagaz 8 months ago

Test suite: Fix ASP config application name in dogtail step.

Refs: #14596

Revision bf98379f (diff)
Added by bertagaz 8 months ago

Test suite: Raise waiting time for the ASP installation service startup.

It seems it takes a bit more time in Jenkins.

Refs: #14596

Revision 53f303a8 (diff)
Added by bertagaz 7 months ago

Test suite: Shorten the ASP feature running time a bit.

Let's just not check each time if the package is installed or not after
rebooting, checking this once is enough so that there's no nees to reboot Tails
in each scenarios.

Refs: #14596

Revision c35caf84 (diff)
Added by bertagaz 7 months ago

Test suite: Merge the two upgrade scenarios in ASP feature.

This make us gain a bit more on the running time.

Refs: #14596

Revision 2a8489e5 (diff)
Added by bertagaz 7 months ago

Test suite: try to workaround race condition in the last scenario.

Trying to find the ASP installation failure notification seems to
collide with the "all notifications have disappeared" step.

Refs: #14596

Revision 0686e3b6 (diff)
Added by bertagaz 7 months ago

Test suite: Raise waiting time for the ASP failure notification to appear.

As we don't have the "all notifications have disappeared" step in this
case, it takes a bit of time before the desktop is fully started and ASP
has done so too.

Refs: #14596

Revision d8d89685 (diff)
Added by bertagaz 7 months ago

Test suite: try with a missing step in last ASP scenario.

Refs: #14596

Revision d86d50ef (diff)
Added by bertagaz 7 months ago

Test suite: Use systemd to test if ASP services have started.

Refs: #14596

Revision f44c0aca (diff)
Added by bertagaz 7 months ago

Test suite: Turn a reused variable into a constant.

Refs: #14596

Revision 710d832f (diff)
Added by bertagaz 7 months ago

Test suite: Turn some steps definition variables into non-capturing regexp.

Refs: #14596

Revision e7197f13 (diff)
Added by bertagaz 7 months ago

Test suite: Simplify the way we test ASP is correctly configured for a package.

Refs: #14596

Revision 49e93097 (diff)
Added by bertagaz 7 months ago

Test suite: Add png extension in ASP feature step definition.

Refs: #14596

Revision fc0675cd (diff)
Added by bertagaz 7 months ago

Test suite: Remove unnecessary try_for in ASP step.

Refs: #14596

Revision d12616a2 (diff)
Added by bertagaz 7 months ago

Test suite: Reword in a better english the persistence wizard exit step.

Refs: #14596

Revision dbda17d5 (diff)
Added by bertagaz 7 months ago

Test suite: Turn a the persistence wizard closure step into a function.

Refs: #14596

Revision c1b7b600 (diff)
Added by bertagaz 7 months ago

Test suite: Refine the step preparing a failing ASP upgrade.

Refs: #14596

Revision fa86d418 (diff)
Added by bertagaz 7 months ago

Test suite: Remove unnecessary '-f' to rm in an ASP step.

Refs: #14596

Revision f66d4efc (diff)
Added by bertagaz 7 months ago

Test suite: Split back the APT install/uninstall steps.

Refs: #14596

Revision 0a9832b1 (diff)
Added by bertagaz 7 months ago

Test suite: Shorten a waiting time in ASP feature.

Refs: #14596

Revision a13acb50 (diff)
Added by bertagaz 7 months ago

Test suite: Use failure?

Refs: #14596

Revision 343c32bf (diff)
Added by bertagaz 7 months ago

Test suite: Add missing character '/' in a regexp.

Refs: #14596

Revision 595f9a26 (diff)
Added by bertagaz 7 months ago

Test suite: Nitpicking variable name.

Refs: #14596

Revision b227c11f (diff)
Added by bertagaz 7 months ago

Test suite: Fix forgotten renames steps.

Refs: #14596

Revision c0b961d7 (diff)
Added by bertagaz 7 months ago

Test suite: Fix another forgotten renamed step.

Refs: #14596

Revision 6dcf7e92 (diff)
Added by bertagaz 7 months ago

Test suite: Fix again renamed ASP step.

Refs: #14596

Revision 2d747ceb (diff)
Added by bertagaz 7 months ago

Test suite: Fix file_empty? semantic.

Refs: #14596

Revision 3b21e014 (diff)
Added by bertagaz 7 months ago

Test suite: Another step renaming fix.

Getting tired...

Refs: #14596

Revision 28e7f410 (diff)
Added by bertagaz 7 months ago

Test suite: Make the GUI removal step in ASP feature more coherent.

It does not need to check for the removal of the package in ASP
configuration, given the next step already does it.

Refs: #14596

Revision 22722ad0 (diff)
Added by bertagaz 7 months ago

Test suite: Remove unnecessary duplicates steps.

Refs: #14596

Revision f8dce38c (diff)
Added by bertagaz 7 months ago

Test suite: Fix non-caprturing regexp handling in some steps.

Bug introduced with 710d832fddfe42b79cafa8c65d12b115a0cdb0bd

Refs: #14596

Revision 4a3ed3b5 (diff)
Added by bertagaz 7 months ago

Test suite: Fix condition on empty file in ASP step.

Now that file_empty? errors out if file does not exist, we have to check
for its presence now.

Refs: #14596

Revision 11664c6f (diff)
Added by bertagaz 7 months ago

Test suite: Fix typo

Refs: #14596

Revision a7cfb4be (diff)
Added by bertagaz 7 months ago

Test suite: Be more precise in a step name.

We're login in this step, so its name should state so.

refs: #14596

Revision 22940a11 (diff)
Added by bertagaz 7 months ago

Test suite: Few fixes on some tests and a forgotten renamed step.

Refs: #14596

Revision d9df8feb (diff)
Added by bertagaz 7 months ago

Test suite: fix race condition.

The VM is shut down too fast before the cowsay package is really added
to the ASP list, so let's check for it before the reboot.

Refs: #14596

Revision da17cf9b (diff)
Added by bertagaz 7 months ago

Test suite: run dpkg --compare-versions on the host.

Refs: #14596

Revision e71564ca (diff)
Added by bertagaz 7 months ago

Test suite: Fix the coherence of the way the "I open" steps are written.

Refs: #14596

Revision e1442814 (diff)
Added by bertagaz 6 months ago

Test suite: fix typo.

Refs: #14596

Revision 73aec390 (diff)
Added by bertagaz 6 months ago

Test suite: Fix ASP failure notification scenario.

Refs: #14596

Revision 39fa3337 (diff)
Added by bertagaz 6 months ago

Test suite: reorder the ASP scenario to lower a bit more the running time.

Refs: #14596

Revision 2b20b5f7 (diff)
Added by bertagaz 6 months ago

Test suite: fix typo in ASP step.

Refs: #14596

Revision 826ac9db (diff)
Added by bertagaz 6 months ago

Test suite: bump timeout for package installation and removal.

Recent tests show that it no takes a bit more time, probably because of
ASP coming into play.

Refs: #14596

Revision 703604b5 (diff)
Added by bertagaz 6 months ago

Test suite: bump timeout for package installation and removal.

Recent tests show that it no takes a bit more time, probably because of
ASP coming into play.

Refs: #14596

Revision d0fc9b52 (diff)
Added by bertagaz 6 months ago

Test suite: Remove broken part of a regexp.

This leads to breaking a step in an ASP scenario: 'When I uninstall
"cowsay" using apt'

Refs: #14596

Revision 52b4730c (diff)
Added by bertagaz 6 months ago

Test suite: try to fix the last ASP scenario:

  • Bump time waiting for the notification.
  • Check ASP is correctly configured after adding the package.

Refs: #14596

Revision cd1c022e (diff)
Added by bertagaz 6 months ago

Revert "Test suite: Remove broken part of a regexp."

This reverts commit d0fc9b520191568956b601335081d2b0c2ad68b8.

Appartently it was not the cause of failures I've seen.

Refs: #14596

Revision 862d1147 (diff)
Added by bertagaz 6 months ago

Test suite: Bump waiting time on package removal.

Last one wad not enough for this step.

Refs: #14596

Revision c77c2dd9 (diff)
Added by bertagaz 6 months ago

Test suite: Really fix the package removal step.

Using dpkg to test if a package was uninstalled was racy and error
prone: once really completely removed, dpkg don't have any reference of
the package in its database anymore. The 'deinstall' dpkg state is only
temporary, so we can not really rely on it to know if a package is
removed. Let's use 'apt-cache policy' rather.

Refs: #14596

Revision efb7d3a8 (diff)
Added by bertagaz 6 months ago

Tests suite: Make the last ASP scenario solid and require on less reboot.

Refs: #14596

Revision 6b4817a2 (diff)
Added by bertagaz 6 months ago

Test suite: last ASP step renaming.

All '/I open/I can open/' issues should be fixed.

Refs: #14596

Revision bd529d6f (diff)
Added by bertagaz 6 months ago

Test suite: Make apt uninstall step more robust.

Refs: #14596

Revision 6fd72b7f (diff)
Added by bertagaz 6 months ago

Test suite: Factorize out package install/removal.

Refs: #14596

Revision a9d91284 (diff)
Added by bertagaz 6 months ago

Test suite: Unplug unnecessary network for the last ASP scenario.

Refs: #14596

Revision 1237235f (diff)
Added by bertagaz 6 months ago

Test suite: Ensure the ASP package files removal is effective before going in on.

In some Jenkins run the cowsay deb file was not really removed and
installation succeeded in the last scenario. Let's check the file are
gone before trying to fail installing them.

Refs: #14596

Revision a4eddd2f (diff)
Added by bertagaz 6 months ago

Test suite: Use more $vm.execute_successfully() where relavant in ASP scenario.

Also remove unnecessary workaround on deb file removal.

Refs: #14596

Revision 87dc0009 (diff)
Added by bertagaz 6 months ago

Test suite: don't use instance variables in ASP tests when not necessary.

Refs: #14596

Revision c1e591ab (diff)
Added by bertagaz 6 months ago

Test suite: rename APT installation/removal check function.

Refs: #14596

Revision 06bc5315 (diff)
Added by bertagaz 6 months ago

Test suite: Add delay between ASP service status check.

This should lower the load a bit.

Refs: #14596

Revision fef292d3 (diff)
Added by bertagaz 6 months ago

Test suite: Remove APT from list of features using temporary snapshots.

Refs: #14596

Revision fc719ae8 (diff)
Added by bertagaz 6 months ago

Test suite: rename APT installation/removal check function.

Refs: #14596

Revision a8476479 (diff)
Added by bertagaz 6 months ago

Test suite: Add delay between ASP service status check.

This should lower the load a bit.

Refs: #14596

Revision d9b84e8f (diff)
Added by bertagaz 6 months ago

Test suite: Remove APT from list of features using temporary snapshots.

Refs: #14596

Revision 19a6c781 (diff)
Added by bertagaz 6 months ago

Test suite: reintroduce check in ASP Debian package removal.

Without it the scenario failed again, somehow the ASP service succeeded
to install the package despite the Debian package files being removed.
This should leverage the kind of race condition we hit there.

Refs: #14596

Revision 8354174f (diff)
Added by bertagaz 6 months ago

Test suite: Refine ASP step name and regexp.

Refs: #14596

Revision c4d5ce6b (diff)
Added by bertagaz 6 months ago

Test suite: add ASP feature in the prioritized features.

Putting it before the usb_* features may help in reducing the memory
footprint on the system.

Refs: #14596

Revision 0e3d4eb4 (diff)
Added by bertagaz 6 months ago

Test suite: Move the ASP feature later in the prioritiezd ones.

It made the usb_install fails because it lacked space where it was.

Refs: #14596

Revision 682ea460 (diff)
Added by intrigeri 4 months ago

Test suite: actually disable tails-additional-software-install.service (refs: #14596)

The system-wide tails-additional-software-install.service is not "enabled" in
Tails: instead it's started by the corresponding user
unit (/usr/lib/systemd/user/tails-additional-software-install.service).
So the previous call to "systemctl disable" was a no-op.
Let's instead disable the user service.

And then there's nothing left for us to (re-)enable: we can as well
just start the system-wide service ourselves.

Revision 967908f2 (diff)
Added by intrigeri 4 months ago

Revert "Test suite: reintroduce check in ASP Debian package removal." (refs: #14596)

This reverts commit 19a6c781ac399c640778f7ed0ddf00b199ff659a.

This workaround was only needed due to a race condition caused by the test suite
bug which the previous commit fixes.

Revision 636af67b (diff)
Added by intrigeri 4 months ago

Test suite: use consistent terminology (refs: #14596)

We have a few other similar steps where we don't mention the "link".
Let's keep this simple.

Revision 3ac90f2e (diff)
Added by intrigeri 4 months ago

Test suite: update outdated picture (refs: #14596)

The phrasing of the GUI has changed in t-p-s.git's commit
41252b861b2d3366a794d8d22808a067f9c1f19e, last April 27.

Since then this test has kept passing only because of "Found fuzzy candidate
picture for ASPPersistenceSetupOptionEnabled with similarity 0.8", which is
actually a bug (fuzzy matching is supposed to be disabled by default in our test
suite but for some reason, at least in some cases, it's enabled when it should
not).

So let's replace this picture with a current one.

Revision fd82498a (diff)
Added by intrigeri 4 months ago

Test suite: remove workaround for race condition that's been fixed (refs: #14596)

It's likely that "Trying to catch the notification at desktop startup is racy"
was only due to us not actually disabling the service we meant to disable, which
was fixed by 682ea4607780b41a8b94ada84d9b709e2d746b6f; so hopefully this
workaround is not needed anymore.

Revision 3c1f05b3 (diff)
Added by intrigeri 4 months ago

Test suite: remove now unused snapshot (refs: #14596)

Commit caad88e0ef073bb56760a272bd38417ee7f35ec0 removed the only user
of this short-lived snapshot.

Revision a56bffcf (diff)
Added by intrigeri 4 months ago

Test suite: add comment to make this scenario's secondary responsibility clear (refs: #14596)

Revision aa488748 (diff)
Added by intrigeri 4 months ago

Test suite: add missing file extension to picture (refs: #14596)

Sikuli does not mind and will find the file even if we forget
to make the extension explicit. But our mechanism to save fuzzy
matching candidates won't add the extension itself, so it would
save a "ASPDocumentationInstallCloning" file as candidate,
instead of the expected "ASPDocumentationInstallCloning.png".

Revision 7b70b3b3 (diff)
Added by intrigeri 4 months ago

Test suite: clarify comment (refs: #14596)

- All snapshots used by this feature are temporary except
"with-network-logged-in" and a couple other snapshots that almost all features
use anyway.

- It's not relevant whether a disk created by this feature can be reused: on the
contrary, here we're trying to prioritize features that use large amounts of
disk space temporarily. So let's not suggest the opposite. Besides, if the
__internal disk is actually reused as-is by following features, we have
a problem: every such feature would have Additional Software enabled.
Hopefully this is not how our QCOW2 snapshots mechanism works (I did not check
yet).

- This mechanism is about disk usage in /tmp/TailsToaster, not about memory
usage. The situation when a tmpfs is mounted on /tmp/TailsToaster is
a particular case, rather than something we can assume to be the default here.

Revision bfebf756 (diff)
Added by intrigeri 4 months ago

Test suite: make scenario title match what it's actually testing (refs: #14596)

This scenario does not actually check whether a notification is displayed
so let's not pretend it does.

Revision ca05a6b8 (diff)
Added by intrigeri 4 months ago

Test suite: fix erroneous inter-scenario dependency doc (refs: #14596)

"I set up Additional Software when installing a package without persistent
partition and the package is installed next time I start Tails" is not
sufficient here: it does not add cowsay to the list of Additional Software.
So let's instead declare a dependency on the only scenario that
leaves cowsay in this list once it completes.

Revision 0c75ce62 (diff)
Added by intrigeri 4 months ago

Test suite: fix comment (refs: #14596)

Currently all following scenarios reuse the "__internal" drive.

Revision 354db2e5 (diff)
Added by intrigeri 4 months ago

Test suite: don't look for an unrelated notification when checking if tails-additional-software-upgrade.service has started (refs: #14596)

This service does not display notifications on success: only
tails-additional-software-install.service does.

Context: I'm trying to wrap my mind around how Additional Software notifications
are handled in the test suite and it confused me a bit that we were looking for
a notification that is unrelated to what this step is testing.

Revision ab29142f (diff)
Added by intrigeri 4 months ago

Test suite: document potential reliability issue (refs: #14596)

I doubt we'll lose this race in practice so I won't block on this. But it's
worth documenting, in order to save some time to whoever debugs potential
reliability problems here in the future.

Revision c8cdbf01 (diff)
Added by intrigeri 4 months ago

Test suite: try to avoid race condition (refs: #14596)

My commit fd82498abd1edf7d67bfcde035abe72152a96fb6 was overly optimistic: the
race condition wrt. catching the "The installation of your additional software
failed" notification was not solely caused by the bug I've fixed
in 682ea4607780b41a8b94ada84d9b709e2d746b6f.

Additionally, sometimes "all notifications have disappeared" (which actively
clears the list of notifications) hides the very notification we're looking for
and want to interact with. Whenever this happens, 'I see the "The installation
of your additional software failed" notification after at most 300 seconds' will
succeed anyway (Dogtail will find notifications even after we've cleared them,
go figure) but thankfully 'I can open the Additional Software log file from the
notification' will fail to interact with an invisible notification (anything
else would be utterly confusing).

To avoid this problem, we need to stop clearing all notifications. But then the
"Warning: virtual machine detected!" notification may hide the one we're looking
for. So let's disable the service that would display this spurious notification.
Given this scenario runs Tails offline, this should be the only notification
displayed at login time, so this blocking it should be a sufficient substitute
for clearing all notifications.

Revision fef1d042 (diff)
Added by intrigeri 4 months ago

Test suite: don't rely on the state of the APT lists to check if a package is installed (refs: #14596)

I've seen cases where "apt-cache policy" would fail, when run by
'the package "cowsay" is not installed', as part of
"I am notified when Additional Software fails to install a package",
with:

E: Malformed Description-md5 line; doesn't have the required length (32 != 29)
'0143a1c3acbdb045e4fcaab0d8657"

This seems wrong and I don't know why this happens, but let's not get
side-tracked: the goal here is to check whether a given package is installed,
not to validate the state of the APT lists. So let's use dpkg instead.

For details about why "dpkg -s" was initially replaced with "apt-cache policy",
see c77c2dd90480f8eb53e0c7861e531574345659ec. The analysis of the problem in
that commit message was correct: "deinstall" means "The package is selected for
deinstallation" and is indeed a transient state. The solution did work until we
face a situation when the higher level tool fails while checking things we're
not interested in.

Revision 3f1f1b0e (diff)
Added by intrigeri 3 months ago

Test suite: update button label to match fixed implementation (refs: #14596, #16110)

Revision 10c9f94e (diff)
Added by intrigeri 3 months ago

Test suite: replace workaround for bug that's been fixed with proper implementation (refs: #14596)

d2a67e7f7bcafdd22186a91f3b076a01797d1107 added the workaround
and documented the bug but no ticket was filed back then. Thankfully
anonym's review highlighted this problem as a probable bug in the code,
which turned out to be the case; finally, the bug (#16475) was fixed.

Revision 8756a16d (diff)
Added by intrigeri 3 months ago

Test suite: add comments about disks life-cycle (refs: #14596)

Revision 740d8b8d (diff)
Added by intrigeri 3 months ago

Test suite: use the same terminology as what we expose to users (refs: #14596)

That's kinda what BDD is about :)

Revision 399f80f2 (diff)
Added by intrigeri 3 months ago

Test suite: fix grammar (refs: #14596)

Revision 886fc3b5 (diff)
Added by intrigeri 3 months ago

Test suite: fix buggy checks (refs: #14596)

They were previously no-ops as we did not act based on their return value.

Revision 79ea3b26 (diff)
Added by intrigeri 3 months ago

Test suite: use $vm.execute vs $vm.execute_successfully consistently (refs: #14596)

… and document a bit a few cases where one might wonder why we're
not using vm.execute_successfully.

Revision c19a71a9 (diff)
Added by intrigeri 3 months ago

Test suite: drop unused variable (refs: #14596)

Revision dd27a920 (diff)
Added by intrigeri 3 months ago

Test suite: make function names clearer and less generic (refs: #14596)

Revision cd787cbf (diff)
Added by intrigeri 3 months ago

Test suite: use here-document for readability (refs: #14596)

Revision 2f61ecf4 (diff)
Added by intrigeri 3 months ago

Test suite: fix expected window title (refs: #14596)

Revision 6abd7626
Added by intrigeri 2 months ago

Merge branch 'test/14596-automated-tests-for-ASP-gui' into stable (refs: #14596)

History

#1 Updated by u over 1 year ago

  • Target version set to 2018

#3 Updated by u over 1 year ago

  • Blocks Feature #14597: Review automated tests for Additional Software GUI added

#4 Updated by u over 1 year ago

  • Blocks Feature #14598: Code review for Additional Software packages GUI added

#5 Updated by u over 1 year ago

  • Description updated (diff)
  • Due date changed from 04/01/2018 to 03/06/2018
  • Target version changed from 2018 to Tails_3.6

#6 Updated by u over 1 year ago

  • Target version changed from Tails_3.6 to Tails_3.7

#7 Updated by bertagaz about 1 year ago

Moved to the ticket description as in #14572#note-36

#8 Updated by bertagaz about 1 year ago

  • Description updated (diff)

#9 Updated by intrigeri about 1 year ago

  • Description updated (diff)

#10 Updated by intrigeri about 1 year ago

  • Blocked by Feature #14576: Write automated tests for Additional Software GUI (Gherkin) added

#11 Updated by u about 1 year ago

We need

  1. these tests to be written (and working plausibly well enough) by March 25;
  2. tests reviewed and refixed by April 15.

#12 Updated by bertagaz about 1 year ago

Thanks to remind me the "collective" pressure during the meeting, that was so much fun. I think I had the point already.

#13 Updated by intrigeri about 1 year ago

  • Description updated (diff)

#14 Updated by intrigeri about 1 year ago

  • Description updated (diff)

#15 Updated by intrigeri about 1 year ago

  • Status changed from Confirmed to In Progress

#16 Updated by bertagaz about 1 year ago

  • % Done changed from 0 to 10
  • Feature Branch set to wip/feature/14596-automated-tests-for-ASP-gui

I will not wait for the full review of #14576, so I've started working on this ticket. I've added the missing steps, andd will start replacing the pending ones with actual code.

I had to workaround #15382, but this is a revertable change.

#17 Updated by bertagaz about 1 year ago

  • % Done changed from 10 to 30
  • Feature Branch changed from wip/feature/14596-automated-tests-for-ASP-gui to feature/14596-automated-tests-for-ASP-gui

bertagaz wrote:

I will not wait for the full review of #14576, so I've started working on this ticket. I've added the missing steps, andd will start replacing the pending ones with actual code.

Pushed a lot more commits. I had to hit hard to get most of the feature to work, hence the delay. Only the ASP upgrading service scenarios are left for now (as well as the ones from #14576#note-13), but that's already usable on #14594 and tests a good extend of the ASP features.

I had to workaround #15382, but this is a revertable change.

The way the test suite execute the APT commands still colapse with the ASP hooks, so I had to leave this change.

Let see what Jenkins thinks about it.

#18 Updated by bertagaz about 1 year ago

  • % Done changed from 30 to 40

bertagaz wrote:

Let see what Jenkins thinks about it.

Ok, after pushing a few fixes the branch runs well in Jenkins too.

I have locally added the upgrade scenario that i'll push later after a bit of cleaning.

Leftovers:

  • Regression test against the fixed bug of failed upgrades
  • Removing a package from ASP through its GUI
  • Opening the documentation from the notification link
  • The first 'Installing...' notification that is hard to catch because it disappear (sometimes fast). I'll add a note on #14594 to discuss about that (should it not disappear, or should I ignore this one?).
  • Scenarios from #14576#note-13

Other than that it's running well actually and already covers a good extent of ASP. I've largely exploded the time available for this ticket on the budget, so I'll unfocus a bit on this task.

#19 Updated by bertagaz about 1 year ago

  • Target version changed from Tails_3.7 to Tails_3.8

#20 Updated by intrigeri 11 months ago

  • Target version changed from Tails_3.8 to Tails_3.9

#21 Updated by u 11 months ago

bertagaz, what should we do with this ticket? Are there still tests that need to be written? Did you push your latest modifications? Does this need review? Please assign the ticket to a reviewer (possibly Alan) if that's the case.

#22 Updated by u 10 months ago

Ping? We plan to release this feature in 10 days… Please let me know what remains to be done here and when you plan to do it. Thanks!

#23 Updated by intrigeri 10 months ago

u wrote:

Does this need review? Please assign the ticket to a reviewer (possibly Alan) if that's the case.

It does need to be reviewed, see #14597 for details (who and when).

#24 Updated by intrigeri 10 months ago

  • Priority changed from Normal to High

(As per #14597#note-13.)

#25 Updated by intrigeri 10 months ago

  • Blocks deleted (Feature #14598: Code review for Additional Software packages GUI)

#26 Updated by intrigeri 9 months ago

FYI, now that the main ASP branch was merged, I had to cherry-pick 26f89d52ad5a1a9c937b0b0682116dd5f04c6cea into devel (now known as 3761ed0a85f209da0551ff02c8566b83744d03e0) because otherwise the I install "cowsay" using apt test step fails. Will push once I've confirmed that's enough to fix the problem I see.

#27 Updated by intrigeri 9 months ago

Will push once I've confirmed that's enough to fix the problem I see.

Done.

Now entering meta territory. In the future, please handle these separately:

  • Changes requires to ensure the pre-existing test suite keeps passing, such as the one I've cherry-picked: these should go into the main topic branch for the feature, so regardless of the timing of the rest of the test suite work, even if that other work is not completed in time for the merge as planned, merging the main topic branch for the feature does not break the test suite.
  • New tests specific to the new features:
    • These should go into their dedicated branch, as you did :)
    • That dedicated branch should follow closely the main topic branch for the feature, to ensure it's actually testing what we want to merge. In the last two months this only happened once, because I did it myself.

#28 Updated by intrigeri 9 months ago

  • Related to Bug #15799: "Additional software packages are installed even without network" test always fail in my environment added

#29 Updated by intrigeri 9 months ago

I've merged the most recent state of the testing branch I could (anything newer would make it FTBFS) and pushed, so that your topic branch is built & tested on Jenkins again.

#30 Updated by intrigeri 9 months ago

  • Target version changed from Tails_3.9 to Tails_3.10.1

#31 Updated by bertagaz 9 months ago

intrigeri wrote:

Will push once I've confirmed that's enough to fix the problem I see.

Done.

Thanks!

Now entering meta territory. In the future, please handle these separately:

  • Changes requires to ensure the pre-existing test suite keeps passing, such as the one I've cherry-picked: these should go into the main topic branch for the feature, so regardless of the timing of the rest of the test suite work, even if that other work is not completed in time for the merge as planned, merging the main topic branch for the feature does not break the test suite.

Right, given both ASP code and test branches were supposed to be merged together I did not pay much attention to that, my bad.

  • New tests specific to the new features:
    • These should go into their dedicated branch, as you did :)
    • That dedicated branch should follow closely the main topic branch for the feature, to ensure it's actually testing what we want to merge. In the last two months this only happened once, because I did it myself.

That's not true, I've updated my branch myself, at least tried to at every Tails release (and sometimes #14594 branch too), but the merge was a bif mess. Then I was MIA for some time.

I've merged the most recent state of the testing branch I could (anything newer would make it FTBFS) and pushed, so that your topic branch is built & tested on Jenkins again.

Thanks again. I guess now that 3.9 is out I can update my branch.

u wrote:

bertagaz, what should we do with this ticket? Are there still tests that need to be written? Did you push your latest modifications? Does this need review? Please assign the ticket to a reviewer (possibly Alan) if that's the case.

There's one last test that I'm currently finishing to test (recovering from upgrade failure). It should be pushed tomorrow. From what I remember from our last meeting with Alan, we agreed that other scenarios discussed in #14576#note-15 were not really necessary. So per #14596#note-18 that means once this last scenario is pushed, the branch will be ready for review. But beware that without a workaround it will hit the bug in #14598#note-62.

#32 Updated by intrigeri 9 months ago

  • That dedicated branch should follow closely the main topic branch for the feature, to ensure it's actually testing what we want to merge. In the last two months this only happened once, because I did it myself.

That's not true, I've updated my branch myself, at least tried to at every Tails release (and sometimes #14594 branch too), but the merge was a bif mess. Then I was MIA for some time.

I suspect you've missed "in the last two months" in my comment. Once you take it into account, then either my claim is correct or I got my Git command wrong, in which case I'm sorry. Anyway, what's done is done and I was explicitly requesting a workflow change for the future :)

#33 Updated by bertagaz 9 months ago

  • Assignee changed from bertagaz to intrigeri
  • % Done changed from 40 to 50
  • QA Check set to Ready for QA

bertagaz wrote:

There's one last test that I'm currently finishing to test (recovering from upgrade failure). It should be pushed tomorrow. From what I remember from our last meeting with Alan, we agreed that other scenarios discussed in #14576#note-15 were not really necessary. So per #14596#note-18 that means once this last scenario is pushed, the branch will be ready for review. But beware that without a workaround it will hit the bug in #14598#note-62.

Done, so I think it's ready for review now.

#34 Updated by intrigeri 8 months ago

  • Assignee changed from intrigeri to bertagaz
  • QA Check deleted (Ready for QA)

The branch FTBFS on Jenkins, presumably because it's based on the devel branch. Please base your branch on stable. Thanks in advance!

#35 Updated by bertagaz 8 months ago

  • Assignee changed from bertagaz to intrigeri
  • QA Check set to Ready for QA
  • Feature Branch changed from feature/14596-automated-tests-for-ASP-gui to feature/14596-automated-tests-for-ASP-gui-on-stable

intrigeri wrote:

The branch FTBFS on Jenkins, presumably because it's based on the devel branch. Please base your branch on stable. Thanks in advance!

Ok pushed a new branch rebased on stable. It has one scenario failing due to the bug I reported on #14598#note-62. If you want to run the feature sucessfuly, just replace "The upgrade of your additional.." by "The installation of your additional..." on line 145 until it's fixed.

#36 Updated by intrigeri 8 months ago

  • Assignee changed from intrigeri to bertagaz
  • QA Check changed from Ready for QA to Info Needed

Ok pushed a new branch rebased on stable.

Thanks! Any reason to keep the old one?

It has one scenario failing due to the bug I reported on #14598#note-62.

Actually, the same two scenarios failed during the first two runs on Jenkins. Is the second one caused by a bug in the code (as opposed to a test suite bug) too?

#37 Updated by bertagaz 8 months ago

  • Assignee changed from bertagaz to intrigeri

intrigeri wrote:

Ok pushed a new branch rebased on stable.

Thanks! Any reason to keep the old one?

Yes, what I reported on #14598#note-62 did not show up in https://jenkins.tails.boum.org/job/test_Tails_ISO_feature-14596-automated-tests-for-asp-gui-on-stable/3/ so I'd like to see if it depends on running on a branch based on the #14594 branch which is based on devel and has newer commits.

It has one scenario failing due to the bug I reported on #14598#note-62.

Actually, the same two scenarios failed during the first two runs on Jenkins. Is the second one caused by a bug in the code (as opposed to a test suite bug) too?

I've pushed a fix for one of this failure (which I've never seen with the branch based on the newer #14594) and the other one did not show up in the build mentioned above. However, a new one happened in the next build on Jenkins. I'm a bit surprised, first time I see it, and first time it appears in any build in Jenkins. Seems gnome-shell took more time than usual to start. Will have a look.

#38 Updated by intrigeri 8 months ago

  • Assignee changed from intrigeri to bertagaz

However, a new one happened in the next build on Jenkins. I'm a bit surprised, first time I see it, and first time it appears in any build in Jenkins. Seems gnome-shell took more time than usual to start. Will have a look.

Seems that this should remains on your plate for now, then. Please reassign to me for review once Jenkins is happy :)

#39 Updated by bertagaz 8 months ago

intrigeri wrote:

Seems that this should remains on your plate for now, then. Please reassign to me for review once Jenkins is happy :)

Oops sorry, I did not mean to do that.

#40 Updated by bertagaz 8 months ago

Importing here part of #14576#note-29 that were said to better fit in this ticket's comments.

intrigeri wrote:

But wait, I suggest you don't jump on fixing them all immediately: I've just noticed that this branch increases the (non-fragile) test suite runtime by 40% (the ASP feature takes longer than the next 3 features that take most time combined).

Indeed, as I have tried to cover most of the use cases and code base, this feature is now quite long to run, specially since it is not optimized.

I see that you've mentioned this problem — which was smaller back then — on this ticket a few months ago but I don't recall any follow-up discussion on this topic. I'd rather have discussed this collectively months ago during a team meeting, because this would have informed your work (what should be implemented and how) instead of now that you've invested so much time into implementing all the tests already. But what's done is done, here we are today, so let's have this conversation now.

Yeah, it has not been discussed during meetings, and no one took a look at the work in progress. I don't remember there was a meeting at the time this feature has grown so much. Might be that I missed some too.

It's clear to me that this feature cannot justifiably add such a cost (resources, longer feedback loop): IMO 30 minutes would be great; 40 minutes would be good; 50-60 minutes might be acceptable if every test has a well justified cost/benefit; but 1h40 is definitely too much.

Sounds like a more reasonnable timing proposal.

I think we should do this: first, look for what makes it take so much time and for cheap ways to optimize the implementation, without affecting the nature of the tests at this point (currently there's no cheating in place, which is a very good starting point, we'll see if we can stick to it).
If these optimizations cut the cost down to something acceptable, great, we can stop here. Otherwise, for each test: is it worth its current cost? If not, what would be the drawback of taking shortcuts, e.g. merely checking whether the config was updated as expected instead of rebooting? Or shall we remove it entirely or mark it as @too_slow and skip it on Jenkins by default? If you want I can help a bit, e.g. to brainstorm candidate cheap optimizations (once we know what steps take so much time) and to suggest shortcuts (cheating) we could use.

Yes, I think avoiding rebooting in every scenario to check if the package is really installed or removed is a good way to go. Checking that once is probably enough, and then for other scenarios we can just check for the presence or absence of the package in the ASP conf. But from calculations I've made, it only lowers the whole feature run time from 20-25 minutes in Jenkins, which is not enough.

The two scenarios that take the most time to run are the upgrade ones. Now that I think about it, we can probably merge them (after the upgrade failure, check after reboot the old package is still correctly installed, then plug the network and check it's upgraded). That would remove something like 10 minutes from the run time, which would get closer to 60 minutes.

As an indication, here are the (gross) run time numbers I've collected from Jenkins:

  Scenario: I am warned I can not use ASP when I boot Tails from a DVD and install a package
    -> 4mn (no reboot)
  Scenario: I can set up and use ASP when I install a package in a Tails that has no persistent partition
    -> 9mn (reboot needed)
  Scenario: I can install packages in a Tails session with locked down persistence without being annoyed by ASP 
    -> 4mn (no reboot)
  Scenario: Packages I install with Synaptic and add to ASP are automatically installed
    -> 10mn (without rebooting: 6mn)
  Scenario: Packages I uninstall and accept to remove from ASP are not installed anymore
    -> 9mn (without rebooting: 5mn)
  Scenario: Packages I uninstall but don't want to remove from ASP are automatically installed
    -> 9mn (without rebooting: 6mn)
  Scenario: Packages I install but not do not add to ASP are not automatically installed
    -> 9mn (without rebooting: 5mn)
  Scenario: Packages I have installed and added to ASP are upgraded when a network is available
    -> 11mn (reboot needed)
  Scenario: Packages I uninstall through ASP GUI are not installed anymore
    -> 9mn (without rebooting: 5mn)
  Scenario: Recovering in offline mode after ASP previously failed to upgrade a package
    -> 14mn (reboot needed)
  Scenario: I am notified when ASP fails to install a package
    -> 8mn (reboot needed)

-> Actual run time : 96mn
-> Without reboots: 77mn

I'll try with the above proposals to see how it goes reagrding the feature run time.

#41 Updated by bertagaz 8 months ago

  • QA Check changed from Info Needed to Dev Needed

#42 Updated by intrigeri 7 months ago

  • Blocks deleted (Feature #14597: Review automated tests for Additional Software GUI)

#43 Updated by u 7 months ago

What's the status of this ticket and regarding time of running those tests?

#44 Updated by bertagaz 7 months ago

I still need to fix the running time of this feature before passing it to review again. I'll start to do it this week, while working on #14576.

#45 Updated by bertagaz 7 months ago

bertagaz wrote:

Yes, I think avoiding rebooting in every scenario to check if the package is really installed or removed is a good way to go. Checking that once is probably enough, and then for other scenarios we can just check for the presence or absence of the package in the ASP conf. But from calculations I've made, it only lowers the whole feature run time from 20-25 minutes in Jenkins, which is not enough.

I've pushed 53f303a that does just that.

The two scenarios that take the most time to run are the upgrade ones. Now that I think about it, we can probably merge them (after the upgrade failure, check after reboot the old package is still correctly installed, then plug the network and check it's upgraded). That would remove something like 10 minutes from the run time, which would get closer to 60 minutes.

And c35caf8 for that.

For now the build is broken, but once 3.10 is released and tha base branch merged back, we'll how much time we gain with this changes.

#46 Updated by intrigeri 7 months ago

  • Target version changed from Tails_3.10.1 to Tails_3.11

#47 Updated by bertagaz 7 months ago

bertagaz wrote:

For now the build is broken, but once 3.10 is released and tha base branch merged back, we'll how much time we gain with this changes.

After a few builds in Jenkins, with the last changes mentionned above, this feature is now running in a bit more than 60 minutes. Not sure how we can lower it, appart from removing some scenarios.

#48 Updated by bertagaz 7 months ago

  • Assignee changed from bertagaz to intrigeri
  • QA Check changed from Dev Needed to Ready for QA

... And I think regarding this ticket we're good for a first code review. Hold on until I fix the few remaining review items from #14576 though.

#49 Updated by intrigeri 7 months ago

  • Assignee changed from intrigeri to bertagaz
  • QA Check deleted (Ready for QA)

bertagaz wrote:

Hold on until I fix the few remaining review items from #14576 though.

OK. Please reassign to me once I can start then.

#50 Updated by bertagaz 7 months ago

intrigeri wrote:

bertagaz wrote:

Hold on until I fix the few remaining review items from #14576 though.

OK. Please reassign to me once I can start then.

Well, I'm fighting with some memory issues on the isotesters, plus the last scenario being a bit fragile (notification detection at desktop startup is a bit tricky there, especially on isotesters), so you can probably start with #14576 while I fix it in the coming hours.

#51 Updated by intrigeri 7 months ago

  • QA Check set to Dev Needed

(Moved to the right ticket.)

#52 Updated by intrigeri 7 months ago

Follow-ups on the previous review, your fixes and answers: […]

Sorry, wrong ticket. Cleaned up and moved.

#53 Updated by intrigeri 7 months ago

bertagaz wrote:

so you can probably start with #14576 while I fix it in the coming hours.

Done. I'll do a first review of the step definitions tomorrow, as planned. Too bad if it's not fully working yet, I don't want to delay this first review again: if I don't review this tomorrow, you won't be able to fix issues my review may discover in time for the last review in a week, and then I would need to make time for yet another round of review, which I'd rather not to.

#54 Updated by intrigeri 7 months ago

Hi! Here's a first review.

Meta:

  • Yes, this is long but don't worry: the main reason why it's so long is that in most cases, I've spelled out a proposed solution. I've almost picked another approach: instead of writing a loooong review, I push a WIP branch with untested draft commits that implement the proposed solution, the commit messages explaining why I think it's better. That works very well with other team-mates of mine, as a communication tool that provides a much clearer way than a plaintext review to express what I mean. But I decided against it for this time because if picked without talking about it first, this approach has the potential to cause bad feelings like "he's taking over my job, what does that really mean?!". Still, I think this approach would be much more efficient, nicer and less frustrating (e.g. on the Gherkin tickets, on some topics it took a number of roundtrips before we understood each other, and on some of them we're not there yet). It would not take me substantially more time in the simplest cases, and in the worst cases (when we have a hard time understanding each other) it would save both of us quite some time.
  • I only looked at the diff: the branch has 88 non-merge commits on top of current stable, including a lot of buggy attempts and follow-up fixes (that's understandable: for example, when I did similar work on the VeraCrypt test suite, I cleaned up my Git history before pushing every time I got something robust locally, but then making it robust on our CI as well required further fixes, and I preferred not to rewrite already published history; I guess it's the same story here). So if some bits I find unclear are explained in the commit messages, I did not see that, but that does not matter much: as anonym would argue, code should be clear by reading it + its comments, without having to dive into its full Git history.

Regarding the performance issue:

  • Your choice of reboots to skip looks fine to me.
  • The result of merging scenarios is a bit scary but I guess that this optimization makes sense, assuming you've checked that it does save us enough run time to be worth the drawbacks. I doubt we can do better apart of improving the Gherkin in the way I've proposed yesterday.
  • It seems that you skipped "look for what makes it take so much time and for cheap ways to optimize the implementation, without affecting the nature of the tests at this point" and jumped directly to taking shortcuts i.e. skipping reboots. Why not, because it looks like we would have ended up there anyway. But now that we see that the shortcuts are not enough to get the run time down sufficiently, I think it's time to "look for what makes it take so much time and for cheap ways to optimize the implementation", which I did, see next bullet point :)
  • It seems that most scenarios spend more time on their initial setup steps than on doing what they're actually testing, which prompted me to look a bit closer and then I noticed that almost every scenario, after we've added sslh to additional software, takes ~2 minutes waiting for additional software to be installed and upgraded. This seems to be unnecessary. If I understand correctly, this is a side effect from an optimization attempt i.e. avoiding the need to install Tails to USB and set up persistence every time. That's nice but it seems that the common starting point for most scenarios is not the best we could use: it seems that we could rather easily get the feature's run time down by ~15 minutes if we instead used, as a common starting point for most scenarios, what they really need, i.e. a Tails that's started from USB, with persistence enabled, an administration password, but with an empty list of additional software. I don't know whether this shared state could become a snapshot. If you don't want to rework all this stuff in depth, I could live with a cheap hack e.g. remove sslh from the list of additional software after "I set up ASP when installing a package without persistent partition and the package is installed next time I start Tails" is done (this requires a little bit of tweaking & reordering because "Packages I uninstall but refuse to remove from ASP are still automatically installed" really needs sslh to be there; so for example you could run the scenario that adds sshl, then the one that refuses to remove it, ending with a hackish step that really removes it from the list of additional software; and then you'll want to move + adjust the comment about sslh being "configured in ASP"). I'm not sure if this cheap hack is much cheaper than doing it right but I'll let you judge. If you instead pick the cleaner approach, possibly a new checkpoint shall be introduced so we restore a system that's ready.
  • I could find no other obvious optimizations that could be done so if the above suggested optimization is not enough, I think you'll need to explore the other worst case ideas I've mentioned in my initial review in September, starting by checking the cost/benefit of each scenario.

Regarding the implementation:

  • The "service has started" step should be decoupled from low-level implementation details (state files) by directly checking the status of the relevant service, instead of checking a side effect of it being successfully started.
  • In "the Additional Software persistence is correctly configured for package":
    • I think grep #{package} #{asp_conf} does not correctly convey what you mean. Using the --line-regexp and --fixed-strings options will fix that.
    • ls /live/persistence/TailsData_unlocked/apt/cache/ | grep -qs '^#{package}.*\.deb$' could probably be simplified and made more correct this way: ls /live/persistence/TailsData_unlocked/apt/cache/#{package}_*.deb (untested)
    • ls /live/persistence/TailsData_unlocked/apt/lists/ | grep -qs '^.*_Packages$'ls /live/persistence/TailsData_unlocked/apt/lists/*_Packages$
  • asp_conf could be turned into a constant instead of hard-coded twice.
  • (to|from) seems to be unused so use a non-capturing regexp: (?:to|from). Same for (and|with).
  • It feels incorrect that the (refuse|accept) (adding|removing) step checks the resulting config, given its name. Whenever we need to check that config, it should be done explicitly in the scenario's description (which we already do most of the time, if not every time).
  • Reading installed_package.parent.parent.child('Close', roleName: 'push button') and the surrounding lines, I'm not sure what this button is. Is it the "x" button mentioned on https://tails.boum.org/doc/first_steps/additional_software/#index4h1? If yes, I think you found a bug (great!): this button should not be called "Close", which does not make any sense when using a screen reader… or Dogtail ⇒ please report this.
  • The fact that "I remove" checks "is not part of" is not consistent with the Gherkin. Either this step is about the user triggering the change (and then it should not check the result, it's the scenario's responsibility, and it does it already); or this step is about triggering the change and verifying it worked fine (and then the step should be renamed and the explicit test for the result should be removed from the scenario).
  • Please use the heredoc technique (grep EOF) for failing_dpkg_hook so we don't have to fiddle with escaping.
  • In "I prepare the ASP upgrade process to fail":
    • I suspect the -f option passed to rm is not needed and can only cause trouble by hiding problem. Same in I remove the "([^"]*)" deb file.
    • I'm not sure about the test -e #{ASP_STATE_DIR}/installed && construct: do we really want that hook to fail if that file does not exist? If not, instead use test -e #{ASP_STATE_DIR}/installed || true
  • In I remove the "([^"]*)" deb file, I think that #{package}_*.deb would express what you mean more correctly.
  • In "I am warned I can not use ASP when I start Tails from a DVD and install a package":
    • Since "I open the Additional Software documentation from the notification link" tests the result and does not merely trigger the action, s/I open/I can open/. That's the same design issue as in (refuse|accept) (adding|removing) and there are other occurrences of it, e.g. the other "I open" steps, so please take a step back and self-review the full scenario for consistency in this respect (I'm saying "for consistency" because the two possible approaches are valid and we use them both, it's just a matter of picking one and sticking to it).
    • screen.wait('ASPDocumentationInstallCloning': please add the .png extension. I had no clue Sikuli would find the picture even without the extension! In theory it's great but in practice, we sometimes grep for the full file name to check if a picture is still used, which would raise a false positive in this case.
  • In "ASP dpkg hook has been run for package":
    • This step does not check what it claims to: if a notification was displayed, it would not fail.
    • I'm surprised there's no better way to check if a dpkg has been run than to look in the ASP logs.
    • It's unclear why grep #{package} #{asp_logs} is correct: even ignoring for now the fact the regexp is probably too broad, it would be good to clarify, somehow, why this grep will succeed if, and only if, the dpkg hook has been run.
  • try_for(60) { @asp = Dogtail::Application.new( and similar: do we really need a try_for here? Dogtail will retry itself, internally, for a while already. If we do need that workaround for some reason, please explain it in a comment so the next person who reads this code is not left wondering.
  • I see no benefit in having I (un)?install "(.+)" using apt handle both installing and removing: they share no code whatsoever and the if/else structure only makes things more complicated ⇒ please split them back into two different step definitions.
  • In grep -qs '^Status:.*deinstall.*$', between "deinstall" and the final .*, please require blank space or a word boundary.
  • We only support running our test suite on Debian, so dpkg --compare-versions could be run on the host system, no need to $vm.execute_successfully. Also, please add single quotes around the version variables in there.
  • "I start Tails from a freshly installed USB drive with an administration password and the network is plugged" logs in but its title does not say so, which is not consistent with the other similar steps we have.
  • I'm not sure it's worth it to introduce the (and|with) an administration password) clause. It could be the default, no?
  • It feels wrong to introduce a new "I start Tails from a freshly installed USB drive with an administration password and the network is plugged" step: we have plenty of similar checkpoints etc., surely one could be adjusted/expanded to be reused, no?
  • Adding step "the additional software package upgrade service has started" to "I log in to a new session" will make us wait for apt update in most scenarios, in all other features, no? (My concern is that this might be cheap on lizard but not when using a slow or metered Internet connection.)
  • The uninstalled variable name is confusing in the package "([^"]+)" is( not)? installed: we're not testing that it's been uninstalled (in which case dpkg -s would succeed and tell us it's been uninstalled).
  • Please use .failure? instead of ! […] .success?.
  • s/I save and exit from the/I save and exit the/: in English one says "I exit a place"
  • There's no obvious reason why "I save and exit from the Persistence Wizard" should be a step, please turn it into a function.
  • In when /.*live-additional-software.conf$/, missing "/" between .* and live-.
  • The semantics of file_empty? feel wrong: when the file does not exist, this should not return true but rather raise an exception, which would happen, I guess, if you dropped all the code except return file_content(file).empty?.

Non-blockers, only if you feel like polishing:

  • Some timeouts seem vastly exaggerated, e.g. waiting up to 2 minutes for "is not part of Additional Software persistence configuration", which merely removes a line from a text file. Of course, it's only problematic when things go wrong and one has to wait longer than needed between each iteration of the debugging process.
  • We're starting too have a number of $vm.execute("touch […]") all over the place. This calls for a file_touch helper (which would need to take a :user parameter). Same for $vm.execute("rm […]"). BTW it's great that you added file_empty?.
  • Wrt. DEBIAN_PRIORITY=critical apt -y remove: can debconf ask questions on removal? If not, dropping this env var would avoid confusion.

So overall, great job! Except of course the trouble you have with code/Gherkin design and regexps (that we were already aware of), most of the above is about clean ups and fixing small issues that you could probably have done yourself with a good self-review (so perhaps I did my review too early).

#55 Updated by bertagaz 7 months ago

intrigeri wrote:

bertagaz wrote:

so you can probably start with #14576 while I fix it in the coming hours.

Done. I'll do a first review of the step definitions tomorrow, as planned. Too bad if it's not fully working yet, I don't want to delay this first review again: if I don't review this tomorrow, you won't be able to fix issues my review may discover in time for the last review in a week, and then I would need to make time for yet another round of review, which I'd rather not to.

Good, I was going to propose you to do this first review in any case. Most of the feature is ready. I'm just having trouble with the last scenario and how to catch the notification but the rest of the feature is OK.

#56 Updated by intrigeri 7 months ago

Note to myself: last review was at d8d896854c.

#57 Updated by bertagaz 7 months ago

Update: as of now I've done most of the fixes coming from the review.

Here's some comments about some parts of the review:

intrigeri wrote:

It's unclear why grep #{package} #{asp_logs} is correct: even ignoring for now the fact the regexp is probably too broad, it would be good to clarify, somehow, why this grep will succeed if, and only if, the dpkg hook has been run.

Well, if the hook is not run, asp is not started for this package, and so its name is not in the logs.

intrigeri wrote:

In "ASP dpkg hook has been run for package":
This step does not check what it claims to: if a notification was displayed, it would not fail.

Right. Do you suggest to rename this step and remove the notification part of its name? Does that mean it should be removed from the scenario name as well, or we can rely on the fact that if the hook notice the persistence is locked, we know that ASP won't notify and it's enough?

intrigeri wrote:

I'm surprised there's no better way to check if a dpkg has been run than to look in the ASP logs.

I have not really checked, but I don't know of a way to know if a dpkg hook is run, apart from looking at its own logs.

intrigeri wrote:

Adding step "the additional software package upgrade service has started" to "I log in to a new session" will make us wait for apt update in most scenarios, in all other features, no? (My concern is that this might be cheap on lizard but not when using a slow or metered Internet connection.)

Yes it does. OTOH that's also what we're doing for other steps like the upgrader, and this step is now part of the Tails boot process when online. Maybe it's overkill, but it made sense to me to have it plugged there. Also, given there's no package configured (nor ASP) in most scenarios, the impact should be low.

intrigeri wrote:

It feels wrong to introduce a new "I start Tails from a freshly installed USB drive with an administration password and the network is plugged" step: we have plenty of similar checkpoints etc., surely one could be adjusted/expanded to be reused, no?

We don't really have one that fits to be adapted, so we should right a brand new snapshot for just one case. I was not sure we'll need it for something else, so I've chosen not to create it yet.

intrigeri wrote:

I'm not sure it's worth it to introduce the (and|with) an administration password) clause. It could be the default, no?

Well, this step is used elsewhere (in the usb feature for example) without the use of the administration password, so I don"t think we can do that.

intrigeri wrote:

We're starting too have a number of $vm.execute("touch […]") all over the place. This calls for a file_touch helper (which would need to take a :user parameter). Same for $vm.execute("rm […]"). BTW it's great that you added file_empty?.

I won't do that I think given the timeline, but I'll open a ticket about it.

Left to be done:

  • Run time optimization suggested in the review
  • Last scenario is still fragile, it's not easy to catch the notification failure at the start of the desktop session. But there's a simple fix: restart the ASP installation service when the desktop has settled.
  • "x" button wrong "close" label to report
  • take a step back and self-review the full scenario for consistency on the "I open" steps.

Hopefully I should have fixed that before tomorrow noon/start of the afternoon. These are not big code changes, so if you wish to review what was done at that point, feel free to go on.

#58 Updated by intrigeri 6 months ago

Hopefully I should have fixed that before tomorrow noon/start of the afternoon. These are not big code changes, so if you wish to review what was done at that point, feel free to go on.

Thanks for the udpate. I'd rather review something closer to the final version so I'll focus on other work this morning and will come back to it this afternoon (either to review if the ticket is Ready for QA, or otherwise to answer the questions you've asked).

#59 Updated by bertagaz 6 months ago

intrigeri wrote:

Hopefully I should have fixed that before tomorrow noon/start of the afternoon. These are not big code changes, so if you wish to review what was done at that point, feel free to go on.

Thanks for the udpate. I'd rather review something closer to the final version so I'll focus on other work this morning and will come back to it this afternoon (either to review if the ticket is Ready for QA, or otherwise to answer the questions you've asked).

OK, we're past 'start of the afternoon' by far, and I've hit some bugs while implementing the review fixes. At the moment the last scenario is still fragile and I'm not sure why yet, but is the last remaining part. I'm testing by bumping the time to wait for the installation failure notification, but I'll have to leave soon and I've already spent (more than) the day on this. Too bad, seems than less than a week is too short for such huge reviews. I'll resume next week and will hopefully quickly fix that.

#60 Updated by bertagaz 6 months ago

Test runs are broken in Jenkins, so I can't have feedbacks from runs there, but I'm fixing things locally. In case some wondered without asking, I hope to put this ticket RfQA this week.

#61 Updated by bertagaz 6 months ago

bertagaz wrote:

Test runs are broken in Jenkins, so I can't have feedbacks from runs there, but I'm fixing things locally. In case some wondered without asking, I hope to put this ticket RfQA this week.

Tests are now passing locally, waiting for #16123 to see how it goes in Jenkins.

#62 Updated by bertagaz 6 months ago

bertagaz wrote:

bertagaz wrote:

Test runs are broken in Jenkins, so I can't have feedbacks from runs there, but I'm fixing things locally. In case some wondered without asking, I hope to put this ticket RfQA this week.

Tests are now passing locally, waiting for #16123 to see how it goes in Jenkins.

It raised a fragility in the APT uninstallation step, that I've seemed to have fixed. One run show a weird behavior in the last scenario (packge successfully installed once, despite having been removed), I'm testing a fix. That's the last remaining so I should be able to put this ticket in RFqA soon (at least).

Run time is now around 55 minutes. If that's too long, I thought we could merge the 4 scenarios about accepting/refusing to add or remove the package from ASP configuration into one big. That would turn the 4 different Tails boot (one per scenarios) in only one, which should remove an additional 15 minutes of running time.

#63 Updated by bertagaz 6 months ago

bertagaz wrote:

bertagaz wrote:

bertagaz wrote:

Test runs are broken in Jenkins, so I can't have feedbacks from runs there, but I'm fixing things locally. In case some wondered without asking, I hope to put this ticket RfQA this week.

Tests are now passing locally, waiting for #16123 to see how it goes in Jenkins.

It raised a fragility in the APT uninstallation step, that I've seemed to have fixed. One run show a weird behavior in the last scenario (packge successfully installed once, despite having been removed), I'm testing a fix. That's the last remaining so I should be able to put this ticket in RFqA soon (at least).

The last runs in Jenkins seems to confirm the local ones about robustness.

Run time is now around 55 minutes. If that's too long, I thought we could merge the 4 scenarios about accepting/refusing to add or remove the package from ASP configuration into one big. That would turn the 4 different Tails boot (one per scenarios) in only one, which should remove an additional 15 minutes of running time.

I've just done that (merge the ASP list accept/refuse adding/removing steps into one). It shortens the running time in Jenkins down to 45 minutes (see last jenkins build, aborted after the feature to reboot lizard).

That's the longest feature we have, but 1) it covers every use cases for this feature, 2) it replaces most of the APT feature. We're now at the point about the acceptability of the running time of this feature.

Introducing snapshots now will take quite some rewrite in the scenarios inter-dependencies, and also will mostly add one-time-only snapshot banches, reused nowhere else.
Another way to reduce the feature running time is removing scenarios. Only three of them qualify. I'll let the reviewer decide. They add to the whole feature coverage.

Right now I'm waiting for yet another Jenkins reboot to have more stats before setting this feature to RfQA.

#64 Updated by bertagaz 6 months ago

  • Assignee changed from bertagaz to intrigeri
  • QA Check changed from Dev Needed to Ready for QA

bertagaz wrote:

The last runs in Jenkins seems to confirm the local ones about robustness.

They still do. Since first Jenkins build on the last change in the branch.

Runtime is stabilizing at a bit less than 45 minutes.

Right now I'm waiting for yet another Jenkins reboot to have more stats before setting this feature to RfQA.

There are two issues, but they are not really related to the code of this feature:

It seems that the lizard isotesters memory is now getting a bit tight for this feature to run. Some scenarios are quite long (specially the Recovering in offline mode after... one . Build 72 and 73 in Jenkins show test suite processes lacking memory left to be started (dpkg). We have some memory left on lizard we can probably pick a bit to spread on the isotesters.

From time to time, there are FirewallAssertionFailedErrors for IPv6 packets. They don't happen so often, so they are hard to track. Jenkins run 70 exposes such a failure. From Jenkins test history, they seem to appear sparsely only in this feature and in the Recovering in offline mode after... scenario only. According to wireshark they are IPv6 "router sollicitation" and "neighboor sollicitation" packets, using IPv6 multicast addresses (which we can not really know what NIC it was related too). I'm investigating, but I don't think it is related to a problem in Tails, nor in this particular feature, but more likely in the way we filter traffic in the test suite.

So as of now, I consider the ASP feature code by itself to be over on my side. I don't know how to proceed with the last issues above. I'll investigate on the second, maybe we can discuss if we want to bump lizard isotesters memory? In the meantime, do you want to look at the code?

#65 Updated by intrigeri 6 months ago

  • Assignee changed from intrigeri to bertagaz
  • QA Check changed from Ready for QA to Dev Needed

I've reviewed the git log -p f4eeea3bd04156296f5eabeef203789013fe677d... I'm glad to see lots of cleanups and bugfixes. I have a few comments about the changes I see. I think all of them can be trivially fixed:

  • I encourage you to clean up the history of your new commits between review rounds in the future: things like 2b20b5f7a87d042e4847c68c25571340efabf420 make some future analysis needlessly painful (not mentioning reviewing but that's a one-time cost so well). This can also be an opportunity to fix typos in commit messages.
  • At least some of the newly introduced $vm.execute('systemctl instances should instead use $vm.execute_successfully. Same for a $vm.execute("rm in When /^I remove the.
  • /^\s{2}Installed:\s\(none\)$/.match(state) != nil feels like busywork when you could do /^\s{2}Installed:\s\(none\)$/.match(state)? which is tailored exactly for this purpose.
  • The API for is_installed? is unusual; surprising API leads to bugs in its consumers: in Ruby a predicate (bla?) returns true or false and raise exceptions only in, well, exceptional circumstances, whenever it cannot do its job (returning the correct boolean value) at all. Same for is_not_installed?. Besides, the "wait for X to become true" logic has no business in predicates called this way.
  • The need for 1237235f5af0b51e867107f2f86d21f1d9f3d687 is concerning. We're acting directly on ext4 so we cannot even blame the usual culprit for non-POSIX-compliant filesystem behavior i.e. aufs. So either ext4 has a serious bug here, or there's some confusion on our side. But wait, I see that you committed this just minutes after a9d91284245ed9bcdddd47af19d3301afbf7e104. The issue fixed by that other commit could very well explain the failure mode 1237235f5af0b51e867107f2f86d21f1d9f3d687 describes. So I suspect that 1237235f5af0b51e867107f2f86d21f1d9f3d687 is unnecessary, and actually a no-op, now that a9d91284245ed9bcdddd47af19d3301afbf7e104 fixed the root cause of the problem.

Other issues noticed in other ways:

  • systemctl status tails-additional-software-install.service is quite noisy in the test suite logs. We could give it >/dev/null but that may make debugging harder, so instead I suggest adding :delay => 10: if that service is going to do anything, it's going to take minutes, so checking every one second is a tad too intense. In the very worst case, this change will add 10s to every scenario that has a non-empty live-additional-software.conf, which is negligible in front of the total run time of the feature. And in practice I suspect that not querying the Journal via systemctl every second will make installing packages faster (I mean, 2 minutes to install sslh, one wonders what's going on, no?), which may compensate this.
  • It seems to me that apt.feature needs to be removed from the list of "Features using temporary snapshots".

I've just done that (merge the ASP list accept/refuse adding/removing steps into one). It shortens the running time in Jenkins down to 45 minutes (see last jenkins build, aborted after the feature to reboot lizard).

That's the longest feature we have, but 1) it covers every use cases for this feature, 2) it replaces most of the APT feature. We're now at the point about the acceptability of the running time of this feature.

I'm fine with these 5 scenarios being merged via 918c7a11e5a3a4db07cbdec9a69eafec5ca0fbd2. Truth be told, I was pretty scared by the idea initially when you mentioned this here, but my guts feeling is that the resulting Gherkin is something we can manage given the substantial performance benefits. If this tweak ever makes this code too expensive to maintain, we'll need to reconsider.

As you the run time of this feature is now around 40 min, which is good according to the expectations I've set 2 months ago! So case closed here :)

Introducing snapshots now will take quite some rewrite in the scenarios inter-dependencies,

It's unfortunate that we're stuck like that and realized it too late. I'll need us to reflect about the root cause of this problem (there's been a serious issue wrt. how our team has been following progress and discussing identified issues here), as part of the evaluation process our team lead has started.

and also will mostly add one-time-only snapshot banches, reused nowhere else.

True, but… so what? That's why we have temporary snapshots, no?

#66 Updated by intrigeri 6 months ago

There are two issues, but they are not really related to the code of this feature:

It seems that the lizard isotesters memory is now getting a bit tight for this feature to run. […]
Build 72 and 73 in Jenkins show test suite processes lacking memory left to be started (dpkg).

Indeed, grep ENOMEM test_Tails_ISO_*/builds/2018*/archive/build-artifacts/debug.log gives 32 occurrences, all of them on this branch: avconf, tcpdump, and dpkg. This started in the very first run on this branch a month ago.

Is there an explanation as to why this branch increases the memory requirements of the test suite?

For example, I see that you're allocating some variables as instance variables while they could be local e.g.:

@asp_gui
@gedit

Regardless of resources management, this should be fixed at least to make the code closer to what you mean. But I doubt it fully explains the increased memory consumption on the host. Another possible explanation is that this feature might be the one with the biggest amount of network traffic, which might make the packet sniffer mechanism more memory hungry even though we instruct tcpdump to write to a file.

Once you've figured out if there's an explanation for the increased memory usage, if your code changes require infra changes, then it's part of your job to request them. And coincidentally you're the sysadmin on duty so you can do the magic hat swap trick and solve the problem quite easily :)

From time to time, there are FirewallAssertionFailedErrors for IPv6 packets. They don't happen so often, so they are hard to track. Jenkins run 70 exposes such a failure. From Jenkins test history, they seem to appear sparsely only in this feature and in the Recovering in offline mode after... scenario only. According to wireshark they are IPv6 "router sollicitation" and "neighboor sollicitation" packets, using IPv6 multicast addresses (which we can not really know what NIC it was related too). I'm investigating, but I don't think it is related to a problem in Tails, nor in this particular feature, but more likely in the way we filter traffic in the test suite.

I see this one <OpenStruct mac_saddr="50:54:00:6b:d4:37", mac_daddr="33:33:00:00:00:02", protocol="ipv6", sport=nil, dport=nil, saddr="fe80::5254:ff:fe6b:d437", daddr="ff02::2">. It happened on isotester5 and that source MAC address matches none of that VM's so I don't see where it could have come from except from the Tails under test. So this looks like a potentially serious bug in Tails. You don't particularly have to investigate yourself here; feel free to but in any case, please save the artifacts before they're garbage collected; and if your investigation does not reach a conclusion within a few days, please file a ticket about it (we'll handle this as FT or test suite maintenance).

There's another bug here (either in our CI infra or in the test suite itself): https://jenkins.tails.boum.org/view/Tails_ISO/job/test_Tails_ISO_feature-14596-automated-tests-for-asp-gui-on-stable/70/cucumberTestReport/additional-software/recovering-in-offline-mode-after-additional-software-previously-failed-to-upgrade-and-then-succeed-to-upgrade-when-online/ pretends that the affected test passed. Thankfully Jenkins marked the job as failed anyway, but still: please check if that's a test suite bug or a Jenkins plugin one and file a ticket about it.

#67 Updated by intrigeri 6 months ago

Ooops, I forgot to read the discussion that followed my previous review. I'll do it later today.

#68 Updated by intrigeri 6 months ago

bertagaz wrote:

intrigeri wrote:

It's unclear why grep #{package} #{asp_logs} is correct: even ignoring for now the fact the regexp is probably too broad, it would be good to clarify, somehow, why this grep will succeed if, and only if, the dpkg hook has been run.

Well, if the hook is not run, asp is not started for this package, and so its name is not in the logs.

I understand that this implicitly relies on a happy coincidence i.e. the fact that nothing else in that log file matches the name of the packages passed to this step (currently: "makepp" only). Please make it reliable and if that's too hard for some reason, at least document this constraint so that consumers of this step, if they bother looking at its implementation instead of trusting what's written on the box, can take it into account.

intrigeri wrote:

In "ASP dpkg hook has been run for package":
This step does not check what it claims to: if a notification was displayed, it would not fail.

Right. Do you suggest to rename this step and remove the notification part of its name? Does that mean it should be removed from the scenario name as well, or we can rely on the fact that if the hook notice the persistence is locked, we know that ASP won't notify and it's enough?

I've committed to take over reviewing only, so I'd rather do the thinking myself here.

I'm satisfied with the other answers found in the comment I'm replying to. Thank you!

This time I think I'm done with the review of the current state of the branch + follow-ups of my previous review.

#69 Updated by bertagaz 6 months ago

I've fixed what was possible. Only remaining is the last note from intrigeri, and fix the memory issue. It should be done by the end of the day.

#70 Updated by bertagaz 6 months ago

I've fixed intrigeri's last note remark and I'm trying to fix the memory issue. With the infra being worked on and having disruptions as announced today, it may take a bit more time.

#71 Updated by CyrilBrulebois 5 months ago

  • Target version changed from Tails_3.11 to Tails_3.12

#74 Updated by bertagaz 5 months ago

bertagaz wrote:

I'll do that tomorrow.

Done, I've added 400Mo to the /tmp/TailsToaster partitions on all of lizard's isotesters. We'll see in the coming days if that's enough. It should be looking at the test suite error messages.

#75 Updated by bertagaz 5 months ago

  • Assignee changed from bertagaz to intrigeri
  • QA Check changed from Dev Needed to Ready for QA

bertagaz wrote:

bertagaz wrote:

I'll do that tomorrow.

Done, I've added 400Mo to the /tmp/TailsToaster partitions on all of lizard's isotesters. We'll see in the coming days if that's enough. It should be looking at the test suite error messages.

Ok, since then the memory issue is gone and appart from some runs where failures came from other problems, the feature runs fine, so I tink it's ready for the hopefully last review.

#76 Updated by intrigeri 4 months ago

  • Assignee changed from intrigeri to bertagaz
  • QA Check changed from Ready for QA to Dev Needed

Just two things this time, woohoo! Happy to merge once they've been resolved. For one of those, that was identified during my last review but not fixed yet, I ended up debugging the problem myself, I think I've found the bug and I'm proposing a solution (took me 1.25h, most of it to explain things below, just like I would have done in a commit message); I hope this will be sufficient to address the problem. The other thing is about clarifying a code comment.

As said in my first review (#14596#note-65), I'm concerned by 19a6c781ac399c640778f7ed0ddf00b199ff659a (which reintroduced the problematic code from 1237235f5af0b51e867107f2f86d21f1d9f3d687). I don't know if this hack papers over a bug in these new tests, in the Additional Software code, in ext4, in rm, or somewhere else. But I'm sure it does paper over something. So I looked a bit closer and I found that the code in I disable the Additional Software installation service does not do what you think; it actually does nothing at all. As per the systemctl(1) manpage, "This command will print information about the file system operations (symlink removals) executed". But if you look at the debug log for this feature, you'll see call returned: [0, "", ""], which suggests that nothing was disabled. And indeed, that system-wide service is not "enabled" in Tails: instead it's started by the corresponding user unit (config/chroot_local-includes/usr/lib/systemd/user/tails-additional-software-install.service), which this test leaves as-is, so it'll do its job and start the system-wide tails-additional-software-install.service, regardless of the systemctl disable no-op in the test suite. So, I remove the "cowsay" deb files from the APT cache is racing against the startup of the tails-additional-software-install.service user unit, which is started as part of amnesia's systemd --user session, during the login process. I'm not surprised that it occasionally loses the race. I have no idea why 19a6c781ac399c640778f7ed0ddf00b199ff659a helps it win the race more often in your experience, but whatever, this does not matter anymore. I think the real fix is along the lines of:

  1. pass --global to systemctl disable so it does what you intended
  2. drop the systemctl enable part, that does nothing at all (as the debug log shows)
  3. the start part can remain unchanged, at least at this stage; see below though
  4. remove the confusing code again, rebuild our confidence in ext4 and rm, and be satisfied that things don't fail anymore for reasons nobody is able to explain
  5. look into the "Trying to catch the notification at desktop startup is racy" comment; I think it was racy only because of the bug I've identified today; and once that bug is fixed, likely we can do this:
And I enable persistence
And I remove the "cowsay" deb files from the APT cache
And I log in to a new session
And all notifications have disappeared
Then I see the "The installation of your additional software failed" notification after at most 300 seconds

… and remove all the enable/disable/start code, which seems to have been there only to workaround the aforementioned bug. As a result, we'll test the code path that's run in an actual Tails, instead of a slightly different one, which is always good news :)

Makes sense?

Second thing, about 0e3d4eb476bc60d764bf03a97655ca7c802dff8b + c4d5ce6bfed08460efc0512d1a6860a4ca64caa5, I've been confused by the comment and commit message; I had to spend quite some time reverse-engineering what I think you meant, so that it made sense. So I'm requesting some clarifications so that nobody has to go through this again:

  • prioritized_features is about disk space (/tmp/TailsToaster). On Jenkins, we happen to mount a tmpfs on this directory, but that's not obvious to test suite developers, who may have a hard time understanding your comment about "memory footprint". So if you're really talking about what this code calls "disk space", please say so instead of introducing new terminology that can be confusing in this context.
  • I've spent some time trying to understand how "a disk that can be reused" justifies adding this feature to this list, and failed. Please explain in the comment or rephrase :)
  • What feature is "the later feature below" referring to? Please clarify this in the comment itself.
  • I see the commit message says it "may help". I'll assume that you've actually measured that these two commits, combined, do optimize the peak disk usage.

#77 Updated by intrigeri 4 months ago

  • Assignee changed from bertagaz to u

#81 Updated by u 4 months ago

  • Assignee changed from u to intrigeri
  • Target version changed from Tails_3.12 to Tails_3.13

#87 Updated by intrigeri 4 months ago

  • Feature Branch changed from feature/14596-automated-tests-for-ASP-gui-on-stable to test/14596-automated-tests-for-ASP-gui

#88 Updated by intrigeri 4 months ago

  • Description updated (diff)

(All this was done by bertagaz already :)

#89 Updated by intrigeri 4 months ago

  • % Done changed from 50 to 60

Here's a status update.

I've fixed the issues I had intentified + a few more smaller things.

Once I had removed a workaround (because I could not understand the corresponding explanation so that was the only way for me to understand what problem it was about), I've noticed there were indeed some potential race conditions wrt. notification handling, so I've looked closer. I've noticed that our test suite toolage will find notifications even when they are not currently visible to the user (and even sometimes after they've been cleared). But Dogtail will fail to interact with hidden notifications. So I've reviewed in details all how these new tests handle notifications and:

  • I think that every step that needs to interact with notifications now runs in a context when it's guaranteed no other notification is displayed (which would prevent the one we want to interact with from being visible and usable) and we have not hidden ourselves the notification we want to interact with (except in one case, that I've documented with a comment). Sometimes it is obvious that this constraint is met, sometimes it requires deeper analysis. Sometimes it's obvious that a step has anything to do with notifications, sometimes it's not (e.g. "Tor is ready" runs "the Additional Software upgrade service has started" which itself runs "I am notified that the installation succeeded", i.e. de facto, "Tor is ready" checks that the "Additional software installed successfully" notification is displayed; not quite obvious). I've found all this hard to debug and reason about but fixing these semantics would need a major revamping of the very foundations of this branch, which I'm not ready to do.
  • Some other steps merely check if a notification was "displayed" (as in: Dogtail can find it, even if it never shows up on a video recording). Those ones are fine. It would of course be better to ensure the notification is really shown to the user but this would require a serious rework of how we handle notifications. Given the rest of our test suite has set the expectations like that, it would be unfair to expect people working on this ticket to meet higher expectations. Calling it good enough :)
  • I've replaced the main workaround we had in place wrt. notification handling with something I find easier to reason about (and the workaround itself was buggy, which suggests it was not easy to get right for others either). As a side effect, we now test a codepath closer to what happens in production.

The release process is ongoing so now is not a good time to monopolize Jenkins. Once 3.12 is out and I'm back from holiday, I'll have this test suite run enough times on Jenkins. Then I'll fix robustness issues, if any, rinse and repeat. Finally, once I've grown confident the tests are robust, I'll submit this for QA (I'm hoping anonym can review the diff, he might spot issues I've not noticed).

#90 Updated by u 4 months ago

Ack, thanks for the report.

#91 Updated by intrigeri 4 months ago

  • Assignee changed from intrigeri to anonym
  • QA Check changed from Dev Needed to Ready for QA

intrigeri wrote:

Once 3.12 is out and I'm back from holiday, I'll have this test suite run enough times on Jenkins. Then I'll fix robustness issues, if any, rinse and repeat. Finally, once I've grown confident the tests are robust, I'll submit this for QA (I'm hoping anonym can review the diff, he might spot issues I've not noticed).

Among 6 test suite runs on our shared Jenkins and 1 on my local Jenkins, features/additional_software_packages.feature passed every time except once that failure seems to be caused by an unrelated problem (the fact Tor Browser opens at all shows that the notification was "displayed" and the link in there successfully clicked).

anonym, could you please review this branch? If not, I'll ask kibi.

Notes to the reviewer:

  • The history of this branch contains all kinds of drafts, non-working attempts etc. (181 non-merge commits on top of devel) so I suggest you look only at the diff, and inspect the commit history only if you hope to find some explanation in a commit message.
  • Similarly, the history of this ticket contains lots of now obsolete discussions. No need to read it in full.
  • Please clock this as SponsorW work.
  • bertagaz did most of the work over the course of the last 11 months. I was recently tasked to complete it in order to cut down the number of review rounds and ensure a branch I find good enough is ready within a somewhat reasonable time frame. I'll deal with blockers so reassign to me if you find any.
  • If you're surprised by how notifications are handled, I am too. But that's a pervasive issue in our test suite, not a new one brought by this branch (see #14596#note-89 for details).

Thanks in advance! :)

#92 Updated by intrigeri 3 months ago

@anonym, I realized a bit too late that perhaps assigning to you with a target version was not enough to make this particularly visible on your radar. So here we go :)

#93 Updated by anonym 3 months ago

intrigeri wrote:

anonym, could you please review this branch?

Absolutely! It's rare that I get the opportunity to review a large chunk of test suite code. :)

anonym, I realized a bit too late that perhaps assigning to you with a target version was not enough to make this particularly visible on your radar. So here we go :)

I love these notifications! (Except that they trigger when quited, so I removed the @ in the above to avoid it... but I can live with that!)

#94 Updated by intrigeri 3 months ago

  • Blocks Bug #16461: Backup persistence.conf before modifying it in t-p-s added

#95 Updated by anonym 3 months ago

  • Assignee changed from anonym to intrigeri
  • % Done changed from 60 to 70
  • QA Check changed from Ready for QA to Dev Needed

Wow, great work everybody! So far I haven't tested this branch (mostly because it is awkward for me due to #15460) but below you'll find my code review. I'll wait with the testing until my concerns are fixed, but I might have a look at this meanwhile:

  # For some reason the below two steps fail. Dogtail can not find the Firefox
  # application.
  #try_for(60) { @torbrowser = Dogtail::Application.new('Firefox') }
  #step '"Install from another Tails" has loaded in the Tor Browser'

... which I find strange.

Code review

+ # This scenario also sets up the "__internal" drive that the following
+ # scenarios will reuse.
+ Scenario: I set up Additional Software when installing a package without persistent partition and the package is installed next time I start Tails
+ Given I start Tails from a freshly installed USB drive with an administration password and the network is plugged and I login

I find it a bit ugly that we use __internal for this purpose in this feature but I get why: otherwise this scenario would be very long, having to "unroll" the 'I start Tails from ...' step and the snapshot it restores. But I almost think it still is worth it because it is just a coincidence (at least in the sense that it wasn't part of the snapshot system's design) that the implementation can handle this. OTOH, it could be a pain to sync this scenario when e.g. the snapshot it "unrolls" changes since this scenario probably want those changes too. What do you think?

Any way, I looked at the commit history and saw e.g. 7b70b3b3 where concerns are raised about this, so let me just reassure us that all is ok: the __internal disk (or any disk, really) will keep its state between scenarios (and features, FWIW) until one of its snapshots are restored. I guess it would be good to document this in this comment and the corresponding one in usb_upgrade.feature.

+ Then I am proposed to create an Additional Software persistence for the "sslh" package

I find the article ("an") before "Additional Software persistence" confusing/superfluous. Shouldn't we just drop it?

+ And I create the Additional Software persistence

Same again about the article ("the"). Also I would prefer "enable" over "create". I also think the name is too generic given that it will only work in a very specific context. I propose: "I accept to enable Additional Software persistence".

+ And the package "cowsay" installed version is "3.03+dfsg2-1" after Additional Software has been started

Nitpick: this doesn't sound grammatically correct. Proposal: the installed version of package "cowsay" is ...

+ # Prevent the "Warning: virtual machine detected!" notification from racing
+ # with the one we'll be interacting with below.
+ And I disable the tails-virt-notify-user.service user unit

Interesting! Maybe we should always do this to reduce notification races/issues, except for a dedicated scenario testing that it is displayed? What do you think? Would be a new ticket, obviously.

+ $vm.execute("ls /live/persistence/TailsData_unlocked/apt/cache/#{package}_*.deb").success?
+ $vm.execute("ls /live/persistence/TailsData_unlocked/apt/lists/*_Packages").success?

These are essentially NOOP:s -- $vm.execute will happily return false without throwing exceptions, and we don't act upon the returned values! Instead of caring about the return value, either use VM::file_exist? and make it support globs, or just use $vm.execute_successfully (and drop .success? since we only care about exceptions).

  1. Tell the upgrade service check step not to run
    $vm.execute("touch #{ASP_STATE_DIR}/doomed_to_fail")

Let's use $vm.execute vs $vm.execute_successfully consistently, i.e. use $vm.execute only when failure is expected/handled.

Actually, I see quite a few similar instances of $vm.execute added/remaining in apt.rb -- let's fix them too (at least the ones added)!

try_for(60) { gedit = Dogtail::Application.new('gedit').child("log [Read-Only] (#{ASP_STATE_DIR}) - gedit", roleName: 'frame') }

Please drop the assignment of gedit since it isn't used.

+def check_for_installation(package)

[...]

+def check_for_removal(package)

Basing the names on "check" is too vague. What about wait_for_... or something more specific?

+ ! $vm.execute("dpkg -s #{package}").success?

Nitpick: I would prefer dropping the ! and use .failure?. Of course, we violate this all over the place so feel free to ignore. :)

+ apt_pref = 'Package: cowsay
+Pin: release o=Tails,a=asp-test-upgrade-cowsay
+Pin-Priority: 999'

IMHO a here-document would look slightly prettier:

  apt_pref = <<-EOF
Package: cowsay
Pin: release o=Tails,a=asp-test-upgrade-cowsay
Pin-Priority: 999
  EOF

#96 Updated by intrigeri 3 months ago

@anonym, thanks a lot for the great code review :)))

I'll try to address the most severe issues you've spotted in time for 3.13, worst case for 3.14. We'll see what we do about the ones with a less interesting cost/benefit (keeping in mind that everyone here is vastly overclocked already, some of us on work they had not planned to do in the first place).

#97 Updated by anonym 3 months ago

intrigeri wrote:

I'll try to address the most severe issues you've spotted in time for 3.13, worst case for 3.14.

Sounds good!

We'll see what we do about the ones with a less interesting cost/benefit (keeping in mind that everyone here is vastly overclocked already, some of us on work they had not planned to do in the first place).

Agreed!

That said, I did investigate the Dogtail issue, and as I suspected it is a real bug: #16475

#98 Updated by anonym 3 months ago

anonym wrote:

That said, I did investigate the Dogtail issue, and as I suspected it is a real bug: #16475

Solved! Let's remember to uncomment the Dogtail code added in d2a67e7f7bcafdd22186a91f3b076a01797d1107 and remove ASPDocumentationInstallCloning.png.

#99 Updated by intrigeri 3 months ago

Solved! Let's remember to uncomment the Dogtail code added in d2a67e7f7bcafdd22186a91f3b076a01797d1107 and remove ASPDocumentationInstallCloning.png.

Right! Done locally.

#100 Updated by intrigeri 3 months ago

Hi!

So far I haven't tested this branch (mostly because it is awkward for me due to #15460)

@anonym, FWIW I've worked around this problem recently by running the test suite in a dedicated Stretch VM and I feel somewhat dumb to have waited for months before I did that: setting this up didn't take less time because I've waited, and the more I've waited, the more I've suffered from my inability to run the test suite easily, and the more time I've wasted on doing it differently.

+ # This scenario also sets up the "__internal" drive that the following
+ # scenarios will reuse.
+ Scenario: I set up Additional Software when installing a package without persistent partition and the package is installed next time I start Tails
+ Given I start Tails from a freshly installed USB drive with an administration password and the network is plugged and I login

I find it a bit ugly that we use __internal for this purpose in this feature but I get why: otherwise this scenario would be very long, having to "unroll" the 'I start Tails from ...' step and the snapshot it restores.

But I almost think it still is worth it because it is just a coincidence (at least in the sense that it wasn't part of the snapshot system's design) that the implementation can handle this. OTOH, it could be a pain to sync this scenario when e.g. the snapshot it "unrolls" changes since this scenario probably want those changes too. What do you think?

This hack is not so much about simplifying scenarios as it is about speed: IIRC this was part of the optimizations needed to get the runtime of these new test cases down to something acceptable. Setting up persistence and fetching APT lists the first time takes a while (especially on slow Internet connections for the latter). I've suggested looking deeper into this (for performance reasons initially) in my own review but the problem was spotted too late in the dev process so this was dismissed.

Any way, I looked at the commit history and saw e.g. 7b70b3b3 where concerns are raised about this, so let me just reassure us that all is ok: the __internal disk (or any disk, really) will keep its state between scenarios (and features, FWIW) until one of its snapshots are restored. I guess it would be good to document this in this comment and the corresponding one in usb_upgrade.feature.

Documented this here and in usb_install.feature.

I'm not sure what you mean with "and the corresponding one in usb_upgrade.feature"; I suspect you were looking at a version of the code that predates the changes done as part of the USB image project, and now that this got merged things are a bit different now. Unless you mean adding a comment about the "old" drive set up by "Scenario: Installing an old version of Tails to a pristine USB drive"?

+ Then I am proposed to create an Additional Software persistence for the "sslh" package

I find the article ("an") before "Additional Software persistence" confusing/superfluous. Shouldn't we just drop it?

Rephrased in 740d8b8d542aa59882a2562d3c2dfac92016af9c.

+ And I create the Additional Software persistence

Same again about the article ("the"). Also I would prefer "enable" over "create". I also think the name is too generic given that it will only work in a very specific context.

Rephrased in 740d8b8d542aa59882a2562d3c2dfac92016af9c.

I propose: "I accept to enable Additional Software persistence".

Hmm, in this step, we do much more than accepting: we also go through the entire persistence wizard.

+ And the package "cowsay" installed version is "3.03+dfsg2-1" after Additional Software has been started

Nitpick: this doesn't sound grammatically correct. Proposal: the installed version of package "cowsay" is ...

Absolutely. I've spotted this earlier but given how the dev + review feedback loop was going, diving this far would have made it unbearable at least for me (and possibly for bertagaz as well). And when I took over I was focused on fixing the blockers. Anyway, it's cheap to fix ⇒ 399f80f2179e81409c3646139fab03f0f76a57ed.

+ # Prevent the "Warning: virtual machine detected!" notification from racing
+ # with the one we'll be interacting with below.
+ And I disable the tails-virt-notify-user.service user unit

Interesting! Maybe we should always do this to reduce notification races/issues, except for a dedicated scenario testing that it is displayed? What do you think? Would be a new ticket, obviously.

I think the whole handling of notifications in our test suite is problematic (#14596#note-89), in particular the fact we generally don't check that a user would actually see the notification we're looking for. At this point I have no idea if applying this hack globally would solve serious problems (I don't think we have any other similar case yet). If someone is going to spend time in this area, I would suggest first taking a step back and checking what the biggest problems are.

+ $vm.execute("ls /live/persistence/TailsData_unlocked/apt/cache/#{package}_*.deb").success?
+ $vm.execute("ls /live/persistence/TailsData_unlocked/apt/lists/*_Packages").success?

These are essentially NOOP:s -- $vm.execute will happily return false without throwing exceptions, and we don't act upon the returned values!

Good catch!

Instead of caring about the return value, either use VM::file_exist? and make it support globs, or just use $vm.execute_successfully (and drop .success? since we only care about exceptions).

I did the cheapest thing: 886fc3b52337e82ead2b8dd26b86c6e6da21a2f5

  1. Tell the upgrade service check step not to run
    $vm.execute("touch #{ASP_STATE_DIR}/doomed_to_fail")

Let's use $vm.execute vs $vm.execute_successfully consistently, i.e. use $vm.execute only when failure is expected/handled.
Actually, I see quite a few similar instances of $vm.execute added/remaining in apt.rb -- let's fix them too (at least the ones added)!

Yep: 79ea3b26777f9f0de457b3b4310fac46e951c3f5

Please drop the assignment of gedit since it isn't used.

c19a71a94a40e5676de2b782295428f99bd5ec4f

+def check_for_installation(package)

[...]

+def check_for_removal(package)

Basing the names on "check" is too vague. What about wait_for_... or something more specific?

Fully agreed ⇒ dd27a9201c1a2e8bd5f9a7cdccc26dea9eaf0cad. (I had given up on this after bertagaz and I misunderstood each other, see "predicate" above.)

IMHO a here-document would look slightly prettier:

Done.

I've not tested any of this yet and won't be able to do so right away, so I've pushed to Jenkins and we'll see.

#101 Updated by intrigeri 3 months ago

  • QA Check changed from Dev Needed to Ready for QA

#102 Updated by intrigeri 3 months ago

  • Assignee changed from intrigeri to anonym

Passed the test suite locally.

#103 Updated by intrigeri 3 months ago

@anonym I should have mentioned you. Please review again (and, who knows, possibly merge :)

#104 Updated by intrigeri 2 months ago

  • Target version changed from Tails_3.13 to Tails_3.14

I've taken the liberty to merge the current branch into stable and devel because:

  • This branch has already been reviewed, at various stages, quite a few times.
  • I've fixed all the issues pointed out by anonym's review.
  • The corresponding feature was release more than 6 months ago and it's way past time we have automated tests for it.

⇒ I figured that the value of having these tests as part of our QA processes outweighs the lack of a final review.

Still, anonym, I would love to see you review my last set of fixes (i.e. the changes in 6bc35d7ef9..2f61ecf41b that happened after your initial review) when you have time :)

#105 Updated by intrigeri 2 months ago

  • Status changed from In Progress to Fix committed

#106 Updated by intrigeri 2 months ago

  • Blocks deleted (Bug #16461: Backup persistence.conf before modifying it in t-p-s)

#107 Updated by intrigeri 2 months ago

  • Status changed from Fix committed to In Progress

#108 Updated by anonym 2 months ago

  • Status changed from In Progress to Fix committed
  • % Done changed from 70 to 100
  • QA Check changed from Ready for QA to Pass

intrigeri wrote:

So far I haven't tested this branch (mostly because it is awkward for me due to #15460)

anonym, FWIW I've worked around this problem recently by running the test suite in a dedicated Stretch VM and I feel somewhat dumb to have waited for months before I did that: setting this up didn't take less time because I've waited, and the more I've waited, the more I've suffered from my inability to run the test suite easily, and the more time I've wasted on doing it differently.

Yeah, that is what I do as well, but the performance is not great. I'll ask you off-list how your nested setup looks, which perhaps could help me.

Still, anonym, I would love to see you review my last set of fixes (i.e. the changes in 6bc35d7ef9..2f61ecf41b that happened after your initial review) when you have time :)

Looks perfect to me!

#109 Updated by anonym 2 months ago

  • Assignee deleted (anonym)

#110 Updated by intrigeri 2 months ago

  • Target version changed from Tails_3.14 to Tails_3.13

#111 Updated by CyrilBrulebois 2 months ago

  • Status changed from Fix committed to Resolved

Also available in: Atom PDF