Project

General

Profile

Bug #10116

Feature #8539: Make the test suite robust enough to be run as part of a CI setup

"Watching a WebM video" test is fragile

Added by bertagaz almost 4 years ago. Updated over 3 years ago.

Status:
Resolved
Priority:
High
Assignee:
-
Category:
Test suite
Target version:
Start date:
08/28/2015
Due date:
% Done:

100%

Feature Branch:
kytv:test/10116-make-webm-video-test-more-robust
Type of work:
Code
Blueprint:
Starter:
No
Affected tool:

Description

I've seen it failing quite often, on my test machine and on Lizard.

It fails at the And I click the blocked video icon step, and the screenshot shows that the browser prints a "Connection timeout" to the https://webm.html5.org/test.webm webpage.

Maybe providing such a webpage locally would help to prevent network troubles, with the same trick used for the "The Unsafe Browser can access the LAN" scenario?


Related issues

Blocks Tails - Feature #9653: Optimize retrying to connect to IRC servers in testsuite using waitAny/existsAny Resolved 06/29/2015

Associated revisions

Revision 26d902fe (diff)
Added by kytv almost 4 years ago

Force new Tor circuit and reload web site on timeout

Will-fix: #10116

Revision c21d3f40
Added by anonym over 3 years ago

Merge remote-tracking branch 'kytv/test/10116-make-webm-video-test-more-robust' into stable

Fix-committed: #10116

Conflicts:
features/step_definitions/pidgin.rb

History

#1 Updated by bertagaz almost 4 years ago

  • Parent task set to #8539

#2 Updated by intrigeri almost 4 years ago

  • Status changed from New to Confirmed
  • Priority changed from Normal to High

I see that sometimes as well. Bumping the timeout helped in some cases, but it doesn't when the remote server is not responding.

#3 Updated by intrigeri almost 4 years ago

bertagaz wrote:

Maybe providing such a webpage locally would help to prevent network troubles, with the same trick used for the "The Unsafe Browser can access the LAN" scenario?

Here we're speaking of accessing a web server over Tor, so running it locally would require running a webserver + tor + a hidden service, and waiting for the latter to show up. IMO it's too complicated. I think we should instead host such data files on our infrastructure, or perhaps in a Git repo at immerda (that test.webm file is merely 5.4K large, so it doesn't seem crazy to me to stuff it directly in our main Git repo). If the latter works (not sure how cgit handles streaming), then let's go for it. Otherwise, please reassign to me and I'll put it somewhere on lizard.

#4 Updated by anonym almost 4 years ago

intrigeri wrote:

Here we're speaking of accessing a web server over Tor, so running it locally would require running a webserver + tor + a hidden service, and waiting for the latter to show up. IMO it's too complicated.

A smooth solution for this situation is what we'll need for #9519 => #9520. In fact, problems like this ticket is the main rationale for that whole full network simulation idea, right? At least how I imagine it, we'll simulate a "full" Internet, i.e. both clearnet end-points (e.g. web servers that Tails wants to reach) and the Tor network, so hidden services will not be a necessary hack to self-host the services we want to test. This is of course even more complicated than the hidden service hack you imagined.

I think we should instead host such data files on our infrastructure, or perhaps in a Git repo at immerda (that test.webm file is merely 5.4K large, so it doesn't seem crazy to me to stuff it directly in our main Git repo). If the latter works (not sure how cgit handles streaming), then let's go for it. Otherwise, please reassign to me and I'll put it somewhere on lizard.

This may be a good, easy temporary solution for now, though. I say we host it in features/misc_files in Tails' Git, and access it through the GitWeb interface in the test.

#5 Updated by bertagaz almost 4 years ago

anonym wrote:

intrigeri wrote:

Here we're speaking of accessing a web server over Tor, so running it locally would require running a webserver + tor + a hidden service, and waiting for the latter to show up. IMO it's too complicated.

Ooch, right!

A smooth solution for this situation is what we'll need for #9519 => #9520. In fact, problems like this ticket is the main rationale for that whole full network simulation idea, right? At least how I imagine it, we'll simulate a "full" Internet, i.e. both clearnet end-points (e.g. web servers that Tails wants to reach) and the Tor network, so hidden services will not be a necessary hack to self-host the services we want to test. This is of course even more complicated than the hidden service hack you imagined.

That'd be nice, but something not really possible right now I guess. :)

I think we should instead host such data files on our infrastructure, or perhaps in a Git repo at immerda (that test.webm file is merely 5.4K large, so it doesn't seem crazy to me to stuff it directly in our main Git repo). If the latter works (not sure how cgit handles streaming), then let's go for it. Otherwise, please reassign to me and I'll put it somewhere on lizard.

This may be a good, easy temporary solution for now, though. I say we host it in features/misc_files in Tails' Git, and access it through the GitWeb interface in the test.

+1

#6 Updated by intrigeri almost 4 years ago

Agreed with both of you.

#7 Updated by kytv almost 4 years ago

  • Assignee changed from kytv to anonym
  • QA Check changed from Dev Needed to Info Needed

Wouldn't this still potentially be a problem if 'we' hosted it, like the whois and wget tests were? It seems to me that this could be similar to the wget and whois (and maybe keyserver) problems in the test suite which were resolved by forcing new Tor circuits and that it's likely not the remote site having problems, but the connection over Tor.

Shall we try using a new Tor circuit then reloading the page? If self-hosting the webm file would be desired we could do that instead (or do that too).

#8 Updated by anonym almost 4 years ago

  • Assignee changed from anonym to kytv
  • QA Check changed from Info Needed to Dev Needed
  • Feature Branch set to test/10116-make-webm-video-test-more-robust

kytv wrote:

Wouldn't this still potentially be a problem if 'we' hosted it, like the whois and wget tests were? It seems to me that this could be similar to the wget and whois (and maybe keyserver) problems in the test suite which were resolved by forcing new Tor circuits and that it's likely not the remote site having problems, but the connection over Tor.

You are probably right!

Shall we try using a new Tor circuit then reloading the page? If self-hosting the webm file would be desired we could do that instead (or do that too).

Yes, let's try that. I've pushed some refactoring work of the "retry with a new Tor circuit" stuff that I hope will help you along the way. Let me know what you think! (Maybe it just complicates the code?) Any way, it's in the test/10116-make-webm-video-test-more-robust branch. Feel free to base your branch of it if you like it!

(Sorry for sneaking in refactoring work in various branches. I'm just extremely worried that the test suite is getting into a bad shape since it's developing so rapidly. I hope this practice is ok! :))

#9 Updated by kytv almost 4 years ago

#10 Updated by kytv almost 4 years ago

  • Status changed from Confirmed to In Progress
  • Assignee changed from kytv to anonym
  • % Done changed from 0 to 40
  • QA Check changed from Dev Needed to Ready for QA
  • Feature Branch changed from test/10116-make-webm-video-test-more-robust to kytv:10116-make-webm-video-test-more-robust

I quite like the changes, cheers. :)

#11 Updated by anonym almost 4 years ago

  • Feature Branch changed from kytv:10116-make-webm-video-test-more-robust to kytv:test/10116-make-webm-video-test-more-robust

#12 Updated by anonym almost 4 years ago

  • Assignee changed from anonym to kytv
  • QA Check changed from Ready for QA to Info Needed

Regarding this part (and the similar instance later in the commit):

     And I open the address "http://www.terrillthompson.com/tests/html5-audio.html" in the Tor Browser
+    And the website finished loading

So, there are a lot more occurrences of I open the address ... step. Have you considered that that step might benefit from the website finished loading too? Maybe they could be merged? Also, wouldn't a retry_tor in the the website finished loading step fix all Tor-related Tor Browser issues in that step? Something like this:

--- a/features/step_definitions/common_steps.rb
+++ b/features/step_definitions/common_steps.rb
@@ -653,6 +653,14 @@ When /^I open the address "([^"]*)" in the (.*)$/ do |address, browser|
   @screen.click(info[:address_bar_image])
   sleep 0.5
   @screen.type(address + Sikuli::Key.ENTER)
+  if browser == "Tor Browser" 
+    recovery_on_failure = Proc.new do
+      reload_webpage('Tor Browser')
+    end
+    retry_tor(recovery_on_failure) do
+      @screen.wait('BrowserReloadButton.png', 120)
+    end
+  end
 end

Warning: not tested, not very well thought through.

However, I wonder if there's a possibility for a race here (with or without my change): isn't it conceivable that Sikuli is so fast at scanning the screen (in fact I think it takes a screenshot at the point it starts scanning the screen) that the browser GUI hasn't updated and removed the reload button yet? Introducing another sleep isn't what I'm looking for... hmm. What do you think?

Next:

+def reload_webpage(browser)
+  @vm.focus_window(browser)
+  STDERR.puts "Reloading web page in the #{browser}" if $config['DEBUG']
+  @screen.wait_and_click('BrowserReloadButton.png', 10)
+  step "the website finished loading" 
+end

So what would happen if the reload button doesn't appear here (which seems likely if the page loading is timing out)? I think this would be better:
def reload_webpage(browser)
  @vm.focus_window(browser)
  STDERR.puts "Reloading web page in the #{browser}" if $config['DEBUG']
  @screen.type(Sikuli::Key.ESC)
  @screen.wait_and_click('BrowserReloadButton.png', 10)
  step "the website finished loading" 
end
If the page is loading, Escape will stop so the reload button appears. If it's not loading, Escape won't do anything.

#13 Updated by kytv almost 4 years ago

anonym wrote:

Regarding this part (and the similar instance later in the commit):

[...]

So, there are a lot more occurrences of I open the address ... step. Have you considered that that step might benefit from the website finished loading too?

I had indeed but wasn't sure if there'd be potential problems in doing so. At this point I don't see how it would be an issue.

Maybe they could be merged? Also, wouldn't a retry_tor in the the website finished loading step fix all Tor-related Tor Browser issues in that step? Something like this:
[...]
Warning: not tested, not very well thought through.

However, I wonder if there's a possibility for a race here (with or without my change): isn't it conceivable that Sikuli is so fast at scanning the screen (in fact I think it takes a screenshot at the point it starts scanning the screen) that the browser GUI hasn't updated and removed the reload button yet? Introducing another sleep isn't what I'm looking for... hmm. What do you think?

Next:
[...]
So what would happen if the reload button doesn't appear here (which seems likely if the page loading is timing out)? I think this would be better:
[...]

That could be an improvement, indeed. Unfortunately, however, it seems that if the loading hasn't progressed to a certain point (how to determine $point I don't know) that after pressing ESC the URL bar will be empty and reloading isn't possible. Maybe we could/should bump the timeout so that we'll see an error in Firefox/TBB with an active/clickable retry button?

Still testing…

#14 Updated by anonym almost 4 years ago

kytv wrote:

anonym wrote:

Next:
[...]
So what would happen if the reload button doesn't appear here (which seems likely if the page loading is timing out)? I think this would be better:
[...]

That could be an improvement, indeed. Unfortunately, however, it seems that if the loading hasn't progressed to a certain point (how to determine $point I don't know) that after pressing ESC the URL bar will be empty and reloading isn't possible. Maybe we could/should bump the timeout so that we'll see an error in Firefox/TBB with an active/clickable retry button?

Well, with the changes I've suggested for the I open the address ... step, then the reload_webpage() code (and its dedicated step) won't be needed anywhere else (now at least), so the reloading can be done in the I open the address ... step where we have the address, so the recovery_on_failure block would be: press Escape, wait for the reload button to disappear, and then redo the "normal" (open address) part of the step:

--- a/features/step_definitions/common_steps.rb
+++ b/features/step_definitions/common_steps.rb
@@ -650,9 +650,22 @@ When /^I open the address "([^"]*)" in the (.*)$/ do |address, browser|
   next if @skip_steps_while_restoring_background
   step "I open a new tab in the #{browser}" 
   info = xul_application_info(browser)
-  @screen.click(info[:address_bar_image])
-  sleep 0.5
-  @screen.type(address + Sikuli::Key.ENTER)
+  open_address = Proc.new do
+    @screen.click(info[:address_bar_image])
+    sleep 0.5
+    @screen.type(address + Sikuli::Key.ENTER)
+  end
+  open_address.call
+  if browser == "Tor Browser" 
+    recovery_on_failure = Proc.new do
+      @screen.type(Sikuli::Key.ESC)
+      @screen.waitVanish('BrowserReloadButton.png', 3)
+      open_address.call
+    end
+    retry_tor(recovery_on_failure) do
+      @screen.wait('BrowserReloadButton.png', 120)
+    end
+  end
 end

Untested!

#15 Updated by intrigeri almost 4 years ago

Wouldn't this still potentially be a problem if 'we' hosted it, like the whois tests. It seems to me that this could be similar to the wget and whois (and maybe keyserver) problems in the test suite which were resolved by forcing new Tor circuits and that it's likely not the remote site having problems, but the connection over Tor.

Shall we try using a new Tor circuit then reloading the page?

Why not. I'd like to make sure that we're not going to hit an already overloaded web server, fighting our way through its queues with other visitors until we get the page we want. Ask its admins if they can take it?

#16 Updated by anonym over 3 years ago

intrigeri wrote:

Wouldn't this still potentially be a problem if 'we' hosted it, like the whois tests. It seems to me that this could be similar to the wget and whois (and maybe keyserver) problems in the test suite which were resolved by forcing new Tor circuits and that it's likely not the remote site having problems, but the connection over Tor.

Shall we try using a new Tor circuit then reloading the page?

Why not. I'd like to make sure that we're not going to hit an already overloaded web server, fighting our way through its queues with other visitors until we get the page we want. Ask its admins if they can take it?

I do not think that's anything to worry about. I mean, were talking about "bursts" of 1 request per minute, possibly times four if all isotesters happen to run the affected scenarios at the same time (right, if some of us are running it, it could be more).

#17 Updated by kytv over 3 years ago

  • Assignee changed from kytv to anonym
  • QA Check changed from Info Needed to Ready for QA

I'm confident that what's in the branch works as intended (and I'm anxiously awaiting retry_tor being useable elsewhere :)

#18 Updated by kytv over 3 years ago

  • Blocks Feature #9653: Optimize retrying to connect to IRC servers in testsuite using waitAny/existsAny added

#19 Updated by kytv over 3 years ago

#20 Updated by anonym over 3 years ago

  • Status changed from In Progress to Fix committed
  • % Done changed from 40 to 100

#21 Updated by anonym over 3 years ago

  • Assignee deleted (anonym)
  • QA Check changed from Ready for QA to Pass

#22 Updated by intrigeri over 3 years ago

#23 Updated by bertagaz over 3 years ago

  • Status changed from Fix committed to Resolved
  • QA Check deleted (Pass)

Also available in: Atom PDF