Bug #15460

Test suite broken with Java 9+

Added by intrigeri over 1 year ago. Updated 10 days ago.

Status:
In Progress
Priority:
High
Assignee:
Category:
Test suite
Target version:
Start date:
Due date:
% Done:

100%

Estimated time:
14.00 h
Feature Branch:
Type of work:
Research
Blueprint:
Starter:
Affected tool:

Description

Sikuli is not in Buster/testing/sid anymore, because of at least:

OTOH, as mentioned by anonym on #15953#note-10: "It's worth noting that we don't use much of Sikuli. As long as we have an image matching primitive that returns the coordinates of the match, implementing the various find(), wait() etc we need is pretty easy. And something like xdotool (preferably something that will work in Wayland... if that is even possible?) can do the rest (mouse and keyboard interaction)".

And indeed, for example OpenQA supports Wayland, so even if we stick to our current testing framework, we could possibly steal some ideas and code from there wrt. the image matching primitives and interaction.

ruby-rjb_1.5.5-2_amd64.build (18 KB) lamby, 05/03/2018 03:02 AM

opencv-match.py (1003 Bytes) anonym, 10/02/2019 11:00 AM


Subtasks

Bug #17293: Install python-opencv and imagemagick on isotesters Resolved


Related issues

Blocks Tails - Bug #15953: Make our test suite survive changes in the surrounding environment Confirmed 09/14/2018
Blocks Tails - Feature #16209: Core work: Foundations Team Confirmed
Blocks Tails - Feature #15450: Create LUKS2 persistent volumes by default In Progress 03/23/2018
Blocks Tails - Bug #17031: Test suite's otr-bot.py has obsolete dependencies Confirmed
Blocks Tails - Bug #15831: Use qemu-xhci for TailsToaster Confirmed 08/22/2018
Blocks Tails - Bug #17308: Uninstall libsikulixapi-java on isotesters Confirmed

Associated revisions

Revision 8dad6f60 (diff)
Added by anonym 13 days ago

Test suite: add OpenCV module (to replace Sikuli).

Basically it is a wrapper around OpenCV's matchTemplate() which can be
used for image matching, just like Sikuli's find() etc. Since Ruby
doesn't have any (working) OpenCV bindings we resort to calling a
Python script.

Currently it's not used, but stay tuned!

Refs: #15460

Revision 6906d757 (diff)
Added by anonym 13 days ago

Test suite: add screenshot-method based on ImageMagick.

We are about to remove Sikuli, so we need an alternative.

Refs: #15460

Revision e09ee732 (diff)
Added by anonym 12 days ago

Test suite: replace Sikuli with mix of xdotool and OpenCV.

Removed functionality: --retry-find --fuzzy-image-matching

Refs: #15460

Revision dfb21dab (diff)
Added by anonym 12 days ago

Test suite: bump some images.

After bumping just these (objectively outdated) images I can run a
large part of the test suite, indicating that OpenCV and Sikuli
perform very similarly. Yay!

Refs: #15460

Revision e4e3cbed (diff)
Added by anonym 12 days ago

Test suite: add OpenCV module (to replace Sikuli).

Basically it is a wrapper around OpenCV's matchTemplate() which can be
used for image matching, just like Sikuli's find() etc. Since Ruby
doesn't have any (working) OpenCV bindings we resort to calling a
Python script.

Currently it's not used, but stay tuned!

Refs: #15460

Revision 1e74f946 (diff)
Added by anonym 12 days ago

Test suite: add screenshot-method based on ImageMagick.

We are about to remove Sikuli, so we need an alternative.

Refs: #15460

Revision 15c71004 (diff)
Added by anonym 12 days ago

Test suite: replace Sikuli with mix of xdotool and OpenCV.

Removed functionality: --retry-find --fuzzy-image-matching

Refs: #15460

Revision 9d9549d3 (diff)
Added by anonym 12 days ago

Test suite: bump some images.

After bumping just these (objectively outdated) images I can run a
large part of the test suite, indicating that OpenCV and Sikuli
perform very similarly. Yay!

Refs: #15460

Revision 2ddc8b2c (diff)
Added by anonym 4 days ago

Test suite: add OpenCV module (to replace Sikuli).

Basically it is a wrapper around OpenCV's matchTemplate() which can be
used for image matching, just like Sikuli's find() etc. Since Ruby
doesn't have any (working) OpenCV bindings we resort to calling a
Python script.

Currently it's not used, but stay tuned!

Refs: #15460

Revision 50119f8a (diff)
Added by anonym 4 days ago

Test suite: add screenshot-method based on ImageMagick.

We are about to remove Sikuli, so we need an alternative.

Refs: #15460

Revision ecd5e1e5 (diff)
Added by anonym 4 days ago

Test suite: replace Sikuli with mix of xdotool and OpenCV.

Removed functionality: --retry-find --fuzzy-image-matching

Refs: #15460

Revision 3fba9082 (diff)
Added by anonym 4 days ago

Test suite: bump some images.

After bumping just these (objectively outdated) images I can run a
large part of the test suite, indicating that OpenCV and Sikuli
perform very similarly. Yay!

Refs: #15460

Revision 9642cc62 (diff)
Added by anonym 4 days ago

Test suite: add OpenCV module (to replace Sikuli).

Basically it is a wrapper around OpenCV's matchTemplate() which can be
used for image matching, just like Sikuli's find() etc. Since Ruby
doesn't have any (working) OpenCV bindings we resort to calling a
Python script.

Currently it's not used, but stay tuned!

Refs: #15460

Revision f42c969b (diff)
Added by anonym 4 days ago

Test suite: add screenshot-method based on ImageMagick.

We are about to remove Sikuli, so we need an alternative.

Refs: #15460

Revision e85f5263 (diff)
Added by anonym 4 days ago

Test suite: replace Sikuli with mix of xdotool and OpenCV.

Removed functionality: --retry-find --fuzzy-image-matching

Refs: #15460

Revision d2a89281 (diff)
Added by anonym 4 days ago

Test suite: bump some images.

After bumping just these (objectively outdated) images I can run a
large part of the test suite, indicating that OpenCV and Sikuli
perform very similarly. Yay!

Refs: #15460

History

#1 Updated by anonym over 1 year ago

  • Status changed from Confirmed to In Progress
  • % Done changed from 0 to 10

intrigeri wrote:

At least:

So this one we have a workaround for (copy-pasted from the Debian bug):

    mkdir -p /usr/lib/jvm/jre/lib/amd64/server/
    ln -s /usr/lib/jvm/java-9-openjdk-amd64/lib/server/libjvm.so \
          /usr/lib/jvm/jre/lib/amd64/server/libjvm.so

This one requires packaging SikuliX 1.1.2 for Debian, so using Java 9 is still out of the question...


... so let's just use Java 8 for now:

    # When Java 8 is dropped from unstable, set to "stable" instead
    DIST=unstable
    sudo apt install openjdk-8-j{re,dk}{,-headless}/${DIST}
    sudo update-java-alternatives -s java-1.8.0-openjdk-amd64

#3 Updated by intrigeri over 1 year ago

Actually, regarding rjb at least: this could be a great opportunity for you (anonym) to get involved a bit in Debian: you're one of the few people who understand anything about this Java+Ruby thing and it might be more pleasurable for you to submit a fix than to find someone else to do it.

#4 Updated by intrigeri over 1 year ago

#5 Updated by intrigeri over 1 year ago

  • Assignee deleted (anonym)

#6 Updated by intrigeri over 1 year ago

  • Assignee set to lamby
  • Target version changed from Tails_3.7 to Tails_3.8

Wrt. rjb: Lunar won't mind help and will not have time to work on this any time soon => NMU / team upload welcome.

#7 Updated by intrigeri over 1 year ago

  • Estimated time set to 4.00 h

#8 Updated by lamby over 1 year ago

  • Subject changed from Test suite is broken with Java 9 to ruby-rjb test suite broken with Java 9

I've fixed the FTBFS and testsuite failures here:

https://bugs.debian.org/874146#28

What are the next steps here?

#9 Updated by lamby over 1 year ago

lamby wrote:

I've fixed the FTBFS and testsuite failures here:

https://bugs.debian.org/874146#28

What are the next steps here?

Should I be pushing this fix in Debian for an easier "merge" or shall I upload to Tails? :)

#10 Updated by lamby over 1 year ago

Just let me know :)

#11 Updated by intrigeri over 1 year ago

I've fixed the FTBFS and testsuite failures here:

\o/

What are the next steps here?

Announce your intent to NMU, wait a bit (I'll ask Lunar if he's fine with it) and then upload :)

#12 Updated by intrigeri over 1 year ago

Should I be pushing this fix in Debian for an easier "merge" or shall I upload to Tails? :)

This package is only required for Tails developers who run the test suite on their sid system, so uploading to Debian makes much more sense here than uploading to a Tails-only repo.

#13 Updated by intrigeri over 1 year ago

Just let me know :)

Pro tip: when asking someone's input here, assign to them and set "QA Check" to "Info Needed" :)

#14 Updated by lamby over 1 year ago

Announce your intent to NMU

Thanks. Done here: https://bugs.debian.org/874146#35

Pro tip: when asking someone's input here, assign to them and set "QA Check" to "Info Needed" :)

Thanks! I guess I didn't know whom to assign to really, but I should have assigned to someone at the very least! :p

#15 Updated by intrigeri over 1 year ago

I guess I didn't know whom to assign to really, but I should have assigned to someone at the very least! :p

Generally, for Foundations Team work, unless someone else (most likely anonym) is explicitly your team-mate/mentor, assigning to me is best.

#16 Updated by lamby over 1 year ago

  • Status changed from In Progress to 11
  • Assignee changed from lamby to intrigeri

ruby-rjb 1.5.5-2 uploaded that fixes this :)

#17 Updated by intrigeri over 1 year ago

  • Subject changed from ruby-rjb test suite broken with Java 9 to Test suite broken with Java 9
  • Status changed from 11 to In Progress
  • % Done changed from 10 to 20
  • Type of work changed from Code to Debian

Thanks lamby!

So half of the problem (the part you committed to work on) is presumably fixed.

Now, two things.

First, I'm a bit confused. With Java 9 and this updated ruby-rjb (up-to-date sid) I still see the "RuntimeError: Constants DL and Fiddle is not defined." error that we saw during the FTBFS on https://bugs.debian.org/874146:

    $ ./run_test_suite --view  --iso ~/ftp/iso/tails/tails-amd64-3.6.2/tails-amd64-3.6.2.iso
    Virtual X framebuffer started on display :1
    VNC server running on: localhost:5900
    Constants DL and Fiddle is not defined. (RuntimeError)
    /usr/lib/ruby/vendor_ruby/rjb.rb:79:in `load'
    /usr/lib/ruby/vendor_ruby/rjb.rb:79:in `load'
    /srv/tails/git/features/support/helpers/sikuli_helper.rb:5:in `<top (required)>'
    /usr/lib/ruby/vendor_ruby/cucumber/rb_support/rb_language.rb:96:in `load'
    /usr/lib/ruby/vendor_ruby/cucumber/rb_support/rb_language.rb:96:in `load_code_file'
    /usr/lib/ruby/vendor_ruby/cucumber/runtime/support_code.rb:142:in `load_file'
    /usr/lib/ruby/vendor_ruby/cucumber/runtime/support_code.rb:84:in `block in load_files!'
    /usr/lib/ruby/vendor_ruby/cucumber/runtime/support_code.rb:83:in `each'
    /usr/lib/ruby/vendor_ruby/cucumber/runtime/support_code.rb:83:in `load_files!'
    /usr/lib/ruby/vendor_ruby/cucumber/runtime.rb:253:in `load_step_definitions'
    /usr/lib/ruby/vendor_ruby/cucumber/runtime.rb:61:in `run!'
    /usr/lib/ruby/vendor_ruby/cucumber/cli/main.rb:32:in `execute!'
    /bin/cucumber:7:in `<main>'

And indeed, the autopkgtests fail in just the same way: https://ci.debian.net/data/autopkgtest/testing/amd64/r/ruby-rjb/225136/log.gz. So I wonder what's going on: it seems that lamby's patch fixed the problem for the package build but not at runtime.

But if I export JAVA_HOME=/usr/lib/jvm/java-9-openjdk-amd64 then rjb does load (confirmed by adding a few puts) but then SikuliX aborts the process:

    $ ./run_test_suite --view  --iso ~/ftp/iso/tails/tails-amd64-3.6.2/tails-amd64-3.6.2.iso
    Virtual X framebuffer started on display :1
    VNC server running on: localhost:5900
    [error] RunTimeINIT:  *** terminating: Java version must be 1.7 or later!

… which is the symptom of the second problem I'm summing up below.

So it looks like rjb now needs JAVA_HOME to be set, otherwise it fails to load; OK, why not, we can do this (actually we used to). I suspect the autopkgtests should do the same.

Secondly, the remaining problem is that SikuliX 1.1.1 does not support Java 9. Apparently SikuliX 1.1.2 should support it: the Changelog entry for "version 1.1.2 --- final per March 10th 2018" reads "most JAVA 9 problems are fixed" => I've reported https://bugs.debian.org/897215. The next step here is to test whether SikuliX 1.1.2 fixes this problem and, if it does, get it into Debian… which might be non-trivial given the process seems quite involved and upstream has switched to another Git repo again. Let's give the maintainers a couple of weeks to answer, but if they don't, I'll check with lamby whether this bonus task fits into the 4h budget (and if not, how much more is needed).

In passing, I've tried anonym's workaround (#15460#note-1) + export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64 + sudo ln -s /usr/lib/jvm/java-8-openjdk-amd64/jre/lib/amd64/server /usr/lib/jvm/java-8-openjdk-amd64/jre/lib/server (to cope with the change lamby uploaded), and I see the "Constants DL and Fiddle is not defined. (RuntimeError)" error. So it looks like our workaround is not working anymore.

#18 Updated by lamby over 1 year ago

Since I fixed the FTBFS, the toolchain has changed so the build is now failing:

https://gist.github.com/lamby/9bd4d19faeaad9eb17aef98a0eb5bd0f/raw (attached as ruby-rjb_1.5.5-2_amd64.build)

I just had a quick poke (looks like an OpenJDK9 vs OpenJDK10 issue wrt http://openjdk.java.net/jeps/313) but I'm going to be hitting my 4h on this issue alas. Go longer? :)

#19 Updated by lamby over 1 year ago

(I've filed the FTBFS as https://bugs.debian.org/897664)

#20 Updated by intrigeri over 1 year ago

  • Subject changed from Test suite broken with Java 9 to Test suite broken with Java 9+
  • QA Check set to Info Needed

#21 Updated by intrigeri over 1 year ago

  • Assignee changed from intrigeri to lamby
  • Estimated time changed from 4.00 h to 6.00 h
  • QA Check deleted (Info Needed)

Can you please fix the ruby-rjb FTBFS in Debian (and submit a PR upstream)? I guess that 2 hours should be enough to replace the single call to javah. BTW, older build logs might be useful: there have been warnings in there for a while telling us that javah was deprecated and suggesting how to replace it :)

#22 Updated by lamby over 1 year ago

  • Assignee changed from lamby to intrigeri

Sure thing!

#23 Updated by lamby over 1 year ago

Fixed in Debian, uploaded as 1.5.5-2

I meant 1.5.5-3 here, sorry

#24 Updated by intrigeri over 1 year ago

Great! So we're good wrt. ruby-rjb. Chris, please tell me how much of the 6 hours you've used and I'll validate your work in our accounting :)

The next steps (SikuliX) are documented on #15460#note-17. Let's wait a couple more weeks (until our next FT meeting) to give the maintainers the chance to address https://bugs.debian.org/897215 and then we'll see.

#25 Updated by lamby over 1 year ago

please tell me how much of the 6 hours you've used and I'll validate your work in our accounting :)

Any objection if we do that in one "batch" with the other tickets to keep our overheads low? :)

#26 Updated by intrigeri over 1 year ago

please tell me how much of the 6 hours you've used and I'll validate your work in our accounting :)

Any objection if we do that in one "batch" with the other tickets to keep our overheads low? :)

OK, let's try!

#27 Updated by lamby over 1 year ago

intrigeri wrote:

The next steps (SikuliX) are documented on #15460#note-17. Let's wait a couple more weeks (until our next FT meeting) to give the maintainers the chance to address https://bugs.debian.org/897215 and then we'll see.

I have:

  • Asked the Java maintainers whether they would like me to update it.

#28 Updated by lamby over 1 year ago

Noting RM bug #897333.

#29 Updated by intrigeri over 1 year ago

WIP SikuliX 1.1.2 packaging (broken at runtime according to the former maintainer): https://salsa.debian.org/java-team/sikuli

#30 Updated by intrigeri over 1 year ago

  • Priority changed from Normal to Elevated

#31 Updated by intrigeri over 1 year ago

  • Assignee changed from intrigeri to lamby
  • QA Check set to Info Needed

Dear Chris,

sorry I forgot to raise this topic at the end of our FT meeting (while I had mentioned it earlier). I'd like us to assess the SikuliX situation so we can make a decision for Tails (and maybe for Debian). To start with I'd like to have an idea of:

  • How far is the current WIP packaging of 1.1.2 from working on testing/sid (according to it "is completely broken at runtime") in an X.Org environment? Maybe 1.1.3 fixes some of that?
  • What would the minimal maintenance cost of SikuliX in Debian look like? I guess that importing upstream 1.1.3 could give an idea of the mess is talking of.
  • What's the deal with Wayland? Note that Tails does not use it yet: #12213. says that he has "tested version 1.1.1 currently in unstable and testing and it is broken as well, probably because of the migration to Wayland". But I'm running GNOME on Wayland on sid and SikuliX 1.1.1 worked just fine to run the Tails test suite before Java 9 became the default; I suspect that mixed up "SikuliX 1.1.1 does not support Java 9" with "Sikuli is broken with Wayland" or similar. But I've not tested Sikuli to exercise an OS that runs Wayland so I dunno, it may very well be that is correct! It would be interesting to know if we're using any part of Sikuli that relies on X.Org and will break on Wayland. I'm not even sure how exactly we wire Sikuli with the VM under test.

Upstream publishes no Git tag so I think I start to understand what Gilles is talking of…

Once we have this info it'll be easier to decide whether we take over maintenance of SikuliX in Debian, or we start maintaining it for Tails only (in case the previous option is too costly/crazy and we can live with a cheap crappy version for Tails), or we research alternatives to Sikuli.

Are you interested to look into this? Do you think you can answer these questions in 8 hours? (I would say just try, report back shortly before running out of time, and how much progress you'll have made in 8h will already teach us something.) IIRC you did not want more work in June, so: can you do this in July? (Rationale: I'd like us to make a decision in time for the Buster freeze :)

#32 Updated by lamby over 1 year ago

  • Assignee changed from lamby to intrigeri

ACK all these difficult questions!

Are you interested to look into this? Do you think you can answer
these questions in 8 hours?

As I'm sure you can understand at this point, I simply can't tell
without jumping into packaging 1.1.3. :) Will indeed just try and
report back if/when running out of time. I should find time for this
in June.

My gut tells me we should look into alternatives given all these
problems we know about.

(pinging bug back and forth for visibility)

#33 Updated by lamby over 1 year ago

  • Assignee changed from intrigeri to lamby

#34 Updated by intrigeri over 1 year ago

  • Estimated time changed from 6.00 h to 14.00 h
  • QA Check deleted (Info Needed)

Great! (updating total budget => 6 + 8 = 14)

#35 Updated by intrigeri over 1 year ago

  • Target version changed from Tails_3.8 to Tails_3.9

#36 Updated by lamby over 1 year ago

During summit we roughly scheduled this as being targeted for Dec/Jan 2018/2019.

#37 Updated by intrigeri over 1 year ago

  • Target version changed from Tails_3.9 to Tails_3.12

lamby wrote:

During summit we roughly scheduled this as being targeted for Dec/Jan 2018/2019.

Right, updating target version accordingly :)

#38 Updated by intrigeri over 1 year ago

  • Blocked by Bug #15953: Make our test suite survive changes in the surrounding environment added

#39 Updated by intrigeri over 1 year ago

  • Blocked by deleted (Bug #15953: Make our test suite survive changes in the surrounding environment)

#40 Updated by intrigeri over 1 year ago

  • Blocks Bug #15953: Make our test suite survive changes in the surrounding environment added

#41 Updated by lamby about 1 year ago

Note that Debian will likely be shipping buster with OpenJDK 11.

#42 Updated by intrigeri 12 months ago

  • Target version changed from Tails_3.12 to Tails_3.13

#43 Updated by lamby 9 months ago

  • Target version changed from Tails_3.13 to Tails_3.14

#44 Updated by intrigeri 9 months ago

#45 Updated by intrigeri 9 months ago

#46 Updated by intrigeri 8 months ago

  • Blocks Feature #15450: Create LUKS2 persistent volumes by default added

#47 Updated by intrigeri 8 months ago

  • Priority changed from Elevated to High
  • Target version changed from Tails_3.14 to 2019

#49 Updated by intrigeri 3 months ago

  • Blocks Bug #17031: Test suite's otr-bot.py has obsolete dependencies added

#50 Updated by intrigeri 3 months ago

  • Blocks Bug #15831: Use qemu-xhci for TailsToaster added

#51 Updated by intrigeri 3 months ago

  • Description updated (diff)

#52 Updated by intrigeri 3 months ago

  • Description updated (diff)

#53 Updated by intrigeri 3 months ago

  • Type of work changed from Debian to Research

lamby wrote:

My gut tells me we should look into alternatives given all these problems we know about.

Agreed, let's put the "make Sikuli work in Debian" on the back burner for now, and instead look into alternatives. I've collected some ideas in the ticket description.

#54 Updated by anonym 2 months ago

A few days ago I stumbled upon some (Python) code that used OpenCV's matchTemplate(), which suits us perfectly as an "image matching primitive". You get coordinates and a similarity score for the match, so we can still match with different fuzziness. It only finds one match, though, so we cannot do something like Sikuli's findAll(); we are currently not using it (although we have before), so I don't think this is a blocker.
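To illustrate what "coordinates and a similarity score" means in practice, here is a minimal pure-Python sketch. It is not the attached opencv-match.py and it does not use OpenCV: the function name best_match and the score formula (1.0 minus the normalized mean absolute difference) are assumptions made for illustration, standing in for whichever comparison method matchTemplate() would be called with. Images are plain lists of rows of grayscale values.

```python
def best_match(screen, template):
    """Return (x, y, score) for the best placement of template in screen.

    Both images are lists of rows of grayscale ints (0-255). The score is
    1.0 for a pixel-perfect match and drops towards 0.0 as the mean
    absolute difference grows. Like the primitive discussed above, this
    reports a single best match only.
    """
    sh, sw = len(screen), len(screen[0])
    th, tw = len(template), len(template[0])
    if th > sh or tw > sw:
        raise ValueError("template is larger than screen")
    best = (0, 0, -1.0)
    for y in range(sh - th + 1):
        for x in range(sw - tw + 1):
            # Sum of absolute differences for this placement.
            diff = sum(
                abs(screen[y + j][x + i] - template[j][i])
                for j in range(th)
                for i in range(tw)
            )
            score = 1.0 - diff / (255.0 * th * tw)
            if score > best[2]:
                best = (x, y, score)
    return best


screen = [
    [0, 0, 0, 0, 0],
    [0, 10, 200, 0, 0],
    [0, 200, 10, 0, 0],
    [0, 0, 0, 0, 0],
]
template = [[10, 200], [200, 10]]
x, y, score = best_match(screen, template)
print(x, y, score)
```

A caller would then compare the returned score against its required similarity to decide whether the match counts.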

Inspired by that code I quickly wrote the attached script for a quick check on how it performs for our images (if you try it, setting the DEBUG env var will show the match in a popup which is handy): I booted Tails, took a screenshot of the desktop and could successfully match the expected images e.g. TorStatusUsable.png and GnomeApplicationsMenu.png. A nice start! For more evaluation I suggest integrating something like this script into our automated test suite so it replicates each match Sikuli attempts and saves the matches (the original image with a red square for the match) so they can be manually inspected easily after the full run is completed.

So, that would let us know if OpenCV's matching can be a drop-in replacement for Sikuli's. Unfortunately, however, there's only an unmaintained Ruby gem for OpenCV (and it only supports exactly OpenCV version 2.4.12 which isn't even in Debian). But now that we have no reason for a Java-to-Ruby bridge, why not introduce a Python-to-Ruby bridge? :D Of course it exists but it is also unmaintained, so I mention this mostly to immediately rule out this approach.

So if we want to use OpenCV I suppose we need to call it as a subprocess. In fact, the script as-is would serve for that purpose. A potential issue is that the script has two heavy dependencies, cv2 and numpy, which take almost 300ms to import on my pretty powerful CPU (i7-7700HQ @ 2.8 GHz) which I worry is so much that it might interfere. So perhaps we make it a "server" that we leave running and communicate with over e.g. a unix socket. It will work, but it is a bit ugly...
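A rough way to check the import cost on a given machine is to time a fresh interpreter with and without the imports, since that is the overhead each subprocess call would pay. The sketch below is only an assumption about how one might measure it; interpreter_times is a hypothetical helper, and "json" is used in the demo so that it runs without cv2 or numpy installed.

```python
import subprocess
import sys
import time


def interpreter_times(modules, runs=3):
    """Return (baseline, with_imports): best-of-`runs` wall-clock seconds
    for a fresh interpreter doing nothing, and for one importing `modules`."""
    def best_time(code):
        timings = []
        for _ in range(runs):
            start = time.perf_counter()
            subprocess.run([sys.executable, "-c", code], check=True)
            timings.append(time.perf_counter() - start)
        return min(timings)

    baseline = best_time("pass")
    with_imports = best_time("import " + ", ".join(modules))
    return baseline, with_imports


# Replace ["json"] with ["cv2", "numpy"] on a system that has them installed.
baseline, with_imports = interpreter_times(["json"])
print(f"bare interpreter: {baseline:.3f}s, with imports: {with_imports:.3f}s")
```

The difference between the two numbers approximates the per-call price of the subprocess approach; timings are noisy, so the difference can be close to zero for cheap modules.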

#55 Updated by intrigeri 2 months ago

A few days ago I stumbled upon some (Python) code that used OpenCV's matchTemplate(), which suits us perfectly as an "image matching primitive".

Great news, I'm glad you've been looking into this. It's exciting!

So if we want to use OpenCV I suppose we need to call it as a subprocess. In fact, the script as-is would serve for that purpose. A potential issue is that the script has two heavy dependencies, cv2 and numpy, which take almost 300ms to import on my pretty powerful CPU (i7-7700HQ @ 2.8 GHz) which I worry is so much that it might interfere. So perhaps we make it a "server" that we leave running and communicate with over e.g. a unix socket. It will work, but it is a bit ugly...

Interesting. I think that at this point, I'm less worried about the potential ugliness than about reliability: it's one more thing that manages its own state and can get stuck or desynchronized from the callers. The remote shell historical fragility (now thankfully a thing of the past!) comes to mind here.

Wrt. strategy and allocation of our resources, with my "person who manages the FT's budget" hat on: I'm a bit concerned about the costs that come with a NIH solution. I would like us to first look into how OpenQA does image matching, as suggested in the ticket description; and later, fall back on this brand new approach, solve the problems that come with it, and maintain it, if and only if the code that a number of major distros use does not work for us. For example, I suspect that OpenQA's implementation is not affected by this load time issue. Now, of course, it may be that we quickly realize that the OpenQA implementation is very much non-reusable and we give up on it after spending 1-2h on it; but at this point we don't know this.

#56 Updated by anonym 2 months ago

intrigeri wrote:

A few days ago I stumbled upon some (Python) code that used OpenCV's matchTemplate(), which suits us perfectly as an "image matching primitive".

Great news, I'm glad you've been looking into this. It's exciting!

So if we want to use OpenCV I suppose we need to call it as a subprocess. In fact, the script as-is would serve for that purpose. A potential issue is that the script has two heavy dependencies, cv2 and numpy, which take almost 300ms to import on my pretty powerful CPU (i7-7700HQ @ 2.8 GHz) which I worry is so much that it might interfere. So perhaps we make it a "server" that we leave running and communicate with over e.g. a unix socket. It will work, but it is a bit ugly...

Interesting. I think that at this point, I'm less worried about the potential ugliness than about reliability:

Ack. Let's consider this a braindump and forget it for now!

Wrt. strategy and allocation of our resources, with my "person who manages the FT's budget" hat on: I'm a bit concerned about the costs that come with a NIH solution.

Yes, given the history with Sikuli (it was the reason for us initially using JRuby...) this is also my first priority. The thing is, I have (in a very relaxed manner) looked for Sikuli alternatives for years, and OpenCV has been basically the only answer. Luckily it is a stable project in terms of maintenance, AFAICT! :)

Also, I suspect that the amount of work needed for integrating any matching primitive into our test suite is dwarfed by the work needed for the machinery around it that is shared by all such solutions, e.g. keyboard/mouse stuff, and probably rewriting minimal versions of a few of Sikuli's classes, like Screen and Match, since we are pretty invested in them and they are pretty great abstractions anyway. In other words, the NIH-part is small in comparison to the whole work. And it will be compartmentalized, so replacing it with another matching primitive is little work.
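As an illustration of how thin the layer on top of a matching primitive could be, here is a hedged Python sketch of find() and wait(). The function signatures, the match_primitive callable (assumed to return x, y and a similarity score for an image name) and the polling interval are all hypothetical; only the FindFailed name is borrowed from Sikuli. The real test suite is Ruby, so this shows the shape of the logic, not the actual implementation.

```python
import time


class FindFailed(Exception):
    """Raised when the image is not on screen (name borrowed from Sikuli)."""


def find(match_primitive, image, min_similarity=0.9):
    """Return (x, y) if the best match is similar enough, else raise."""
    x, y, score = match_primitive(image)
    if score < min_similarity:
        raise FindFailed(f"{image}: best score {score:.2f} < {min_similarity}")
    return x, y


def wait(match_primitive, image, timeout, interval=0.5, min_similarity=0.9):
    """Retry find() until it succeeds or `timeout` seconds have passed."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            return find(match_primitive, image, min_similarity)
        except FindFailed:
            if time.monotonic() >= deadline:
                raise
            time.sleep(interval)


# Fake primitive for the demo: the image "appears" on the third attempt.
calls = {"n": 0}


def fake_primitive(image):
    calls["n"] += 1
    return (40, 20, 0.95 if calls["n"] >= 3 else 0.5)


print(wait(fake_primitive, "TorStatusUsable.png", timeout=2, interval=0.01))
```

Because the primitive is passed in as a plain callable, swapping OpenCV for another matcher would not touch find() or wait().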

I would like us to first look into how OpenQA does image matching, as suggested in the ticket description; and later, fall back on this brand new approach, solve the problems that come with it, and maintain it, if and only if the code that a number of major distros use does not work for us. For example, I suspect that OpenQA's implementation is not affected by this load time issue. Now, of course, it may be that we quickly realize that the OpenQA implementation is very much non-reusable and we give up on it after spending 1-2h on it; but at this point we don't know this.

OpenQA also employs a straightforward application of OpenCV's matchTemplate(). :) At least we can look for inspiration in their choice of parameters.

I spent another 20 minutes searching for image matching alternatives in Ruby-land, and while I can find some things for computer vision in general, none of them provide what we want (fuzzy image matching). I suspect that if more time is needed on this search, anything that comes up will be pretty obscure and hence likely worse in the end.

BTW, everything we need for the OpenCV Python solution is available in Debian (python3-opencv python3-numpy python3-pil) so all in all it's starting to look pretty OK to me.

#57 Updated by intrigeri 2 months ago

Wrt. strategy and allocation of our resources, with my "person who manages the FT's budget" hat on: I'm a bit concerned about the costs that come with a NIH solution.

Yes, given the history with Sikuli (it was the reason for us initially using JRuby...) this is also my first priority.

Glad we're on the same page :)

In other words, the NIH-part is small in comparison to the whole work.

Okay. Without having thought about it much, I'm inclined to blindly trust your assessment here.

OpenQA also employs a straightforward application of OpenCV's matchTemplate(). :) At least we can look for inspiration in their choice of parameters.

OK!

I spent another 20 minutes searching for image matching alternatives in Ruby-land, and while I can find some things for computer vision in general, none of them provide what we want (fuzzy image matching). I suspect that if more time is needed on this search, anything that comes up will be pretty obscure and hence likely worse in the end.

It's not 100% obvious to me that fuzzy image matching should be an absolute requirement here.

On the one hand, I do see some benefits in it. For example, since you added this feature, we've used it for at least 2 batch-updates: 1f02dda0211b44f0cf0bd59a689ff9af57164bbe, 33fa7c4ba6fe5aac2db07e477fc4307bd7de4d70. That's not nothing. OTOH, it's also not huge. IIRC we expected to get much more benefit out of this feature: you implemented it initially when we were playing with the idea of basing Tails on Debian testing, which would have caused much more churn in images, and in turn would have made fuzzy matching a huge time saver.

So, if there's a way to use Ruby, avoiding the complexity and potential fragility that come from bridging Ruby & Python somehow, at the cost of losing fuzzy matching, then IMO we should seriously consider this option.

Note, in passing, that a bug in our fuzzy matching feature (#17029) has made me, and possibly others, waste substantial amounts of time for 2 years, which greatly outweighed the time we saved thanks to batch-updates so far: I started suspecting something was fishy ~1 year ago (3ac90f2ed9a544ce235bf7e520aafc3a50208c81, 670bd5f6fcb4b30159fe2cf81a4ffe89b67dca2a) but only tracked down & fixed the problem recently. But that's mostly off-topic because we'll learn from it and ensure that our next implementation of fuzzy matching won't have such problems :)

BTW, everything we need for the OpenCV Python solution is available in Debian (python3-opencv python3-numpy python3-pil) so all in all it's starting to look pretty OK to me.

It's great to have this option if we have to, or decide to, fall back to Python!

#58 Updated by anonym 2 months ago

intrigeri wrote:

I spent another 20 minutes searching for image matching alternatives in Ruby-land, and while I can find some things for computer vision in general, none of them provide what we want (fuzzy image matching). I suspect that if more time is needed on this search, anything that comes up will be pretty obscure and hence likely worse in the end.

It's not 100% obvious to me that fuzzy image matching should be an absolute requirement here.

Sikuli's image matching is always fuzzy, and we use a default required similarity of 0.9 (sikuli_settings.MinSimilarity = 0.9); that is what I'm talking about. The test suite's --fuzzy-image-matching means that we will try with a couple of lower similarity values (i.e. "more fuzz") to get a candidate.
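The distinction between the default similarity and the --fuzzy-image-matching fallback can be sketched as follows. The 0.9 default comes from the comment above; the fallback values (0.8, 0.7), the function name find_with_fuzz and the best_match callable are hypothetical, chosen only to show the control flow. With a score-based primitive the image only needs to be matched once, and the score is then compared against each threshold in turn.

```python
DEFAULT_MIN_SIMILARITY = 0.9   # value quoted in the thread
FUZZY_FALLBACKS = (0.8, 0.7)   # hypothetical "more fuzz" values


def find_with_fuzz(best_match, image, fuzzy=False):
    """Return (x, y, score, threshold_used), or None if nothing qualifies.

    best_match(image) is assumed to return (x, y, score) for the single
    best candidate. Thresholds are tried from strictest to loosest, and
    the lower ones are only considered when `fuzzy` is enabled.
    """
    x, y, score = best_match(image)
    thresholds = (DEFAULT_MIN_SIMILARITY,)
    if fuzzy:
        thresholds += FUZZY_FALLBACKS
    for threshold in thresholds:
        if score >= threshold:
            return x, y, score, threshold
    return None


def slightly_off(image):
    """Fake primitive: the best candidate is only 85% similar."""
    return (5, 6, 0.85)


print(find_with_fuzz(slightly_off, "GnomeApplicationsMenu.png"))
print(find_with_fuzz(slightly_off, "GnomeApplicationsMenu.png", fuzzy=True))
```

Reporting which threshold was used makes it easy to flag images that only matched thanks to the extra fuzz and are therefore candidates for a batch update.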

So at the moment we do not know if a pixel-by-pixel perfect matcher is good enough, because we have never done that, but I guess we could bump the default required similarity to 1.0 and do a full run to get an idea. My intuition is that it will be bad, and I fear it could introduce new fun problems like image compression changing pixels ⇒ false negatives.

That said, I haven't seen any image matcher in Ruby-land except OpenCV. But I guess it would be pretty easy to implement (convert to bitmaps, slide the candidate picture around until all pixels match (success), or all possibilities are exhausted (failure)) and maintain ourselves, so that is not out of the question.
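As a rough illustration of the "slide the candidate picture around" idea, here is a minimal sketch. It is written in Python only for brevity (a Ruby version would have the same shape), and it assumes both images have already been decoded into non-empty, rectangular lists of rows of pixel values; `find_exact` and `Bitmap` are hypothetical names.

```python
from typing import List, Optional, Tuple

# A decoded image: a list of rows, each row a list of pixel values.
Bitmap = List[List[int]]


def find_exact(screen: Bitmap, candidate: Bitmap) -> Optional[Tuple[int, int]]:
    """Return (x, y) of the first pixel-perfect match, or None.

    Assumes both bitmaps are non-empty and rectangular.
    """
    screen_h, screen_w = len(screen), len(screen[0])
    cand_h, cand_w = len(candidate), len(candidate[0])
    # Slide the candidate over every position where it fits entirely.
    for y in range(screen_h - cand_h + 1):
        for x in range(screen_w - cand_w + 1):
            # Compare row by row; all() stops at the first mismatching row.
            if all(
                screen[y + dy][x:x + cand_w] == candidate[dy]
                for dy in range(cand_h)
            ):
                return (x, y)
    return None
```

This brute-force approach does work proportional to the screen area times the candidate area in the worst case, so whether it is fast enough for full-size screenshots would need to be measured.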

On the one hand, I do see some benefits in it. For example, since you added this feature, we've used it for at least 2 batch-updates: 1f02dda0211b44f0cf0bd59a689ff9af57164bbe, 33fa7c4ba6fe5aac2db07e477fc4307bd7de4d70. That's not nothing. OTOH, it's also not huge. IIRC we expected to get much more benefit out of this feature: you implemented it initially when we were playing with the idea of basing Tails on Debian testing, which would have caused much more churn in images, and in turn would have made fuzzy matching a huge time saver.

To be clear, I would happily sacrifice --fuzzy-image-matching for a native Ruby solution!

So, if there's a way to use Ruby, avoiding the complexity and potential fragility that come from bridging Ruby & Python somehow, at the cost of losing fuzzy matching, then IMO we should seriously consider this option.

Let's seriously consider implementing our own simple pixel-by-pixel equality matcher then!

Note, in passing, that a bug in our fuzzy matching feature (#17029) has made me, and possibly others, waste substantial amounts of time for 2 years, which greatly outweighed the time we saved thanks to batch-updates so far: I started suspecting something was fishy ~1 year ago (3ac90f2ed9a544ce235bf7e520aafc3a50208c81, 670bd5f6fcb4b30159fe2cf81a4ffe89b67dca2a) but only tracked down & fixed the problem recently.

Eek. Your theory that the default similarity is forgotten in favor of the last explicitly used one could be true. The hack around the comment "Due to bugs in rjb we cannot re-throw Java exceptions" also looks suspicious. Anyway...

But that's mostly off-topic because we'll learn from it and ensure that our next implementation of fuzzy matching won't have such problems :)

Indeed. It's worth noting that it is because rjb does an imperfect job at integrating Java into Ruby that implementing the hooks needed for --retry-find and --fuzzy-image-matching has been this error-prone. So many hacks... it will be great to get rid of all that mess!

#59 Updated by intrigeri 2 months ago

intrigeri:

… and I've just noticed that the very part of our test suite I'm working on today does not behave as intended, precisely because Sikuli matches an image which is not the one we're waiting for. So on my branch for #17056 I'm (locally) trying to bump SIKULI_MIN_SIMILARITY to 0.95 to avoid this problem, which might actually fix the painful bug I'm after.

Well, I'm getting sick of Sikuli (boom bada boom tchak, then the rest of the "Sick of Sikuli" song ;)

  • With SIKULI_MIN_SIMILARITY = 0.99, Sikuli still matches an image that is not (exactly) on screen AFAICT
  • With SIKULI_MIN_SIMILARITY = 1.0, Sikuli does not match an image that is exactly on screen AFAICT (updated it from a failure screenshot I got immediately before)

This suggests that either pixel-perfect image matching won't work for us, or that Sikuli's matching algorithm significantly differs from what my eyes and brain match.
At this point, I'm not sure we can actually check if pixel-perfect image matching can work for us merely by setting SIKULI_MIN_SIMILARITY = 1.0, because I'm losing the faith I had in the fact it would produce the result I understand it should produce :/

#60 Updated by intrigeri 2 months ago

Hi @anonym,

(I sent this comment 5 days ago but just received the rejection message from Redmine, which disliked the facepalm emoji I had included.)

anonym wrote:

intrigeri wrote:

It's not 100% obvious to me that fuzzy image matching should be an absolute requirement here.

Sikuli's image matching is always fuzzy, and we use a default required similarity of 0.9 (sikuli_settings.MinSimilarity = 0.9), that is what I'm talking about.

Oops, I had forgotten this.
Looks like my misunderstanding started an interesting discussion and will make us consider another option, so hopefully it was not only wasted time :)

So at the moment we do not know if a pixel-by-pixel perfect matcher is good enough because we have never done that

Absolutely.

but I guess we could bump the default required similarity to 1.0 and do a full run to get an idea.

I'd be very curious to see the results of such an experiment. Some of our images may be identical across test suite runs but still currently match only thanks to the current 0.9 number; so we might need to update them to be pixel-perfect; and then the 2nd run and following will give us the answer we want here.

and I fear it could introduce new fun problems like image compression changing pixels ⇒ false negatives.

Just curious: where is image compression involved?

To be clear, I would happily sacrifice --fuzzy-image-matching for a native Ruby solution!

OK, case closed then.

So, if there's a way to use Ruby, avoiding the complexity and potential fragility that come from bridging Ruby & Python somehow, at the cost of losing fuzzy matching, then IMO we should seriously consider this option.

Let's seriously consider implementing our own simple pixel-by-pixel equality matcher then!

If I understand correctly that the choice is indeed "a shiny new pile of hacks to bridge Python & Ruby, so that we can use OpenCV" vs. "our own simple pixel-by-pixel equality matcher in Ruby", then yes, I'm leaning towards the latter. Of course, we first need to validate the hypothesis that pixel-perfect matches can work for us: as you said, at this point we don't know.

#61 Updated by anonym 2 months ago

  • Assignee changed from lamby to anonym

intrigeri wrote:

This suggests that either pixel-perfect image matching won't work for us, or that Sikuli's matching algorithm significantly differs from what my eyes and brain match.
At this point, I'm not sure we can actually check if pixel-perfect image matching can work for us merely by setting SIKULI_MIN_SIMILARITY = 1.0, because I'm losing the faith I had in the fact it would produce the result I understand it should produce :/

Yeah, this was a bad idea from me, sorry! Related: OpenCV's matchTemplate() fails with 1.0, so it can only be fuzzy.

Let's seriously consider implementing our own simple pixel-by-pixel equality matcher then!

If I understand correctly that the choice is indeed "a shiny new pile of hacks to bridge Python & Ruby, so that we can use OpenCV" vs. "our own simple pixel-by-pixel equality matcher in Ruby", then yes, I'm leaning towards the latter. Of course, we first need to validate the hypothesis that pixel-perfect matches can work for us: as you said, at this point we don't know.

I propose this: we put in the work on the machinery needed around the image matching primitive, i.e. remove Sikuli and re-implement some basic versions of classes like Screen and Match, including methods like wait(), findAny() etc. (our primitive only gives us the equivalent of exists()), something like @findfailed_hook(), etc. Since we want to move away from Sikuli, it seems like this is work we must do anyway. Once we have all that, we have done most of the work (according to my estimation, at least), so it makes sense to do the extra work to evaluate both OpenCV's matchTemplate() and a homegrown pixel-perfect matcher.
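To give an idea of how thin that machinery can be, here is a minimal sketch of a wait() built on top of an exists()-like primitive. It is written in Python only for illustration (the test suite itself is Ruby), and every name in it (`wait`, `FindFailed`, the `exists` callable, the default timeout and polling interval) is an assumption rather than an existing API.

```python
import time
from typing import Callable, Optional, Tuple

# (x, y) of the top-left corner of a match; a hypothetical representation.
Match = Tuple[int, int]


class FindFailed(Exception):
    """Raised when the image did not appear before the timeout."""


def wait(
    exists: Callable[[], Optional[Match]],
    timeout: float = 10.0,   # hypothetical default, in seconds
    interval: float = 0.5,   # hypothetical polling interval, in seconds
) -> Match:
    """Poll `exists()` until it returns a match or `timeout` elapses."""
    deadline = time.monotonic() + timeout
    while True:
        match = exists()
        if match is not None:
            return match
        if time.monotonic() >= deadline:
            # A hook comparable to the FindFailed hook could be called here.
            raise FindFailed(f"no match within {timeout} seconds")
        time.sleep(interval)
```

The primitive is always tried at least once, even with a zero timeout, and the monotonic clock keeps the deadline stable if the system time changes during a test run.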

Thoughts?

(Also, assigning to me now that we seem to have given up on Sikuli and lamby won't work on this.)

#62 Updated by intrigeri 2 months ago

I propose this: we put in the work on the machinery needed around the image matching primitive, i.e. remove Sikuli and re-implement some basic versions of classes like Screen and Match, including methods like wait(), findAny() etc. (our primitive only gives us the equivalent of exists()), something like @findfailed_hook(), etc. Since we want to move away from Sikuli, it seems like this is work we must do anyway. Once we have all that, we have done most of the work (according to my estimation, at least), so it makes sense to do the extra work to evaluate both OpenCV's matchTemplate() and a homegrown pixel-perfect matcher.

Thoughts?

Sounds good to me.

(Also, assigning to me now that we seem to have given up on Sikuli and lamby won't work on this.)

This sounds fair enough to me, given lamby clearly expressed he'd rather not work directly on our test suite framework, and the current proposal will happen there.
@lamby, I'm sure we can find more exciting Tails work for you than this :)

#63 Updated by intrigeri 10 days ago

  • Blocks Bug #17308: Uninstall libsikulixapi-java on isotesters added
