Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Replace the Hyper library with HTTPX #162

Merged
merged 3 commits into from
Feb 5, 2020
Merged

Conversation

PrimozGodec
Copy link
Collaborator

@PrimozGodec PrimozGodec commented Nov 7, 2019

Issue

Fixes #146

Description of changes

With this pull request, we replace the buggy Hyper library with HTTPX.
Server embedder is now implemented as an async client which means that it is not necessary to send images in batches anymore.

This implementation improves the speed of embedding. For example embedding times of 2949 YPLP:

  • old Hyper client: ~380 s
  • new HTTPX client: ~130 s

The current implementation uses HTTP 1.1 since HTTPX still has bugs (encode/httpx#551, encode/httpx#514) in HTTP 2 that prevent us to use it. Once they fix bugs, we can move to HTTP 2 easily by changing one line of code.

@matjazp what do you think, should we merge this even it does not implement HTTP 2? The speed improvement is quite great. If we decide to merge it, we should fix HTTPX to ~=0.9.3 since the library is still in the alpha phase and some unexpected change can case troubles.

Before testing reinstall ImageAnalytics since new package must be installed

Includes
  • Code changes
  • Tests
  • Documentation

@PrimozGodec PrimozGodec force-pushed the httpx branch 3 times, most recently from a13041f to 2940bb0 Compare November 11, 2019 13:08
@PrimozGodec PrimozGodec force-pushed the httpx branch 2 times, most recently from 90eaf2d to 6783727 Compare December 9, 2019 10:51
@codecov
Copy link

codecov bot commented Dec 9, 2019

Codecov Report

Merging #162 into master will increase coverage by 2%.
The diff coverage is 98.92%.

@@          Coverage Diff          @@
##           master   #162   +/-   ##
=====================================
+ Coverage   65.99%    68%   +2%     
=====================================
  Files          13     13           
  Lines        3238   3147   -91     
  Branches      482    451   -31     
=====================================
+ Hits         2137   2140    +3     
+ Misses        967    890   -77     
+ Partials      134    117   -17

@ajdapretnar
Copy link
Collaborator

I am getting a JSONDecodeError on openface.
Errors are not cleared when a new embedder is selected and started.

@PrimozGodec PrimozGodec force-pushed the httpx branch 2 times, most recently from 3531ed6 to f28812d Compare December 23, 2019 08:11
@PrimozGodec
Copy link
Collaborator Author

@ajdapretnar error is fixed now.

@PrimozGodec
Copy link
Collaborator Author

Tests failing due to error in Orange biolab/orange3#4320

@PrimozGodec
Copy link
Collaborator Author

@ajdapretnar made changes connected with the progress bar here. Can you test again when you will have time.

@ajdapretnar
Copy link
Collaborator

Things look ok now.
Just one Q. When I disconnect the internet, with VGG for example, the embedder stalls (stays at the same progress). Should it not check the connection occasionally?
Also, it looks like some embedders are cached locally. Is this so? When I am offline, I can still use the embedders I used before, but for new ones I get the 'Internet disconnected' error.

@PrimozGodec
Copy link
Collaborator Author

The behavior regarding the internet connection should be like this:

  • When no internet connection at the start: Widget tries to send few images (the most 10) and it fails directly so the embedder is switched to local SqueezeNet. In my case, it happens immediately.
  • When the internet connection is lost after the start of the embedding: Widget switches to local SqueezeNet after the most 10 images causing errors. It can take up to 30 seconds.

In most cases, the widget would be without connection in the beginning already and switch is fast. The connection can be lost during embedding (but this case is very rare). In this case, the waiting time of 30 seconds is not so long in my opinion since usually, it would happen between embedding of a large set of images which already takes a few minutes.

Before I was checking OSError which happens when sending images and there is no connection, now I also add checking ReadTimeout which usually happens when lost connection on slower embedders like VGG (it was not handled correctly before). So now widget should detect connection lost correctly. Widget always needs to switch to local embedder when no connection. If it does not do so it is a bug. In my case, everything works well now. @ajdapretnar can you please test again and confirm that it also works for you. It would be also good if it can be tested on Windows since there can be different behavior regarding OSError.

With this version, we do not ping server to check the connection anymore, since it is not nice and implementation can be more flexible. In case that embeddings are in the local cache embeddings can be calculated without connection.

@ajdapretnar
Copy link
Collaborator

Everything works well now, but can't merge due to failing tests.

@ajdapretnar
Copy link
Collaborator

p.s. I'll try on Windows, too.

@PrimozGodec
Copy link
Collaborator Author

Now I fixed:

  • problem with switching to the local embedder when it is not necessary. New error recognition strategy is implemented, timeouts are increased, and VGG, Painters, DeepLoc sends fewer images at the same time - it does not decrease the speed. I tested all embedders on huge datasets of images and it looks like working good now.
  • I fixed an error with text not updating when switched to local embedder.
  • Tests are still failing since fix is not merged in Orange master yet Concurrent: Prevent running new task from on_done, on_exception orange3#4320

@PrimozGodec
Copy link
Collaborator Author

I fixed tests and changed the strategy for embedding with different embedding.
@ajdapretnar Can you test this again when you will have some time?

@ajdapretnar ajdapretnar merged commit 555bd3a into biolab:master Feb 5, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Hyper alternative
2 participants