Optimize retrieving proxies from store #24

Open
JaredLGillespie opened this issue Nov 9, 2019 · 0 comments

When retrieving a proxy, the following occurs:

  1. If a refresh is needed, scrape new proxies from the sources
  2. Filter out blacklisted proxies
  3. Pick a random proxy and return it

If no refresh occurs and the blacklist hasn't changed, steps 1 and 2 should be skipped. Also, we can use inverted indexes for filtering on anonymity, country code, etc. Intersecting (or unioning) the sets those indexes provide should then yield the pool of applicable proxies. This should drastically improve performance when a large number of proxies are retrieved.
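
A rough sketch of the idea, not the project's actual code: `Proxy`, `IndexedStore`, and every method and field name below are hypothetical. Each filterable field gets an inverted index mapping a value to the set of proxies holding it. Values within a field are unioned, fields are intersected, and the resulting pool is cached until the store or the blacklist changes.

```python
import random
from collections import defaultdict, namedtuple

# Hypothetical proxy record; the field names are assumptions for illustration.
Proxy = namedtuple("Proxy", ["host", "port", "code", "anonymous"])


class IndexedStore:
    """Hypothetical store that keeps one inverted index per filterable field."""

    FIELDS = ("code", "anonymous")

    def __init__(self):
        self._proxies = set()
        self._blacklist = set()
        self._index = defaultdict(set)  # (field, value) -> set of proxies
        self._cache = {}                # frozen filter -> tuple of matching proxies

    def add(self, proxy):
        # Blacklisted proxies never enter the store, so no per-get filtering is needed.
        if proxy in self._blacklist or proxy in self._proxies:
            return
        self._proxies.add(proxy)
        for field in self.FIELDS:
            self._index[(field, getattr(proxy, field))].add(proxy)
        self._cache.clear()  # the pool changed, so cached results are stale

    def blacklist(self, proxy):
        self._blacklist.add(proxy)
        if proxy in self._proxies:
            self._proxies.discard(proxy)
            for field in self.FIELDS:
                self._index[(field, getattr(proxy, field))].discard(proxy)
            self._cache.clear()

    def _pool(self, filter_opts):
        # Each filter value must be an iterable of allowed values, e.g. code=["us", "uk"].
        key = frozenset((f, frozenset(vals)) for f, vals in filter_opts.items())
        if key not in self._cache:
            pool = set(self._proxies)
            for field, values in filter_opts.items():
                # Union the sets for the allowed values of one field ...
                allowed = set().union(
                    *(self._index.get((field, v), set()) for v in values)
                )
                # ... then intersect across fields.
                pool &= allowed
            self._cache[key] = tuple(pool)
        return self._cache[key]

    def get(self, **filter_opts):
        pool = self._pool(filter_opts)
        return random.choice(pool) if pool else None
```

With this layout, repeated calls such as `store.get(code=["us"], anonymous=[True])` reuse the cached pool and only pay for `random.choice`, as long as no proxy was added or blacklisted in between.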

Also, please add methods for retrieving the number of proxies currently stored. These should optionally take a filter and return only the number of proxies that match it.
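
One possible shape for such a method, as a sketch only: `ProxyStore`, `count`, and the `filter_opts` parameter are hypothetical names, and the filter format (field name mapped to a set of allowed values) is an assumption rather than the project's existing API.

```python
from collections import namedtuple

# Hypothetical proxy record; the field names are assumptions for illustration.
Proxy = namedtuple("Proxy", ["host", "port", "code", "anonymous"])


class ProxyStore:
    """Hypothetical store exposing a count with an optional filter."""

    def __init__(self, proxies=()):
        self._proxies = set(proxies)

    def count(self, filter_opts=None):
        # No filter: report everything currently stored.
        if not filter_opts:
            return len(self._proxies)
        # With a filter: a proxy matches when every field holds an allowed value.
        return sum(
            1
            for proxy in self._proxies
            if all(
                getattr(proxy, field) in allowed
                for field, allowed in filter_opts.items()
            )
        )
```

For example, `store.count()` would report the whole pool, while `store.count({"code": {"us"}})` would report only proxies with that country code. If inverted indexes are in place, the filtered count could instead be the size of the intersected set, avoiding the linear scan shown here.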
