Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use one flag to cache all third party data in one directory #131

Merged
merged 7 commits into from
Nov 7, 2017

Conversation

konklone
Copy link
Collaborator

@konklone konklone commented Nov 5, 2017

This removes the individual flags for the preload list and PSL, and makes one new flag that covers the preload list, PSL, and the preload pending list.

In addition to making @lgarron and the hstspreload.org servers a lot happier, this will make it easier for people/tools that make use of pshtt to rationalize their caching strategy, and easier for the pshtt maintainers to add new third party data sources without adding new flags and approaches to caching.

Fixes #99.

@konklone
Copy link
Collaborator Author

konklone commented Nov 5, 2017

Also, I should note -- this removes the older flags (--suffix-cache and --preload-cache), and so people making use of those two flags will find that caching no longer functions. However, I think I am the only one making use of it! And it seemed worth clearing up flag cruft now, while the tool is still young. So I will plan to make a coordinated update on my end once a new version of pshtt is published with this addition, assuming it's merged.

-u --user-agent=AGENT Override user agent.
-t --timeout=TIMEOUT Override timeout (in seconds).
-c --cache-third-parties=DIR Cache third party data, and what directory to cache it in.
-f --ca-file=PATH Specify custom CA bundle (PEM format)
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some of the changes here are from syncing the options in cli.py with the options documented in the README, which required making changes in both directions.

@@ -3,22 +3,20 @@
"""pshtt ("pushed") is a tool to test domains for HTTPS best practices.

Usage:
pshtt (INPUT ...) [--output OUTFILE] [--sorted] [--json] [--markdown] [--debug] [--timeout TIMEOUT] [--user-agent AGENT] [--preload-cache PRELOAD] [--cache] [--suffix-cache SUFFIX] [--ca-file PATH]
pshtt (INPUT ...) [--output OUTFILE] [--sorted] [--json] [--markdown] [--debug] [--timeout TIMEOUT] [--user-agent AGENT] [--cache-third-parties DIR] [--ca-file PATH]
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also synced the options between the Usage: line here and the list below, as they were slightly out of sync.

@@ -107,9 +105,7 @@ def main():
options = {
'user_agent': args['--user-agent'],
'timeout': args['--timeout'],
'preload_cache': args['--preload-cache'],
'suffix_cache': args['--suffix-cache'],
'cache': args['--cache'],
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The --cache flag has been unused for a long time -- it's a remnant of a time when this library used requests-cache, but that was removed a long time ago.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For info on that, see requests-cache/requests-cache#70.

preload_pending = None

# Used for determining base domain via Mozilla's public suffix list.
SUFFIX_CACHE = None
PUBLIC_SUFFIX_CACHE = None
PUBLIC_SUFFIX_CACHE_DEFAULT = "public-suffix-list.txt"
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't provide a way to override the individual filenames within the third party cache directory. Though options are always nice, it doesn't seem like it has enough utility to merit the complexity.

@@ -1015,28 +1022,38 @@ def did_domain_error(domain):
)


def fetch_preload_pending():
def load_preload_pending():
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Standardized on load_ as the prefix for these 3 loading/caching functions, since they were split among load_, create_, and fetch_ before, which seemed slightly confusing.

Copy link
Member

@jsf9k jsf9k left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good to me. I ran a few tests and everything seems to be working. This is a simpler and better way to handle caching.

@konklone
Copy link
Collaborator Author

konklone commented Nov 7, 2017

Thanks for the review, @jsf9k!

Next up is a PR letting third party cache data be passed in via variable, to support environments like Lambda where you can reuse in-memory data across containers.

@konklone konklone merged commit b74e4cb into master Nov 7, 2017
@konklone konklone deleted the rationalize-caching branch November 7, 2017 18:45
@lgarron
Copy link

lgarron commented Nov 7, 2017

Next up is a PR letting third party cache data be passed in via variable, to support environments like Lambda where you can reuse in-memory data across containers.

$XDG_CACHE_HOME, plz? :-D

cisagovbot pushed a commit that referenced this pull request Jul 30, 2024
Fix the dependabot ignore directive for `github/codeql-action`
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants