Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

go getter #6999

Merged
merged 1 commit into from
Mar 13, 2019
Merged

go getter #6999

merged 1 commit into from
Mar 13, 2019

Conversation

azr
Copy link
Contributor

@azr azr commented Nov 14, 2018

The intent of this PR is to start using hashicorp/go-getter to download files.

The go getter understands a bit more protocols and will do a lot of work we were doing manually.

I'm opening this PR early to check if this is okay with everyone.

I removed the packer.Cache struct and usages as it's functions were only used in StepDownload and replaced it by a packer.CachePath func, that is only used in StepDownload.

close #6991
close #7202
close #5621
close #6443
close #7288
close #7309

@@ -51,7 +51,7 @@ func (b *Builder) Prepare(raws ...interface{}) ([]string, error) {
return warnings, errs
}

func (b *Builder) Run(ui packer.Ui, hook packer.Hook, cache packer.Cache) (packer.Artifact, error) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

@@ -71,7 +71,7 @@ func (b *Builder) Prepare(raws ...interface{}) ([]string, error) {
return nil, nil
}

func (b *Builder) Run(ui packer.Ui, hook packer.Hook, cache packer.Cache) (packer.Artifact, error) {
func (b *Builder) Run(ui packer.Ui, hook packer.Hook) (packer.Artifact, error) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

@@ -82,6 +82,8 @@ builder.
- `iso_url` (string) - A URL to the ISO containing the installation image.
This URL can be either an HTTP URL or a file URL (or path to a file). If
this is an HTTP URL, Packer will download it and cache it between runs.
Packer uses hashicorp/go-getter in file mode in order to perform a download.
See usage of [hashicorp/go-getter](https://github.com/hashicorp/go-getter).
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it okay to link go-getter docs here, or should I instead describe what can packer do ?

@azr
Copy link
Contributor Author

azr commented Jan 22, 2019

Howdy @SwampDragons & @rickard-von-essen, I think this is ready for review. If you have time ! 🙏

To recap what I did :

  • removed the packer.Cache struct and usages as it's functions were only used in StepDownload and replaced it by a packer.CachePath func, that is only used in StepDownload. ( this simplification will also simplify our job to go to hcl2, I believe )
  • mirrored the downloading features of packer in the hashicorp/go-getter
  • files are downloaded under hash(file_url+file_hash) in the cache folder; this should lower the chances of collision
  • downloads are protected by a lockfile named hash(file_url+file_hash).lock in the cache folder; this should prevent things like Packer processes download the same iso in parallel #6991
  • updated the docs
  • updated go.mod and vendors
    • go mod vendor also removed huge number of fixture/fuzz files in vendor/github.com/pierrec/lz4/fuzz/corpus/

@azr
Copy link
Contributor Author

azr commented Jan 22, 2019

Ah, I just realised that by downloading to hash(file_url+sha), if the sha was wrong , changing the sha invalidates the cache. I still think this is ok, since this will happen on edge cases, but we will probably get a few issues opened for this.

Edit: I think this is very right after reading this: #6443 (comment)

@SwampDragons SwampDragons added this to the 1.3.5 milestone Jan 23, 2019
@azr
Copy link
Contributor Author

azr commented Feb 25, 2019

Hey @ladar 🙂

Yes because the same url is used twice: if the first download fails, packer will delete the file from the cache, then the second dl will be a 'fresh' download. I think this gives more guaranty that the download/build will work in your daily case for example.

Can you use a hash of the contents provided by iso_checksum_url as your filename? Instead of hashing just the URL?

Not yet, we'd have to change the go-getter for this but this would constitute a breaking change, meaning I have to do an internal RFC because many projects are using the go-getter, etc.

Details
package getter
// [...]
func (c *Client) GetFile() error {}
// ^ this would need to change to something like

// GetOutcome tells where the file was downloaded in 
// case the checksum was used to name the destination file.
func (c *Client) GetFile() (GetOutcome, error) {}

I hope this happens soon ! 🙂

I don't see why an auto retry wouldn't be safe, presuming packer was trying to resume a download, and the checksum fails? You could add a command line option to enable/disable this if you're worried.

I agree this would be nice to have but I just think this PR is already way too big and I think this would constitute another PR. Scoping it will make sure the code review will be easier and that the behaviour of such a feature is well defined. 🙂

@azr
Copy link
Contributor Author

azr commented Feb 25, 2019

Please wait for review I'd like to add more unit tests !

@azr
Copy link
Contributor Author

azr commented Feb 25, 2019

There, this should be good ! 🙂

@SwampDragons
Copy link
Contributor

Looks like there are some conflicts to be resolved, so I'll wait until tomorrow to review.

@azr
Copy link
Contributor Author

azr commented Feb 26, 2019

There, this looks better 🙂

@azr
Copy link
Contributor Author

azr commented Feb 26, 2019

Ah, wait before reviewing. I found another locking issue.
Edit: okay a quick refactor solved it.

@azr azr changed the title go getter [WIP] go getter Feb 26, 2019
@azr azr changed the title [WIP] go getter go getter Feb 26, 2019
@SwampDragons SwampDragons modified the milestones: 1.3.5, v1.4.0 Feb 28, 2019
@azr
Copy link
Contributor Author

azr commented Mar 6, 2019

Okay I used the latest go-getter tag in the vendors which imported a lot of vendors - that I also checked in - they come from the fact that the go-getter now understands gcs links.

@SwampDragons
Copy link
Contributor

We're gonna want to squash before merging this.

@azr
Copy link
Contributor Author

azr commented Mar 12, 2019

Yes I wouldn't mind with all these 103 commits ! >.< squash and merge from github ? or do you think I should rather squash & force push ?

@SwampDragons
Copy link
Contributor

I prefer squash and force push but I think that's entirely a matter of taste.

* removed packer.Cache and references since packer.Cache is never used except in the download step. The download step now uses the new func packer.CachePath(targetPath) for this, the behavior is the same.
* removed download code from packer that was reimplemented into the go-getter library: progress bar, http download restart, checksuming from file, skip already downloaded files, symlinking, make a download cancellable by context.
* on windows if packer is running without symlinking rights and we are getting a local file, the file will be copied instead to avoid errors.
* added unit tests for step_download that are now CI tested on windows, mac & linux.
* files are now downloaded under cache dir `sha1(filename + "?checksum=" + checksum) + file_extension`
* since the output dir is based on the source url and the checksum, when the checksum fails, the file is auto deleted.
* a download file is protected and locked by a file lock,
* updated docs
* updated go modules and vendors
@azr
Copy link
Contributor Author

azr commented Mar 13, 2019

Here we go ! I think it's nice to have a single commit also because it allowed me to describe everything happening in there 🙂

@SwampDragons SwampDragons merged commit e4a189c into master Mar 13, 2019
@azr azr deleted the gogetter branch March 14, 2019 09:16
@azr
Copy link
Contributor Author

azr commented Mar 14, 2019

yay ! 😄

@ghost
Copy link

ghost commented Mar 29, 2020

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.

If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@ghost ghost locked and limited conversation to collaborators Mar 29, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
7 participants