Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

lang: New file-hashing functions #20115

Merged
merged 2 commits into from
Jan 25, 2019
Merged

lang: New file-hashing functions #20115

merged 2 commits into from
Jan 25, 2019

Conversation

apparentlymart
Copy link
Contributor

In prior versions, we recommended using hash functions in conjunction with the file function as an idiom for detecting changes to upstream blobs without fetching and comparing the whole blob.

That approach relied on us being able to return raw binary data from file(...). Since Terraform strings pass through intermediate representations that are not binary-safe (e.g. the JSON state), there was
a risk of string corruption in prior versions which we have avoided for 0.12 by requiring that file(...) be used only with UTF-8 text files.

The specific case of returning a string and immediately passing it into another function was not actually subject to that corruption risk, since the HIL interpreter would just pass the string through verbatim, but this is still now forbidden as a result of the stricter handling of file(...).

To avoid breaking these use-cases, here we introduce variants of the hash functions a with file prefix that take a filename for a disk file to hash rather than hashing the given string directly. The configuration upgrade tool also now includes a rule to detect the documented idiom and rewrite it into a single function call for one of these new functions.

This does cause a bit of function sprawl, but that seems preferable to introducing more complex rules for when file(...) can and cannot read binary files, making the behavior of these various functions easier to
understand in isolation. I also considered making functions that take base64 strings and hash that, for use in conjunction with filebase64, but it's not clear that much flexibility is required (so far we've been doing this only to sync upstream blobs with local files) and it would cause some confusion of names given that some of these hash functions already produce base64 and so have base64 in their names. Therefore this new set of file... hash functions felt like the least bad of a set of non-ideal choices.

This fixes #20098, though will also require some documentation updates in providers to show the new idiom (using the relevant file hashing function directly) while also including a callout showing the old idiom for those still using Terraform 0.11.

These all follow the pattern of creating a hash and converting it to a
string using some encoding function, so we can write this implementation
only once and parameterize it with a hash factory function and an encoding
function.

This also includes a new test for the sha512 function, which was
previously missing a test and, it turns out, actually computing sha256
instead.
@apparentlymart apparentlymart added this to the v0.12.0 milestone Jan 25, 2019
@apparentlymart apparentlymart self-assigned this Jan 25, 2019
@apparentlymart apparentlymart requested a review from a team January 25, 2019 01:17
Copy link
Contributor

@mildwonkey mildwonkey left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left one tiny doc note that doesn't block approval.

In prior versions, we recommended using hash functions in conjunction with
the file function as an idiom for detecting changes to upstream blobs
without fetching and comparing the whole blob.

That approach relied on us being able to return raw binary data from
file(...). Since Terraform strings pass through intermediate
representations that are not binary-safe (e.g. the JSON state), there was
a risk of string corruption in prior versions which we have avoided for
0.12 by requiring that file(...) be used only with UTF-8 text files.

The specific case of returning a string and immediately passing it into
another function was not actually subject to that corruption risk, since
the HIL interpreter would just pass the string through verbatim, but this
is still now forbidden as a result of the stricter handling of file(...).

To avoid breaking these use-cases, here we introduce variants of the hash
functions a with "file" prefix that take a filename for a disk file to
hash rather than hashing the given string directly. The configuration
upgrade tool also now includes a rule to detect the documented idiom and
rewrite it into a single function call for one of these new functions.

This does cause a bit of function sprawl, but that seems preferable to
introducing more complex rules for when file(...) can and cannot read
binary files, making the behavior of these various functions easier to
understand in isolation.
@ghost
Copy link

ghost commented Jul 24, 2019

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.

If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@ghost ghost locked and limited conversation to collaborators Jul 24, 2019
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

New File Hashing Errors In Terraform 0.12
2 participants