Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Proof-of-concept draft for teleport-update binary implementation #46357

Closed
wants to merge 13 commits into from

Conversation

sclevine
Copy link
Member

@sclevine sclevine commented Sep 6, 2024

This PR contains an initial draft of the teleport-update binary described in #40190.

Not all functionality is implemented, only:

  • Atomic downloading and extraction of the Teleport tgz to /var/lib/teleport/versions/X.Y.Z, with disk space checks, checksum verification, and protection against power loss scenarios.
  • Atomic configuration management of updates.yaml

This PR will not be merged as-is, and is only open for early review of the approach.

I plan to break this download into separate, tested PRs if this initial version looks good.

root@71aeb39eabec:/# ./teleport/teleport-update enable --proxy levine.teleport.sh --debug
2024-09-06T20:09:50Z DEBU  Attempting request to Proxy web api method:GET host:levine.teleport.sh path:/webapi/find webclient/webclient.go:134
2024-09-06T20:09:51Z INFO [UPDATER]   Downloading Teleport tarball path:/tmp/teleport-update-1578638652 size:153512064 teleport-update/versions.go:210
2024-09-06T20:10:01Z INFO  extracted tarball into /var/lib/teleport/versions/16.2.0: 582 files, 93 dirs (2.678702627s)

Comment on lines 175 to +180
endpoint := fmt.Sprintf("https://%s/webapi/find", cfg.ProxyAddr)

if cfg.Group != "" {
endpoint = fmt.Sprintf("%s?group=%s", endpoint, cfg.Group)
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The group content is not urlencoded and the query building is fragile. The url lib allows to do this safely.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, totally agree, was just using this to test

Comment on lines +50 to +63
func (tv *TeleportVersion) Remove(version string) error {
versionPath := filepath.Join(tv.VersionsDir, version)
sumPath := filepath.Join(versionPath, checksumType)

// invalidate checksum first, to protect against partially-removed
// directory with valid checksum.
if err := os.Remove(sumPath); err != nil {
return trace.Wrap(err)
}
if err := os.RemoveAll(versionPath); err != nil {
return trace.Wrap(err)
}
return nil
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function should not allow to delete the current active version.

sum, err := hex.DecodeString(raw)
if err != nil {
plog.WarnContext(ctx, "corrupt checksum detected", "size", n, "checksum", raw)
return nil, nil
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Returning nil while we failed to read the checksum is not consistent with the other error return paths of the function: e.g. we can't open the file. If the goal is to continue the execution, it would be clearer to return an error and handle it.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree, will add a typed error in the separate PR for this function.

}
raw := buf.String()
if n != 64 {
plog.WarnContext(ctx, "unexpected checksum size", "size", n, "checksum", raw)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

An unexpected size should cause an immediate failure, there is no need to try to decode a buffer that has the wrong length.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will cover this with a typed error (same as below)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: might be replaced with io.CopyN which actually does the same, and returns EOF if n is less than expected

return nil, nil, trace.Errorf("size of download (%d bytes) exceeds available disk space (%d bytes)", resp.ContentLength, free)
}

// Calculate checksum concurrently with download.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the motivation behind checksuming as we download versus reading the file back from the fs?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No downside to it, and it avoids needed to re-read the file from disk

@sclevine sclevine added the no-changelog Indicates that a PR does not require a changelog entry label Sep 10, 2024
@sclevine sclevine changed the title Draft of teleport-update binary implementation Proof-of-concept draft for teleport-update binary implementation Sep 10, 2024
@sclevine
Copy link
Member Author

Completed by #47565

@sclevine sclevine closed this Oct 15, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
no-changelog Indicates that a PR does not require a changelog entry
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants