Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[teleport-update] Set umask 0022 for teleport-update to avoid errors on enable #52725

Merged
merged 14 commits into from
Mar 4, 2025

Conversation

sclevine
Copy link
Member

@sclevine sclevine commented Mar 3, 2025

If the local shell has an unusual umask, teleport-update will refuse to complete teleport-update enable:

2025-03-03T20:24:36.890Z INFO [UPDATER]   Initiating installation. target_version:17.3.0-dev.hugoau.7+Enterprise active_version: agent/updater.go:362
2025-03-03T20:24:37.088Z INFO [UPDATER]   Downloading Teleport tarball. url:https://cdn.cloud.gravitational.io/teleport-ent-v17.3.0-dev.hugoau.7-linux-arm64-bin.tar.gz size:188521662 agent/installer.go:295
2025-03-03T20:24:37.876Z INFO [UPDATER]   Downloading file:teleport-ent-v17.3.0-dev.hugoau.7-linux-arm64-bin.tar.gz progress:20% agent/logger.go:49
2025-03-03T20:24:38.546Z INFO [UPDATER]   Downloading file:teleport-ent-v17.3.0-dev.hugoau.7-linux-arm64-bin.tar.gz progress:40% agent/logger.go:49
2025-03-03T20:24:39.051Z INFO [UPDATER]   Downloading file:teleport-ent-v17.3.0-dev.hugoau.7-linux-arm64-bin.tar.gz progress:60% agent/logger.go:49
2025-03-03T20:24:39.683Z INFO [UPDATER]   Downloading file:teleport-ent-v17.3.0-dev.hugoau.7-linux-arm64-bin.tar.gz progress:80% agent/logger.go:49
2025-03-03T20:24:40.215Z INFO [UPDATER]   Downloading file:teleport-ent-v17.3.0-dev.hugoau.7-linux-arm64-bin.tar.gz progress:100% agent/logger.go:49
2025-03-03T20:24:40.215Z INFO [UPDATER]   Download complete. duration:3.227484932s size:188521662 agent/installer.go:323
2025-03-03T20:24:43.273Z INFO [UPDATER]   Extracting Teleport tarball. path:/opt/teleport/default/versions/17.3.0-dev.hugoau.7_ent size:761835520 agent/installer.go:343
2025-03-03T20:24:48.235Z WARN [UPDATER]   Found unexpected non-executable file name:fdpass-teleport agent/validate.go:96
2025-03-03T20:24:48.236Z WARN [UPDATER]   Found unexpected non-executable file name:tbot agent/validate.go:96
2025-03-03T20:24:48.236Z WARN [UPDATER]   Found unexpected non-executable file name:tctl agent/validate.go:96
2025-03-03T20:24:48.236Z WARN [UPDATER]   Found unexpected non-executable file name:teleport agent/validate.go:96
2025-03-03T20:24:48.236Z WARN [UPDATER]   Found unexpected non-executable file name:teleport-update agent/validate.go:96
2025-03-03T20:24:48.236Z WARN [UPDATER]   Found unexpected non-executable file name:tsh agent/validate.go:96
2025-03-03T20:24:48.236Z ERRO [UPDATER]   Command failed. error:[
ERROR REPORT:
Original Error: *errors.errorString no binaries available to link
Stack Trace:
github.com/gravitational/teleport/lib/autoupdate/agent/insta

This PR changes the umask of the whole teleport-update process to ensure all operations are consistent with teleport-update when run by systemd (which always uses 0022). This would be undesirable for most CLIs, but teleport-update should never create files as a normal user, or create files in normal user directories.

As a workaround until this PR is merged, users can run umask 0022 before teleport-update enable.

--

changelog: Allow teleport-update to be used in shells that set a restrictive umask

--

The teleport-update binary will be used to enable, disable, and trigger automatic Teleport agent updates. The new auto-updates system manages a local installation of the cluster-specified version of Teleport stored in /opt/teleport.

RFD: #47126
Goal (internal): https://github.com/gravitational/cloud/issues/11856

@sclevine sclevine requested review from hugoShaka and vapopov March 3, 2025 21:30
@github-actions github-actions bot requested review from codingllama and rudream March 3, 2025 21:30
@fheinecke
Copy link
Contributor

Is this really something that we should be setting? A umask value of 0022 is pretty standard, and if it's not 0022 then having it set to some other value is probably intentional.

@sclevine sclevine requested a review from espadolini March 3, 2025 21:36
@sclevine
Copy link
Member Author

sclevine commented Mar 3, 2025

Is this really something that we should be setting? A umask value of 0022 is pretty standard, and if it's not 0022 then having it set to some other value is probably intentional.

I'm definitely on the fence about this. I went this direction because teleport-update never touches user files, just files in /opt and /usr/local, and it's behavior should always be consistent with the systemd service behavior. I could modify utils.Extract and many other places where teleport-update writes files to os.Chmod after creating, but I worry that future additions to teleport-update might miss this.

@fheinecke
Copy link
Contributor

There's also some risk that if (not necessarily now but down the line as the code changes), an attacker may be able to exploit the updater in some way to create executable files. This would probably be why an admin would remove the x bit from the umask in the first place. If this is the case, this would essentially be the teleport-updater saying "I know better/more about your environment than you" to the sysadmin, which is pretty much never going to be the case.

Personally I'm of the opinion that if most systems default to 0022, then we should build to be compatible with that. If a sysadmin changes the default, then it's up to them (for all applications) to ensure compatibility.

@sclevine
Copy link
Member Author

sclevine commented Mar 3, 2025

Personally I'm of the opinion that if most systems default to 0022, then we should build to be compatible with that. If a sysadmin changes the default, then it's up to them (for all applications) to ensure compatibility.

This is an argument for not fixing the bug. We could just fail on unmask != 0022, with a more clear error.

For prior-art though, @espadolini points out that other package management tools do this:
https://github.com/rpm-software-management/rpm/blob/aab013a8a2c83e088f1c32ccabb44613afebb357/lib/transaction.cc#L1726-L1727

@sclevine
Copy link
Member Author

sclevine commented Mar 3, 2025

I can definitely add a loud warning when we change the umask, if the old umask was different. Would that reduce concerns about unexpected behavior?

@fheinecke
Copy link
Contributor

fheinecke commented Mar 3, 2025

For prior-art though, @espadolini points out that other package management tools do this

That's great, and is probably sufficient to set this, if we and customers are considering the updater a package manager. We had a discussion awhile back as to whether or not the new updater system is a package manager system. It was de facto decided that the updater is not a package manager, otherwise, we should be placing binaries /usr/bin.

Would that reduce concerns about unexpected behavior?

It would, though not outright eliminate them. If we're considering the updater system to be a package manager system, then I think a warning + set would be fine. If not, IMO we should either warn and either continue or fail, and leave it to the sysadmin to.

On a more technical level, if we do want to change the mask, then limiting the scope of the umask change to specific logic (rather than the entire program) is probably the safest way to go.

@sclevine
Copy link
Member Author

sclevine commented Mar 3, 2025

Hm, we could fail unless you pass --allow-ignore-umask. Then I could move the umask change into the subcommands in main.go that write files, so that status doesn't require it. Any preference between warning and proceeding vs. requiring a flag? I do worry that the flag makes the UX worse, so I lean towards the first option.

@fheinecke
Copy link
Contributor

fheinecke commented Mar 3, 2025

Maybe a flag + env (with either one allowing for the umask override)? My concern is that the teleport-update will end up buried in a script, ansible playbook, etc. and nobody will see it, and assume that their changed umask is covering everything. Needing to explicitly tell the program to override it ensures that sysadmins are aware of what's going on.

Completely get that this user interface is a bit worse, but IMO if an admin is changing the umask then they're accepting that things will not always work properly the first time.

Of course I am just one datapoint. If you've heard otherwise from customers then we should probably defer to them.

@sclevine
Copy link
Member Author

sclevine commented Mar 3, 2025

@fheinecke how does that look?

@sclevine sclevine requested review from fheinecke and removed request for codingllama and rudream March 3, 2025 23:46
@sclevine
Copy link
Member Author

sclevine commented Mar 4, 2025

I reached out the customer who reported this issue for feedback. Requiring the flag on each invocation would be disruptive UX, so I removed it.

@sclevine sclevine requested a review from zmb3 March 4, 2025 00:20
Copy link
Collaborator

@zmb3 zmb3 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm good with dropping the flag so long as we keep the warning message.

Just because it was decided that the updater is theoretically not a package manager doesn't mean it doesn't perform tasks similar to that of a package manager and doesn't mean we can't take inspiration from prior art.

@sclevine sclevine added this pull request to the merge queue Mar 4, 2025
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Mar 4, 2025
Copy link
Contributor

@fheinecke fheinecke left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good enough to ship to fix customer breakage. Would appreciate if you reviewed the comments after the PR, and a fix is less urgently needed.

I reached out the customer who reported this issue for feedback. Requiring the flag on each invocation would be disruptive UX, so I removed it.

Sounds great, thanks for asking them! If that's what customers want, then we should do it that way.

@@ -184,6 +184,13 @@ func Run(args []string) int {
return 1
}

// Set required umask for most commands, and warn loudly if it changes.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this need to be set for all commands or just ones that install or update? I'm not familiar enough with the tool to know where this is and is not needed.

Alternatively, can this more tightly wrap the actual filesystem write calls? This would reduce the likelihood of a vulnerability in the future where an attacker is able to use the updater to write executable files.

Copy link
Member Author

@sclevine sclevine Mar 4, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this need to be set for all commands or just ones that install or update?

It needs to be set for all commands that must run as root and write system files, so I scoped it to just those commands.
I updated the godoc to clarify this as well. We can make sure that it doesn't get set for user commands.

(This is similar to systemd, in a way. Systemd system services always get Umask = 0022 unless they explicitly define their own umask. There is no way for root to change that behavior. However, user services respect the user umask.)

Alternatively, can this more tightly wrap the actual filesystem write calls?

This is tricky, because the umask is process-global, and changing it impacts all goroutines in a thread-unsafe manner. I agree with your point about syscall.Umask having important security implications, and I want to avoid calling it in importable code that could impact other goroutines by silently modifying their file permissions.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This might not be feasible, but could all functions that could be feasibly impacted by this pass around a mutex, where if a given function held the lock, it could set/unset the umask? Or maybe there could be two "groups" of functions - one privileged and one not - that would share this mutex.

@sclevine sclevine enabled auto-merge March 4, 2025 05:59
@sclevine sclevine added this pull request to the merge queue Mar 4, 2025
Merged via the queue into master with commit 2ce5577 Mar 4, 2025
40 checks passed
@sclevine sclevine deleted the sclevine/teleport-update-umask branch March 4, 2025 17:05
@public-teleport-github-review-bot

@sclevine See the table below for backport results.

Branch Result
branch/v17 Create PR

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants