Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cli,clusterversion: high cluster version count breaks debug zips #77579

Closed
irfansharif opened this issue Mar 9, 2022 · 11 comments · Fixed by #77637
Closed

cli,clusterversion: high cluster version count breaks debug zips #77579

irfansharif opened this issue Mar 9, 2022 · 11 comments · Fixed by #77637
Assignees
Labels
branch-master Failures and bugs on the master branch. C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. release-blocker Indicates a release-blocker. Use with branch-release-2x.x label to denote which branch is blocked.

Comments

@irfansharif
Copy link
Contributor

irfansharif commented Mar 9, 2022

Describe the problem

#77337 introduces two cluster versions, and observed in CI that doing so partially breaks debug zip collection of the system tables.

To Reproduce

To repro, apply:

diff --git i/pkg/clusterversion/cockroach_versions.go w/pkg/clusterversion/cockroach_versions.go
index 4bedf909ef..7945ca077e 100644
--- i/pkg/clusterversion/cockroach_versions.go
+++ w/pkg/clusterversion/cockroach_versions.go
@@ -512,7 +512,7 @@ var versionsSingleton = keyedVersions{
 	},
 	{
 		Key:     RowLevelTTL,
-		Version: roachpb.Version{Major: 21, Minor: 2, Internal: 88},
+		Version: roachpb.Version{Major: 21, Minor: 2, Internal: 92},
 	},
 
 	// *************************************************

Spin up a cockroach shell and run the following:

> SELECT *, to_hex(value::bytes) as hex_value FROM system.settings LIMIT 5;
ERROR: bytea encoded value ends with escape character

This is the query we use to populate debug zips:

"system.settings": "SELECT *, to_hex(value::bytes) as hex_value FROM system.settings",

Expected behavior

For the above not to fail. Because it does, debug zips will fail to capture system.settings data.

+cc @ajwerner who touched this last in #70498. I'm not sure who a better owner is. Server?

Jira issue: CRDB-13662

@irfansharif irfansharif added the C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. label Mar 9, 2022
@blathers-crl blathers-crl bot added the T-server-and-security DB Server & Security label Mar 9, 2022
@irfansharif irfansharif added the release-blocker Indicates a release-blocker. Use with branch-release-2x.x label to denote which branch is blocked. label Mar 9, 2022
@blathers-crl
Copy link

blathers-crl bot commented Mar 9, 2022

Hi @irfansharif, please add branch-* labels to identify which branch(es) this release-blocker affects.

🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is otan.

@irfansharif irfansharif added the branch-master Failures and bugs on the master branch. label Mar 9, 2022
@knz
Copy link
Contributor

knz commented Mar 10, 2022

When someone works on this, I'll gladly review the PR.

@mwang1026 mwang1026 removed the T-server-and-security DB Server & Security label Mar 10, 2022
@ajwerner
Copy link
Contributor

The root cause here is interesting, and sort of terrible: we are storing bytes data in a string column in this system table.

@knz
Copy link
Contributor

knz commented Mar 10, 2022

@ajwerner so this issue is not really obs inf then, it's more of a SQL / shared systems thing?

@ajwerner
Copy link
Contributor

I think there's two facets here. We're using a STRING column but storing arbitrary bytes in it, that seems to cause problems when casting that field to bytea. We probably want a short-term fix to find a way to directly encode the underlying bytes to hex. I don't know how to represent that in SQL.

  1. Decide whether we want to do something about the fact that we're storing raw bytes in a STRING column, which should be a shared systems thing
  2. Work around the fact that we're doing this to make debug zip generation robust to the current state of the world.

I think the second issue does belong here.

@knz
Copy link
Contributor

knz commented Mar 10, 2022

We probably want a short-term fix to find a way to directly encode the underlying bytes to hex. I don't know how to represent that in SQL

Ok, that seems hard. I can see two different ways:

  • don't do it server-side via to_hex, instead do it client-side in the zip code.
  • cast the string to JSONB, then hex-encode the entire JSONB value

@knz
Copy link
Contributor

knz commented Mar 10, 2022

cc @rimadeodhar for triage

@ajwerner
Copy link
Contributor

I could see us adding a new builtin that just interprets a STRING as bytes directly.

@knz
Copy link
Contributor

knz commented Mar 10, 2022

I'd like a solution that's guaranteed to work on previous-version clusters too

@dhartunian dhartunian self-assigned this Mar 10, 2022
@dhartunian
Copy link
Collaborator

to_hex can take a string argument. Is there a reason we can't do that here?

root@:26257/defaultdb> SELECT *, to_hex(value::bytes) as hex_value FROM system.settings;
              name              |                value                 |        lastUpdated         | valueType |                                hex_value
--------------------------------+--------------------------------------+----------------------------+-----------+---------------------------------------------------------------------------
  admission.kv.enabled          | true                                 | 2022-02-28 15:22:45.659777 | b         | 74727565
  cluster.secret                | 51358315-f723-47c8-bc28-b1e0da6254ff | 2022-02-03 20:18:51.936263 | s         | 35313335383331352d663732332d343763382d626332382d623165306461363235346666
  diagnostics.reporting.enabled | true                                 | 2022-02-03 20:18:51.772724 | b         | 74727565
(3 rows)
(error encountered after some results were delivered)
ERROR: bytea encoded value ends with escape character
SQLSTATE: 22025

root@:26257/defaultdb> SELECT *, to_hex(value) as hex_value FROM system.settings;
              name              |                value                 |        lastUpdated         | valueType |                                hex_value
--------------------------------+--------------------------------------+----------------------------+-----------+---------------------------------------------------------------------------
  admission.kv.enabled          | true                                 | 2022-02-28 15:22:45.659777 | b         | 74727565
  cluster.secret                | 51358315-f723-47c8-bc28-b1e0da6254ff | 2022-02-03 20:18:51.936263 | s         | 35313335383331352d663732332d343763382d626332382d623165306461363235346666
  diagnostics.reporting.enabled | true                                 | 2022-02-03 20:18:51.772724 | b         | 74727565
  version                       | \x12\b\b\x15\x10\x02\x18\x00 \\      | 2022-03-10 19:51:48.770627 | m         | 1208081510021800205c
(4 rows)


Time: 1ms total (execution 1ms / network 0ms)


@ajwerner
Copy link
Contributor

🙌 that's the ticket!

craig bot pushed a commit that referenced this issue Mar 11, 2022
77244: server: add HSTS support via cluster setting r=knz a=dhartunian

This commit adds a boolean cluster setting:
`server.hsts.enabled`
which, when enabled attaches standard HSTS headers to http requests
originating from CRDB nodes. The headers looke like this:
`Strict-Transport-Security: max-age=31536000`

When this header is present, most web browsers will automatically
upgrade all HTTP connections to HTTPS and *remember* that setting until
the expiry of 1 year defined in the header.

*Important*: Careless enabling of this feature can result in broken
access to the DB Console in the browser. If we instruct the browser to
always use HTTPS without having a valid TLS configuration, the browser
will no longer fallback to HTTP until the HSTS setting is manually
cleared.

Resolves #77224

Release justification: low-risk update; opt-in security enhancement

Release note (security update, ops change, ui change): users can enable
HSTS headers to be set on all HTTP requests which force browsers to
upgrade to HTTPS without a redirect. This is controlled by setting the
`server.hsts.enabled` cluster setting to true which is false by default.

77637: cli: debug zip treats version column as string r=ajwerner,irfansharif a=dhartunian

Previously, we converted the version column in `system.settings` to
hexadecimal by converting it to bytes first. This led to a problem once
the version got higher than a certain number.

Resolves #77579

Release justification: low risk bug fix to debug zip generation
Release note: None

77643: teamcity-trigger: fix up timeout argument r=rail a=rickystewart

The `TESTTIMEOUT` argument here is an artifact from the original version
of this job, which ran tests with `-test.timeout $TESTTIMEOUT`. In
`go test` world this configures the timeout *per package*. However, for
Bazel we want the `--test_timeout` to be higher than the `maxtime` that
we give to `stress`. So here we just add an extra minute to the
`maxtime` (which is the same thing we do in `dev`).

Without this change, longer tests can time out at the Bazel level before
they would time out at the `stress` level (see #77120).

Release justification: Fix `stress` nightly job
Release note: None

Co-authored-by: David Hartunian <[email protected]>
Co-authored-by: Ricky Stewart <[email protected]>
@craig craig bot closed this as completed in 4a61a1b Mar 11, 2022
@craig craig bot closed this as completed in #77637 Mar 11, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
branch-master Failures and bugs on the master branch. C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. release-blocker Indicates a release-blocker. Use with branch-release-2x.x label to denote which branch is blocked.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants