Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Errors after upgrade to 22.1.1 #82576

Closed
mputau opened this issue Jun 8, 2022 · 9 comments
Closed

Errors after upgrade to 22.1.1 #82576

mputau opened this issue Jun 8, 2022 · 9 comments
Assignees
Labels
branch-release-22.1 Used to mark GA and release blockers, technical advisories, and bugs for 22.1 C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. C-technical-advisory Caused a technical advisory O-community Originated from the community sync-me-7 sync-me-8 T-sql-foundations SQL Foundations Team (formerly SQL Schema + SQL Sessions) X-blathers-triaged blathers was able to find an owner

Comments

@mputau
Copy link

mputau commented Jun 8, 2022

Describe the problem

I have upgraded my cockroach from 21.2.12 to 22.1.1. After that I have met an issue

SHOW SEQUENCES
ERROR: internal error: relation "requisition_reference_seq" (12605): depended-on-by relation "requisition" (12576) does not have a column with ID 0
SQLSTATE: XX000
DETAIL: stack trace:
github.com/cockroachdb/cockroach/pkg/sql/catalog/tabledesc/validate.go:311: validateInboundTableRef()
github.com/cockroachdb/cockroach/pkg/sql/catalog/tabledesc/validate.go:188: ValidateCrossReferences()
github.com/cockroachdb/cockroach/pkg/sql/catalog/internal/validate/validate.go:77: func2()
github.com/cockroachdb/cockroach/pkg/sql/catalog/internal/validate/validate.go:165: validateDescriptorsAtLevel()
github.com/cockroachdb/cockroach/pkg/sql/catalog/internal/validate/validate.go:72: Validate()
github.com/cockroachdb/cockroach/pkg/sql/catalog/nstree/catalog.go:144: Validate()
github.com/cockroachdb/cockroach/pkg/sql/catalog/descs/kv_descriptors.go:279: getAllDescriptors()
github.com/cockroachdb/cockroach/pkg/sql/catalog/descs/collection.go:284: GetAllDescriptors()
github.com/cockroachdb/cockroach/pkg/sql/information_schema.go:2385: forEachTableDescWithTableLookupInternal()
github.com/cockroachdb/cockroach/pkg/sql/information_schema.go:2339: forEachTableDescWithTableLookup()
github.com/cockroachdb/cockroach/pkg/sql/information_schema.go:2267: forEachTableDesc()
github.com/cockroachdb/cockroach/pkg/sql/information_schema.go:1114: func156()
github.com/cockroachdb/cockroach/pkg/sql/virtual_schema.go:567: 1()
github.com/cockroachdb/cockroach/pkg/sql/virtual_table.go:127: func3()
github.com/cockroachdb/cockroach/pkg/util/stop/stopper.go:494: func2(
GOROOT/src/runtime/asm_amd64.s:1581: goexit()

HINT: You have encountered an unexpected error.

Please check the public issue tracker to check whether this problem is
already tracked. If you cannot find it there, please report the error
with details by creating a new issue.

If you would rather not post publicly, please contact us directly
using the support form.

We appreciate your feedback.

To Reproduce

In database we have:

CREATE SEQUENCE  IF NOT EXISTS public.requisition_reference_seq MINVALUE 1
ALTER TABLE public.requisition ADD reference BIGINT DEFAULT nextval('requisition_reference_seq') NOT NULL

Expected behavior
No error messages

Environment:

  • CockroachDB version 22.1.1 (single node)
  • Server OS: Ubuntu (WSL)
  • Client app cockroach sql and JDBC

Jira issue: CRDB-16567

@mputau mputau added the C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. label Jun 8, 2022
@blathers-crl
Copy link

blathers-crl bot commented Jun 8, 2022

Hello, I am Blathers. I am here to help you get the issue triaged.

Hoot - a bug! Though bugs are the bane of my existence, rest assured the wretched thing will get the best of care here.

I have CC'd a few people who may be able to assist you:

  • @cockroachdb/sql-schema (found keywords: ALTER TABLE)

If we have not gotten back to your issue within a few business days, you can try the following:

  • Join our community slack channel and ask on #cockroachdb.
  • Try find someone from here if you know they worked closely on the area and CC them.

🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is otan.

@blathers-crl blathers-crl bot added O-community Originated from the community X-blathers-triaged blathers was able to find an owner labels Jun 8, 2022
@ajwerner
Copy link
Contributor

ajwerner commented Jun 8, 2022

Has this cluster been upgraded from many versions ago? We know of such a problem when creating this relation in 19.2 or earlier, but I don't think we know of such a corruption bug from later creations. Is this such a case?

@blathers-crl blathers-crl bot added the T-sql-schema-deprecated Use T-sql-foundations instead label Jun 8, 2022
@ajwerner
Copy link
Contributor

ajwerner commented Jun 8, 2022

FWIW, we can happily help you to repair this situation. Sorry you've hit it!

@mputau
Copy link
Author

mputau commented Jun 9, 2022

Has this cluster been upgraded from many versions ago? We know of such a problem when creating this relation in 19.2 or earlier, but I don't think we know of such a corruption bug from later creations. Is this such a case?

I have performed upgrade from 21.1.3 -> 21.2.12 -> 22.1.1. Cluster - single node.
Many months ago we upgraded from 20.2 ->21.1.3. But this is our test env, so we often drop and recreate database and relations. So this relation and database was created in 21.1.3

@minhdang241
Copy link

minhdang241 commented Jun 13, 2022

Any update? I have the same issue when updating to version 22.1.1. I neither can rollback to the previous version due to this error cockroach version v21.2 is incompatible with data in store <no-attributes>=/cockroach/cockroach-data;

@ajwerner
Copy link
Contributor

Yeah, we'll provide a query to do a repair and patch the next 22.1 point release. That's as good as I think we can do. Sorry you hit this. Working on it now.

@ajwerner
Copy link
Contributor

Okay, here's the repair query for the case where you have exactly the error reported above. If in doubt, don't run it and ask for advice here. The query will find the corruption where we failed to properly record the column ID for the sequence reference. It should only be run on version 22.1.

  WITH tables AS (
                SELECT *
                  FROM (
                        SELECT id,
                               crdb_internal.pb_to_json(
                                'cockroach.sql.sqlbase.Descriptor',
                                descriptor
                               )->'table' AS tab
                          FROM system.descriptor
                       )
                 WHERE tab IS NOT NULL
              ),
       columns_using_sequence_ids AS (
                                    SELECT table_id,
                                           (c->'id')::INT8 AS column_id,
                                           json_array_elements(c->'usesSequenceIds')::INT8 AS seq_id
                                      FROM (
                                            SELECT id AS table_id, c
                                              FROM tables,
                                                   ROWS FROM (json_array_elements(tab->'columns')) AS t
                                                                                                    (c)
                                           )
                                     WHERE (c->'usesSequenceIds') IS NOT NULL
                                       AND json_array_length(c->'usesSequenceIds') = 1
                                  ),
       sequences_with_missing_depended_on_by AS (
                                                SELECT seq_id, (dep->>'id')::INT8 AS table_id, ord, dep
                                                  FROM (
                                                        SELECT id AS seq_id, dep, ord - 1 AS ord
                                                          FROM tables,
                                                               ROWS FROM (
                                                                json_array_elements(tab->'dependedOnBy')
                                                               ) WITH ORDINALITY AS t (dep, ord)
                                                         WHERE (tab->'sequenceOpts') IS NOT NULL
                                                           AND EXISTS(
                                                                SELECT *
                                                                  FROM ROWS FROM (
                                                                        json_array_elements(
                                                                            tab->'dependedOnBy'
                                                                        )
                                                                       ) AS t (dep)
                                                                 WHERE dep->'columnIds' @> '[0]'::JSONB
                                                               )
                                                       )
                                                 WHERE (dep->>'byId')::BOOL
                                             ),
       depended_on_by_entries AS (
                                  SELECT s.seq_id, ord, json_agg(t.column_id) AS column_ids
                                    FROM columns_using_sequence_ids AS t
                                    JOIN sequences_with_missing_depended_on_by AS s ON t.table_id
                                                                                       = s.table_id
                                                                                   AND t.seq_id
                                                                                       = s.seq_id
                                GROUP BY s.seq_id, s.table_id, ord
                              ),
       updated_entries AS (
                        SELECT seq_id, ord, json_set(d, ARRAY['columnIds'], column_ids) AS d
                          FROM depended_on_by_entries
                          JOIN tables ON seq_id = tables.id
                          JOIN ROWS FROM (
                                json_array_elements(tab->'dependedOnBy')
                               ) WITH ORDINALITY AS jae (d, idx) ON ord = idx - 1
                       ),
       depended_on_by_arrs AS (
                              SELECT seq_id, json_agg(d ORDER BY ord ASC) AS depended_on_by
                                FROM updated_entries
                            GROUP BY seq_id
                           )
SELECT crdb_internal.unsafe_upsert_descriptor(
        seq_id,
        crdb_internal.json_to_pb(
            'cockroach.sql.sqlbase.Descriptor',
            json_build_object(
                'table',
                json_remove_path(
                    json_set(
                        json_set(tab, ARRAY['dependedOnBy'], depended_on_by),
                        ARRAY['version'],
                        ((tab->>'version')::INT8 + 1)::STRING::JSONB
                    ),
                    ARRAY['modificationTime']
                )
            )
        ),
        true
       )
  FROM depended_on_by_arrs JOIN tables ON id = seq_id;

@ajwerner
Copy link
Contributor

@minhdang241 your issue is not the same as this one. We communicated in the community slack regarding your issue. I'll do something for that too.

craig bot pushed a commit that referenced this issue Jun 13, 2022
82724: spanconfig: reset job run_stats to avoid job system backoff r=[irfansharif,adityamaru] a=stevendanna

If the coordinator of the span configuration job dies, another node
will adopt the job. However, when doing so it will bump the num_runs
run stat. As this number increases, the job system will delay future
resumptions of this job.

We solve this here by resetting the job's run_stats at the beginning
of the job.

We've yet again handled this in the job directly rather than adjusting
the behavior of the job system. In this case, my justification is that
this solution is fit for backporting.

Fixes #82689

Release note (bug fix): Fix a bug where the startup of an internal
component after a server restart could result in the delayed
application of zone configuration.

82814: backupccl: small error message nitpicks r=adityamaru a=stevendanna

I noticed these when reading through the RESTORE code.

Release note: None

82833: sql/catalog/tabledesc: permit zero-valued column IDs in DependedOnBy r=ajwerner a=ajwerner

In 21.1 there was a bug whereby we would store a 0-value column ID in the
sequence's DependedOnBy because we'd add the depedency before allocating
an ID to the column. This bug was fixed in 21.2. Below is a reproduction
I've used to play with this bug:

```bash
roachprod wipe local
roachprod stage local release v21.1.3
roachprod run local -- mv cockroach cockroach-v21.1.3
roachprod stage local release v21.2.10
roachprod run local -- mv cockroach cockroach-v21.2.10
roachprod stage local release v22.1.1
roachprod run local -- mv cockroach cockroach-v22.1.1
roachprod start local --binary cockroach-v21.1.3
roachprod sql local -- -e "create table t1( i int primary key);
                           create table t2(i int primary key);
                           create sequence s1;
                           create sequence s2;
                           ALTER TABLE t1 ADD c1 BIGINT DEFAULT nextval('s1') NOT NULL;
                           ALTER TABLE t1 ADD c2 BIGINT DEFAULT nextval('s2') NOT NULL;
                           ALTER TABLE t2 ADD c1 BIGINT DEFAULT nextval('s1') NOT NULL;"
roachprod stop local
roachprod start local --binary cockroach-v21.2.10
while ! { roachprod sql local -- -e 'show cluster setting version' | grep 21.2 ; }; do sleep 1; done

roachprod sql local -- -e "alter table t1 add column c3 int default nextval('s1');
                           create table t3 (i int primary key);
                           alter table t3 add column c1 int default nextval('s1');"

roachprod stop local
roachprod start local --binary cockroach-v22.1.1
while ! { roachprod sql local -- -e 'show cluster setting version' | grep 22.1 ; }; do sleep 1; done
```

For now, for the rest of 22.1 we'll let this slide. As follow-up work, we'll
perform a migration to repair this situation and add back this validation once
the migration has been performed.

Relates to #82576.

Release note (bug fix): In earlier 22.1 releases of cockroach, added validation
could cause problems for descriptors which carried invalid backreferences due
to an earlier bug in 21.1. This stricter validation could result in a variety
of query failures. This patch weakens the validation to permit the corruption
as we know it. A subsequent patch in 22.2 will be created to repair the invalid
reference.

Co-authored-by: Steven Danna <[email protected]>
Co-authored-by: Andrew Werner <[email protected]>
@ajwerner
Copy link
Contributor

Fixed by #82859.

Xiang-Gu added a commit to Xiang-Gu/cockroach that referenced this issue Aug 16, 2022
This upgrade attempts to update invalid column IDs in sequence back
references, if any, on a best-effort basis.

The context of the need for such an upgrade can be found in cockroachdb#82576.
The summary there is bugs in prior versions might cause sequence
descriptor corruption where their back references might contain a column
ID 0. We ought to figure out what the actual column ID is and update
such invalid column IDs.
Xiang-Gu added a commit to Xiang-Gu/cockroach that referenced this issue Aug 18, 2022
This upgrade attempts to update invalid column IDs in sequence back
references, if any, on a best-effort basis.

The context of the need for such an upgrade can be found in cockroachdb#82576.
The summary there is bugs in prior versions might cause sequence
descriptor corruption where their back references might contain a column
ID 0. We ought to figure out what the actual column ID is and update
such invalid column IDs.
Xiang-Gu added a commit to Xiang-Gu/cockroach that referenced this issue Aug 18, 2022
This upgrade attempts to update invalid column IDs in sequence back
references, if any, on a best-effort basis.

The context of the need for such an upgrade can be found in cockroachdb#82576.
The summary there is bugs in prior versions might cause sequence
descriptor corruption where their back references might contain a column
ID 0. We ought to figure out what the actual column ID is and update
such invalid column IDs.
Xiang-Gu added a commit to Xiang-Gu/cockroach that referenced this issue Aug 22, 2022
This upgrade attempts to update invalid column IDs in sequence back
references, if any, on a best-effort basis.

The context of the need for such an upgrade can be found in cockroachdb#82576.
The summary there is bugs in prior versions might cause sequence
descriptor corruption where their back references might contain a column
ID 0. We ought to figure out what the actual column ID is and update
such invalid column IDs.
Xiang-Gu added a commit to Xiang-Gu/cockroach that referenced this issue Aug 22, 2022
This upgrade attempts to update invalid column IDs in sequence back
references, if any, on a best-effort basis.

The context of the need for such an upgrade can be found in cockroachdb#82576.
The summary there is bugs in prior versions might cause sequence
descriptor corruption where their back references might contain a column
ID 0. We ought to figure out what the actual column ID is and update
such invalid column IDs.
Xiang-Gu added a commit to Xiang-Gu/cockroach that referenced this issue Aug 23, 2022
This upgrade attempts to update invalid column IDs in sequence back
references, if any, on a best-effort basis.

The context of the need for such an upgrade can be found in cockroachdb#82576.
The summary there is bugs in prior versions might cause sequence
descriptor corruption where their back references might contain a column
ID 0. We ought to figure out what the actual column ID is and update
such invalid column IDs.
Xiang-Gu added a commit to Xiang-Gu/cockroach that referenced this issue Aug 23, 2022
This upgrade attempts to update invalid column IDs in sequence back
references, if any, on a best-effort basis.

The context of the need for such an upgrade can be found in cockroachdb#82576.
The summary there is bugs in prior versions might cause sequence
descriptor corruption where their back references might contain a column
ID 0. We ought to figure out what the actual column ID is and update
such invalid column IDs.
Xiang-Gu added a commit to Xiang-Gu/cockroach that referenced this issue Aug 24, 2022
This upgrade attempts to update invalid column IDs in sequence back
references, if any, on a best-effort basis.

The context of the need for such an upgrade can be found in cockroachdb#82576.
The summary there is bugs in prior versions might cause sequence
descriptor corruption where their back references might contain a column
ID 0. We ought to figure out what the actual column ID is and update
such invalid column IDs.
Xiang-Gu added a commit to Xiang-Gu/cockroach that referenced this issue Aug 26, 2022
This upgrade attempts to update invalid column IDs in sequence back
references, if any, on a best-effort basis.

The context of the need for such an upgrade can be found in cockroachdb#82576.
The summary there is bugs in prior versions might cause sequence
descriptor corruption where their back references might contain a column
ID 0. We ought to figure out what the actual column ID is and update
such invalid column IDs.
aadityasondhi pushed a commit to aadityasondhi/cockroach that referenced this issue Aug 26, 2022
This upgrade attempts to update invalid column IDs in sequence back
references, if any, on a best-effort basis.

The context of the need for such an upgrade can be found in cockroachdb#82576.
The summary there is bugs in prior versions might cause sequence
descriptor corruption where their back references might contain a column
ID 0. We ought to figure out what the actual column ID is and update
such invalid column IDs.
aadityasondhi pushed a commit to aadityasondhi/cockroach that referenced this issue Aug 26, 2022
This upgrade attempts to update invalid column IDs in sequence back
references, if any, on a best-effort basis.

The context of the need for such an upgrade can be found in cockroachdb#82576.
The summary there is bugs in prior versions might cause sequence
descriptor corruption where their back references might contain a column
ID 0. We ought to figure out what the actual column ID is and update
such invalid column IDs.
@exalate-issue-sync exalate-issue-sync bot added T-sql-foundations SQL Foundations Team (formerly SQL Schema + SQL Sessions) and removed T-sql-schema-deprecated Use T-sql-foundations instead labels May 10, 2023
@rytaft rytaft added C-technical-advisory Caused a technical advisory branch-release-22.1 Used to mark GA and release blockers, technical advisories, and bugs for 22.1 labels Dec 6, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
branch-release-22.1 Used to mark GA and release blockers, technical advisories, and bugs for 22.1 C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. C-technical-advisory Caused a technical advisory O-community Originated from the community sync-me-7 sync-me-8 T-sql-foundations SQL Foundations Team (formerly SQL Schema + SQL Sessions) X-blathers-triaged blathers was able to find an owner
Projects
None yet
Development

No branches or pull requests

6 participants