-
Notifications
You must be signed in to change notification settings - Fork 3.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
sql: add schema support for vector indexing in CREATE TABLE #140090
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice! We probably want Foundations to take a look as well before merging, but
Reviewed 38 of 38 files at r1, all commit messages.
Reviewable status:complete! 1 of 0 LGTMs obtained (waiting on @andy-kimball and @rafiss)
pkg/sql/catalog/catformat/index.go
line 263 at r1 (raw file):
// direction. Since the default direction is ASC, we omit the direction // entirely for inverted/vector index columns. if i < n-1 || index.Type.AllowExplicitDirection() {
Maybe this method should specify that it has to do with the last column only. LastKeyColIsOrdered
maybe?
pkg/sql/sem/idxtype/idxtype.go
line 67 at r1 (raw file):
// Panics if this is called for a forward index, as "forward" is not a term we // should be including in error messages, as most users will not understand it. func ErrorText(t T) string {
Let's make this return redact.SafeString
so that it doesn't get redacted.
a38f721
to
fcf2fbf
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status:
complete! 1 of 0 LGTMs obtained (waiting on @DrewKimball and @rafiss)
pkg/sql/catalog/catformat/index.go
line 263 at r1 (raw file):
Previously, DrewKimball (Drew Kimball) wrote…
Maybe this method should specify that it has to do with the last column only.
LastKeyColIsOrdered
maybe?
I changed it to be HasLinearOrdering
:
// HasLinearOrdering is true if this index type does not define a linear
// ordering on the last key column in the index. For example, a vector index
// groups nearby vectors, but does not define a linear ordering among them. As
// another example, an inverted index only defines a linear ordering for tokens,
// not for the original JSONB or ARRAY data type.
pkg/sql/sem/idxtype/idxtype.go
line 67 at r1 (raw file):
Previously, DrewKimball (Drew Kimball) wrote…
Let's make this return
redact.SafeString
so that it doesn't get redacted.
Done.
fcf2fbf
to
7eb3def
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm! my comments are pretty minor
Reviewable status:
complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @andy-kimball and @DrewKimball)
pkg/sql/sem/idxtype/idxtype.go
line 70 at r3 (raw file):
// // Panics if this is called for a forward index, as "forward" is not a term we // should be including in error messages, as most users will not understand it.
super nit: how did we determine this should be a function of the package, rather than a method with a T
receiver?
pkg/sql/sem/idxtype/idxtype.go
line 78 at r3 (raw file):
return "a vector index" default: panic(errors.AssertionFailedf("ErrorText is not defined for index type %v", t))
is a panic
that might crash the node the best we can do here? what if we had this return something like redact.Sprintf("an unknown index type (%v)", ...)
i noticed that some the other switch statements in this PR did not panic in the default case.
pkg/sql/catalog/tabledesc/structured.go
line 1098 at r3 (raw file):
// cannot be indexed. var typInfo string msg := "the following columns are not indexable due to their type: "
let's craft the message using redact.StringBuilder
, otherwise the entire message will get redacted.
pkg/sql/catalog/catformat/index.go
line 252 at r3 (raw file):
} // TODO(drewk): we might need to print something like "vector_l2_ops" for // vector indexes.
out of curiosity; how can we determine if we need this?
7eb3def
to
c718280
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status:
complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @DrewKimball and @rafiss)
pkg/sql/catalog/catformat/index.go
line 252 at r3 (raw file):
Previously, rafiss (Rafi Shamim) wrote…
out of curiosity; how can we determine if we need this?
I think we'll know more when we add support for other distance functions like cosine and inner product.
pkg/sql/catalog/tabledesc/structured.go
line 1098 at r3 (raw file):
Previously, rafiss (Rafi Shamim) wrote…
let's craft the message using
redact.StringBuilder
, otherwise the entire message will get redacted.
I just eliminated this code entirely. Yes, the error message is a bit better for this case, but it's not being done the same way in the schema changer code, so it was causing failures when I tried to unify the error checking in colinfo.ValidateColumnForIndex
.
pkg/sql/sem/idxtype/idxtype.go
line 70 at r3 (raw file):
Previously, rafiss (Rafi Shamim) wrote…
super nit: how did we determine this should be a function of the package, rather than a method with a
T
receiver?
I feel like this is a special-case utility function rather than a core part of what an index type is.
pkg/sql/sem/idxtype/idxtype.go
line 78 at r3 (raw file):
Previously, rafiss (Rafi Shamim) wrote…
is a
panic
that might crash the node the best we can do here? what if we had this return something likeredact.Sprintf("an unknown index type (%v)", ...)
i noticed that some the other switch statements in this PR did not panic in the default case.
I just replaced it with the string "an index" - this should never get called anyways. As an aside, what's the right way to get redact.SafeString
from redact.RedactableString
, or were you suggesting here that we return redact.RedactableString
?
d661c24
to
6c0a17d
Compare
Support usage of VECTOR INDEX in a CREATE TABLE statement, e.g.: CREATE TABLE simple ( a INT PRIMARY KEY, vec1 VECTOR(3), VECTOR INDEX (vec1) ) Create the corresponding table and index schema objects for the vector index. Check various error conditions, e.g. that only a column of type VECTOR can be the last column in the index. Add unit and logic tests. CREATE VECTOR INDEX support will come in a future PR. Epic: CRDB-42943 Release note: None Co-authored-by: Drew Kimball <[email protected]>
6c0a17d
to
d4c15ed
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status:
complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @andy-kimball and @DrewKimball)
pkg/sql/sem/idxtype/idxtype.go
line 78 at r3 (raw file):
Previously, andy-kimball (Andy Kimball) wrote…
I just replaced it with the string "an index" - this should never get called anyways. As an aside, what's the right way to get
redact.SafeString
fromredact.RedactableString
, or were you suggesting here that we returnredact.RedactableString
?
yeah, i was proposing the usage of RedactableString
. though i guess under my suggestion, the format arg is safe to, so would have been safe to cast the whole thing to SafeString
.
bors r+ |
TFTR, @rafiss, @DrewKimball |
Support usage of VECTOR INDEX in a CREATE TABLE statement, e.g.:
CREATE TABLE simple (
a INT PRIMARY KEY,
vec1 VECTOR(3),
VECTOR INDEX (vec1)
)
Create the corresponding table and index schema objects for the vector index. Check various error conditions, e.g. that only a column of type VECTOR can be the last column in the index. Add unit and logic tests. CREATE VECTOR INDEX support will come in a future PR.
Epic: CRDB-42943
Release note: None