Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Persist schema version at time of commit #1055

Merged
merged 9 commits into from
Jan 26, 2023

Conversation

AndrewSisley
Copy link
Contributor

Relevant issue(s)

Resolves #1023

Description

Persists schema version at time of commit (property SchemaVersionKey on CompositeDAGDelta) .

The driving change is within the commit Replace commit schemaId with versionKey. The commits prior to that are just preliminary cleanup. Please read the commit body of that commit as it will help explain why some other stuff had to change alongside it.

The important part of this PR is not integration testable at the moment (SchemaVersionKey on CompositeDAGDelta) as it is not read. This will change very soon as part of #1006 - I will work on that immediately after opening this PR, let me know if you prefer to review this as part of #1006 instead of in its current isolation.

@AndrewSisley AndrewSisley added feature New feature or request area/schema Related to the schema system area/db-system Related to the core system related components of the DB action/no-benchmark Skips the action that runs the benchmark. labels Jan 24, 2023
@AndrewSisley AndrewSisley added this to the DefraDB v0.5 milestone Jan 24, 2023
@AndrewSisley AndrewSisley requested a review from a team January 24, 2023 21:39
@AndrewSisley AndrewSisley self-assigned this Jan 24, 2023
@codecov
Copy link

codecov bot commented Jan 24, 2023

Codecov Report

Merging #1055 (3904bf3) into develop (93ea57a) will decrease coverage by 0.02%.
The diff coverage is 93.61%.

Impacted file tree graph

@@             Coverage Diff             @@
##           develop    #1055      +/-   ##
===========================================
- Coverage    68.10%   68.09%   -0.02%     
===========================================
  Files          172      172              
  Lines        16250    16246       -4     
===========================================
- Hits         11067    11062       -5     
- Misses        4251     4253       +2     
+ Partials       932      931       -1     
Impacted Files Coverage Δ
client/descriptions.go 84.61% <ø> (ø)
db/collection.go 66.05% <84.21%> (+0.18%) ⬆️
core/crdt/composite.go 75.34% <100.00%> (+2.00%) ⬆️
db/fetcher/versioned.go 57.52% <100.00%> (+1.41%) ⬆️
merkle/crdt/composite.go 80.00% <100.00%> (ø)
merkle/crdt/factory.go 85.71% <100.00%> (ø)
merkle/crdt/lwwreg.go 83.33% <100.00%> (ø)
net/process.go 78.57% <100.00%> (+0.95%) ⬆️
net/client.go 78.04% <0.00%> (-7.32%) ⬇️
net/peer.go 48.32% <0.00%> (-2.16%) ⬇️
... and 1 more

@AndrewSisley AndrewSisley force-pushed the sisley/feat/I1006-schema-version-commit-q branch 2 times, most recently from 90db176 to 4a96aaa Compare January 24, 2023 21:57
Name string
// SchemaId is the version agnostic identifier for this schema.
//
// It remains constant throught the lifetime of this schema.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

todo: typo throught

Copy link
Contributor Author

@AndrewSisley AndrewSisley Jan 26, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:) Cheers

  • typo

@@ -100,7 +100,17 @@ func (index IndexDescription) IDString() string {

// SchemaDescription describes a Schema and its associated metadata.
type SchemaDescription struct {
Name string
// SchemaId is the version agnostic identifier for this schema.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// SchemaId is the version agnostic identifier for this schema.
// SchemaID is the version agnostic identifier for this schema.

Copy link
Contributor Author

@AndrewSisley AndrewSisley Jan 26, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤮 Will change, I think I find remembering this Go preference even harder than using US/Canadian English lol

  • ID

Copy link
Contributor Author

@AndrewSisley AndrewSisley Jan 26, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lol... Of course this means updating all the test CIDs again...

  • Update test cids

// SchemaId is the version agnostic identifier for this schema.
//
// It remains constant throught the lifetime of this schema.
SchemaId string
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
SchemaId string
SchemaID string

Comment on lines 108 to 113
// VersionId is the version-specific identifier for this schema.
//
// It is generated on mutation of this schema and can be used to uniquely
// identify a schema at a specific version.
VersionId string
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// VersionId is the version-specific identifier for this schema.
//
// It is generated on mutation of this schema and can be used to uniquely
// identify a schema at a specific version.
VersionId string
// VersionID is the version-specific identifier for this schema.
//
// It is generated on mutation of this schema and can be used to uniquely
// identify a schema at a specific version.
VersionID string

Copy link
Collaborator

@fredcarle fredcarle left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall looks good to me. Just a couple questions and Shahzad's suggestions and then I can approve :)

core/crdt/composite.go Outdated Show resolved Hide resolved
Copy link
Member

@jsimnz jsimnz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggestions based on the inclusion of core.CollectionSchemaVersionKey as a type, as well as the use of the Key suffix on fields.

// SchemaVersionKey is the schema version datastore key at the time of commit.
//
// It can be used to identify the collection datastructure state at time of commit.
SchemaVersionKey string
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

question: Why use the CollectionSchemaVersionKey which when serialized into a string has the /collection/version prefix. That shouldn't be persisted into the DAG.

Should only be the ID (CID) part thats persisted.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion: Key isnt a needed suffix here on the field name. My understanding based on the discord conversation is that its used to refer to literal kvstore keys, but the CompositeDAGDelta shouldn't have kv information, as its all global data that is persisted into the DAG. Simply SchemaVersion or SchemaVersionID is sufficient.

Copy link
Contributor Author

@AndrewSisley AndrewSisley Jan 26, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:) Makes a lot of sense - will change so that just the cid is stored. I probably ToStringed it on autopilot, it protects against the addition of any new props, but is quite wasteful here.

  • only persist cid

db/collection.go Outdated
@@ -159,6 +154,16 @@ func (db *db) CreateCollection(

// For new schemas the initial version id will match the schema id
schemaVersionId := schemaId

col.desc.Schema.VersionId = schemaVersionId
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

question(out side of scope): Once migrations actually get added, is the schemaVersionID going to be the CID of the full schema after field changes, or is it going to be the CID of the patch difference?

Copy link
Contributor Author

@AndrewSisley AndrewSisley Jan 26, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Full schema, unless that changes for some unforeseen reason.

I dont think that changing should affect the usage here though, and public function that fetches a schema by version id should be pretty explicit as to whether it returns full or patch, and it should be capable of constructing the full from patch in the off chance that it would need to do so on the fly.

Will be added to shortly
Breaks some tests as although the property is always empty it is accounted for in CID generation.
Note: The empty unused prop was used to construct some cids, removing it thus changes the cids even though it was always empty.
The primary driving change in this commit is the addition of SchemaVersionKey to CompositeDAGDelta in core/crdt/composite.go, this persists the schema state alongside the document change.  All other changes in this commit are to facilitate this change.

This commit got a bit more involved than I intended as getCollectionByVersionId was regenerating the schema id instead of taking it from the object it already had - possibly because the property on the object that it already had never actualy set the property.  This got a bit awkward as the id lives on the object it is generated from - when the prop was populated on regeneration of itself it was taken into account as part of the regenerated key causing it to not match the originally (saved) key.

Cids in the tests have been corrected in this commit for the production code changes in this commit and the prior BREAKING commits (to save fixing them 3 times).
@AndrewSisley AndrewSisley force-pushed the sisley/feat/I1006-schema-version-commit-q branch from 4a96aaa to 5125e79 Compare January 26, 2023 18:43
@AndrewSisley AndrewSisley force-pushed the sisley/feat/I1006-schema-version-commit-q branch from 2e8e98b to ce15526 Compare January 26, 2023 18:53
@AndrewSisley AndrewSisley force-pushed the sisley/feat/I1006-schema-version-commit-q branch 4 times, most recently from b1fba09 to 6955ad2 Compare January 26, 2023 19:38
@jsimnz jsimnz self-requested a review January 26, 2023 19:40
@AndrewSisley AndrewSisley force-pushed the sisley/feat/I1006-schema-version-commit-q branch from 6955ad2 to 83ef435 Compare January 26, 2023 19:45
@AndrewSisley AndrewSisley force-pushed the sisley/feat/I1006-schema-version-commit-q branch 5 times, most recently from a1b6368 to 3904bf3 Compare January 26, 2023 19:55
@AndrewSisley AndrewSisley merged commit 825e749 into develop Jan 26, 2023
@AndrewSisley AndrewSisley deleted the sisley/feat/I1006-schema-version-commit-q branch January 26, 2023 20:39
shahzadlone pushed a commit that referenced this pull request Apr 13, 2023
* Remove unused dag.GetSchemaID func

* Break up long lines

Will be added to shortly

* BREAKING - Remove unused SchemaId prop

Breaks some tests as although the property is always empty it is accounted for in CID generation.

* BREAKING - Remove unused Schema.Key prop

Note: The empty unused prop was used to construct some cids, removing it thus changes the cids even though it was always empty.

* Replace commit schemaId with versionKey

The primary driving change in this commit is the addition of SchemaVersionKey to CompositeDAGDelta in core/crdt/composite.go, this persists the schema state alongside the document change.  All other changes in this commit are to facilitate this change.

This commit got a bit more involved than I intended as getCollectionByVersionId was regenerating the schema id instead of taking it from the object it already had - possibly because the property on the object that it already had never actualy set the property.  This got a bit awkward as the id lives on the object it is generated from - when the prop was populated on regeneration of itself it was taken into account as part of the regenerated key causing it to not match the originally (saved) key.

Cids in the tests have been corrected in this commit for the production code changes in this commit and the prior BREAKING commits (to save fixing them 3 times).
shahzadlone pushed a commit to shahzadlone/defradb that referenced this pull request Feb 23, 2024
* Remove unused dag.GetSchemaID func

* Break up long lines

Will be added to shortly

* BREAKING - Remove unused SchemaId prop

Breaks some tests as although the property is always empty it is accounted for in CID generation.

* BREAKING - Remove unused Schema.Key prop

Note: The empty unused prop was used to construct some cids, removing it thus changes the cids even though it was always empty.

* Replace commit schemaId with versionKey

The primary driving change in this commit is the addition of SchemaVersionKey to CompositeDAGDelta in core/crdt/composite.go, this persists the schema state alongside the document change.  All other changes in this commit are to facilitate this change.

This commit got a bit more involved than I intended as getCollectionByVersionId was regenerating the schema id instead of taking it from the object it already had - possibly because the property on the object that it already had never actualy set the property.  This got a bit awkward as the id lives on the object it is generated from - when the prop was populated on regeneration of itself it was taken into account as part of the regenerated key causing it to not match the originally (saved) key.

Cids in the tests have been corrected in this commit for the production code changes in this commit and the prior BREAKING commits (to save fixing them 3 times).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
action/no-benchmark Skips the action that runs the benchmark. area/db-system Related to the core system related components of the DB area/schema Related to the schema system feature New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Couple commit to schema version
4 participants