-
Notifications
You must be signed in to change notification settings - Fork 5.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Semantic types 2 effective type #15022
Conversation
```clojure user> (def config (metabase.db.spec/h2 {:db (str "/Users/dan/projects/clojure/metabase/resources/sample-dataset.db" ";UNDO_LOG=0;CACHE_SIZE=131072;QUERY_CACHE_SIZE=128;COMPRESS=TRUE;" "MULTI_THREADED=TRUE;MVCC=TRUE;DEFRAG_ALWAYS=TRUE;MAX_COMPACT_TIME=5000;" "ANALYZE_AUTO=100")})) user> (jdbc/execute! config ["UPDATE _metabase_metadata SET keypath = 'PEOPLE.ZIP.semantic_type' WHERE keypath = 'PEOPLE.ZIP.special_type'" ]) [1] user> (jdbc/execute! config ["UPDATE _metabase_metadata SET keypath = 'REVIEWS.BODY.semantic_type' WHERE keypath = 'REVIEWS.BODY.special_type'" ]) [1] ```
locally i have some failing tests that are off by 1 errors: Fail in compile-time-interval-test �[36m:mongo�[0m Make sure time-intervals work the way they're supposed to. [:time-interval $date -4 :month] should give us something like Oct 01 2020 - Feb 01 2021 if today is Feb 17 2021 expected: [{$match {$and [{:$expr {$gte [$date {:$dateFromString {:dateString 2020-10-01T00:00Z}}]}} {:$expr {$lt [$date {:$dateFromString {:dateString 2021-02-01T00:00Z}}]}}]}} {$group {_id {date~~~day {:$let {:vars {:parts {:$dateToParts {:date $date}}}, :in {:$dateFromParts {:year $$parts.year, :month $$parts.month, :day $$parts.day}}}}}}} {$sort {_id 1}} {$project {_id false, date~~~day $_id.date~~~day}} {$sort {date~~~day 1}} {$limit 1048576}] actual: [{"$match" {"$and" [{:$expr {"$gte" ["$date" {:$dateFromString {:dateString "2020-11-01T00:00Z"}}]}} {:$expr {"$lt" ["$date" {:$dateFromString {:dateString "2021-03-01T00:00Z"}}]}}]}} {"$group" {"_id" {"date~~~day" {:$let {:vars {:parts {:$dateToParts {:date "$date"}}}, :in {:$dateFromParts {:year "$$parts.year", :month "$$parts.month", :day "$$parts.day"}}}}}}} {"$sort" {"_id" 1}} {"$project" {"_id" false, "date~~~day" "$_id.date~~~day"}} {"$sort" {"date~~~day" 1}} {"$limit" 1048576}] diff: - [{"$match" {"$and" [{:$expr {"$gte" [nil {:$dateFromString {:dateString "2020-10-01T00:00Z"}}]}} {:$expr {"$lt" [nil {:$dateFromString {:dateString "2021-02-01T00:00Z"}}]}}]}}] + [{"$match" {"$and" [{:$expr {"$gte" [nil {:$dateFromString {:dateString "2020-11-01T00:00Z"}}]}} {:$expr {"$lt" [nil {:$dateFromString {:dateString "2021-03-01T00:00Z"}}]}}]}}]
Codecov Report
@@ Coverage Diff @@
## master #15022 +/- ##
==========================================
- Coverage 84.30% 84.28% -0.02%
==========================================
Files 389 389
Lines 30660 30704 +44
Branches 2198 2202 +4
==========================================
+ Hits 25848 25880 +32
- Misses 2614 2622 +8
- Partials 2198 2202 +4
Continue to review full report at Codecov.
|
src/metabase/sync/interface.clj
Outdated
(s/optional-key :field-comment) (s/maybe su/NonBlankString) | ||
(s/optional-key :pk?) s/Bool | ||
(s/optional-key :nested-fields) #{(s/recursive #'TableMetadataField)} | ||
(s/optional-key :custom) {s/Any s/Any}}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a thought: we should probably make this schema open with s/Keyword s/Any
so new keys we add in the future don't break things. e.g. if a driver starts returning :effective-type
it wouldn't work on Metabase < 0.39.0. Not that changing the schema in 0.39.0 would make any difference, but in the future it would be easier for driver authors to take advantage of new features and still support older MB versions
(defn- to-field [{:keys [base_type] :as field}] | ||
(map->FieldInstance | ||
(merge {:coercion_strategy nil, :effective_type base_type, :semantic_type nil} | ||
field))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You could use mt/derecordize
instead to make FieldInstances regular maps for test purposes
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's also mt/object-defaults
that add default values for keys for an object, so we don't have to repeat things like this all over the code or go update a bunch of tests when we add new columns. I'd use that instead
(s/optional-key :coercion-strategy) (s/maybe su/CoercionStrategy) | ||
(s/optional-key :visibility-type) (s/maybe (apply s/enum field/visibility-types)) | ||
(s/optional-key :fk) (s/maybe su/KeywordOrString) | ||
(s/optional-key :field-comment) (s/maybe su/NonBlankString)}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Non really part of this PR, but it wouldn't be a bad idea to open this schema up either
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Left a few notes but overall LGTM
it actually has knock on effects: - does more work now as almost every field has an update to do in `add-extra-metadata` - we have databases that have state that we don't create. druid for example has stuff to mimic the dataset in tqpt/with-flattened-dbdef on checkins but we don't actually create this. And our dbdef has a field called "date" that is not present in the druid db, so if we attempt to add metadata it fails and kills the rest of the metadata that we add. - tests need this metadata to be present and the error causes field visibilities (for example) to not be set
Lol i think those tests failed because of the time change last night. |
object details didn't work out well here. they added way more stuff from the db than what is flowing through here. ```clojure actual: {:field {:name "DATE", :parent_id nil, :table_id 69, :base_type :type/Date, :effective_type :type/Date, :coercion_strategy nil, :semantic_type nil}, :value {:type :date/all-options, :value "past5days"}} diff: - {:field {:description nil, :database_type "VARCHAR", :fingerprint_version 0, :has_field_values nil, :settings nil, :caveats nil, :fk_target_field_id nil, :custom_position 0, :active true, :last_analyzed nil, :position 1, :visibility_type :normal, :preview_display true, :database_position 0, :fingerprint nil, :points_of_interest nil}} ``` Object defaults adds quite a bit of stuff such that we'd be dissoc'ing more than we are currently adding in
add reliance on effective_type in the query processor. Things get a bit dicey with annotate and inferring types. sometimes we have a proper field definition and can say what the effective type and conversion strategy is. othertimes we just guess so effective type doesn't really make sense?
haven't gone through to excise the use of base_type. In the future, we should have only a few places that actually care about base_type when emitting otherwise everything should care about the effective_type.
UI Changes
To set the coercion strategy its moved underneath the gear cog on the data model for fields:
