Use Marginalia to centralize documentation (metabase#34608)
tsmacdonald authored Oct 14, 2023
1 parent bd8fe4a commit 8f2a82e
Showing 7 changed files with 203 additions and 4 deletions.
2 changes: 1 addition & 1 deletion .gitignore
@@ -25,14 +25,14 @@
/.lein-repl-history
/.sqlite.db
/backend-checksums.txt
/backend-docs/uberdoc.html
/build.xml
/checkouts
/classes
/coverage
/crate-*
/cypress
/deploy/artifacts/*
/docs/uberdoc.html
/frontend/test/snapshots/*
/e2e/snapshots/*
/lein-plugins/*/target
13 changes: 11 additions & 2 deletions deps.edn
@@ -283,8 +283,8 @@
{:extra-paths ["enterprise/backend/src"]}

;; Include EE tests.
;; for ee dev: :dev:ee:ee-dev
;; for ee tests: clojure -X:dev:ee:ee-dev:test
;; for EE dev: `clojure -X:dev:ee:ee-dev`
;; for EE tests: `clojure -X:dev:ee:ee-dev:test`
:ee-dev
{:extra-paths ["enterprise/backend/test"]}

@@ -295,6 +295,15 @@
:oss-dev
{}

;; Generate BE documentation with
;; clojure -M:marginalia
:marginalia
{:extra-deps
{com.github.tsmacdonald/marginalia {:mvn/version "0.9.2"}}
:main-opts ["-m" "marginalia.main" "-n" "Metabase" "-d" "backend-docs" "-D"
"The simplest, fastest way to get business intelligence and analytics to everyone in your company 😋"
"dev" "src" "shared/src" "enterprise/backend/src"]}

;; Find outdated versions of dependencies. Run with `clojure -M:outdated`
:outdated {;; Note that it is `:deps`, not `:extra-deps`
:deps {com.github.liquidz/antq {:mvn/version "RELEASE"}}
34 changes: 34 additions & 0 deletions dev/src/dev.clj
@@ -1,3 +1,37 @@
;; # Metabase Backend Developer Documentation
;;
;; Welcome to Metabase! Here are links to useful resources.
;;
;; ## Project Management
;;
;; - [Engineering and Product Playbook](https://www.notion.so/metabase/Engineering-and-Product-Playbook-cd4bc1c0b8744470bebc0b979f8f5268)
;; - [Weekly Tactical Board: how to](https://www.notion.so/metabase/Weekly-Tactical-Board-how-to-6e81f994a792493ba7ae430f2afa1673)
;; - [The Escalations Process](https://www.notion.so/Escalating-a-bug-b876f78c801345f3bda8504d4a63ba80)
;;
;; ## Dev Environment
;;
;; - [Getting started with backend development](https://github.com/metabase/metabase/blob/master/docs/developers-guide/devenv.md#backend-development)
;; - [Additional notes on using tools.deps](https://github.com/metabase/metabase/wiki/Migrating-from-Leiningen-to-tools.deps)
;; - [Other tips](https://github.com/metabase/metabase/wiki/Metabase-Backend-Dev-Secrets)
;;
;; ## Important Parts of the Codebase
;;
;; - [API Endpoints](#metabase.api.common)
;; - [Drivers](#metabase.driver)
;; - [Permissions](#metabase.models.permissions)
;; - [The Query Processor](#metabase.query-processor)
;; - [Application Settings](#metabase.models.setting)
;;
;; ## Important Libraries
;;
;; - [Toucan 2](https://github.com/camsaul/toucan2/) to work with models
;; - [Honey SQL](https://github.com/seancorfield/honeysql) (version 2) for SQL queries
;; - [Liquibase](https://docs.liquibase.com/concepts/changelogs/changeset.html) for database migrations
;; - [Compojure](https://github.com/weavejester/compojure) on top of [Ring](https://github.com/ring-clojure/ring) for our API
;;
;; <hr />


(ns dev
"Put everything needed for REPL development within easy reach"
(:require
7 changes: 7 additions & 0 deletions enterprise/backend/src/metabase_enterprise/core.clj
Original file line number Diff line number Diff line change
@@ -1,2 +1,9 @@
;; Unless otherwise noted, all files © 2023 Metabase, Inc.
;;
;; Source code in this repository is variously licensed under the GNU Affero General Public License (AGPL), or the
;; [Metabase Commercial License](https://www.metabase.com/license/commercial).
;;
;; <hr />

(ns metabase-enterprise.core
"Empty namespace. This is here solely so we can try to require it and see whether or not EE code is on the classpath.")
38 changes: 38 additions & 0 deletions src/metabase/api/common.clj
@@ -1,3 +1,41 @@
;; # API Endpoints at Metabase
;;
;; We use a custom macro called `defendpoint` for defining all endpoints. It's best illustrated with an example:
;;
;; <pre><code>
;; (ns metabase.api.dashboard ...)
;;
;; (api/defendpoint GET "/"
;;   "Get `Dashboards`. With filter option `f`..."
;;   [f]
;;   {f [:maybe [:enum "all" "mine" "archived"]]}
;;   (let ...))
;;
;; ; ...
;;
;; (api/define-routes)
;; </code></pre>
;;
;; As you can see, the arguments are, in order:
;;
;; * **The HTTP verb** (`GET`, `PUT`, `POST`, etc.).
;; * **The route.** This will automatically have `api` and the namespace prefixed to it, so in this case `"/"` is defining
;; the route for `/api/dashboard/`.
;; * **A docstring.** Apart from being helpful to us, this is used for API documentation for third-party devs, so please
;; be thorough!
;; * **The parameters.** This uses Compojure's
;; [destructuring syntax](https://github.com/weavejester/compojure/wiki/Destructuring-Syntax) (a superset of Clojure's
;; normal destructuring syntax).
;; * **A schema.** This is a map from each parameter to its schema, written in
;; [Malli's vector syntax](https://github.com/metosin/malli#vector-syntax). The syntax is documented on Malli's page, of
;; course, but we also have some of our own schemas defined. Start by looking in
;; [`metabase.util.malli.schema`](#metabase.util.malli.schema).
;; * **The actual code for the endpoint.** The returned value could be one of several types. The Right Thing (such as
;; converting to JSON or setting an appropriate status code) usually happens by default. Consult
;; [Compojure's documentation](https://github.com/weavejester/compojure/blob/master/src/compojure/response.clj),
;; but it may be more instructive to look at examples in our codebase.
;;
;; <hr />

(ns metabase.api.common
"Dynamic variables and utility functions/macros for writing API functions."
(:require
2 changes: 1 addition & 1 deletion src/metabase/driver/common/parameters/parse.clj
@@ -156,7 +156,7 @@
"Attempts to parse parameters in string `s`. Parses any optional clauses or parameters found, and returns a sequence
of non-parameter string fragments (possibly) interposed with `Param` or `Optional` instances.
If handle-sql-comments is true (default) then we make a best effort to ignore params in SQL comments."
If `handle-sql-comments` is true (default) then we make a best effort to ignore params in SQL comments."
([s :- s/Str]
(parse s true))
([s :- s/Str, handle-sql-comments :- s/Bool]
111 changes: 111 additions & 0 deletions src/metabase/search/scoring.clj
@@ -1,4 +1,115 @@
;; # How does search scoring work?
;;
;; _This was written for a success engineer, but may be helpful here, too._
;;
;; Most of what you care about happens in the `scoring.clj` file [here](https://github.com/metabase/metabase/blob/master/src/metabase/search/scoring.clj).
;;
;; We have two sets of scorers. The first is based on the literal text matches and defined [here](https://github.com/metabase/metabase/blob/8d5f5db02c84899a053e20468986050b2034a9a4/src/metabase/search/scoring.clj#L132C1-L137):
;;
;; <pre><code>
;; (def ^:private match-based-scorers
;;   [{:scorer exact-match-scorer :name "exact-match" :weight 4}
;;    {:scorer consecutivity-scorer :name "consecutivity" :weight 2}
;;    {:scorer total-occurrences-scorer :name "total-occurrences" :weight 2}
;;    {:scorer fullness-scorer :name "fullness" :weight 1}
;;    {:scorer prefix-scorer :name "prefix" :weight 1}])
;; </code></pre>
;;
;; * `exact-match-scorer` gives points for exact matches. So if you search `foo` it'll score well for `foo
;; collection` but not `my favorite foods`. Everything else counts partial matches.
;;
;; * `consecutivity-scorer` gives points for a sequence of matching words. So if you search `four five six seven`
;; it'll score well for `one two three four five six seven eight` and 0 for `eight seven six five four three two
;; one`.
;;
;; * `total-occurrences-scorer` gives points for the number of tokens that show up in the search result. So if you
;; search for `foo bar` it'll score better for `Admiral Akbar's Food Truck` (2; note that `akbar` and `food` count
;; as matches even though they're not exact) than for `foo collection` (1; being an exact match doesn't matter. That's
;; why we have the `exact-match-scorer`).
;;
;; * `fullness-scorer` is sort of the opposite of that: it gives points for how much of the result is "covered" by the
;; search query. So if you search `foo bar` then `Barry's Food` will have a perfect fullness score and `Barry's
;; Dashboard Of Favorite Bars, Restaurants, and Food Trucks` will score poorly since only 3/9 of the dashboard's
;; title is covered by the search query. Why 3? `bar` matches both `Barry's` and `Bars`.
;;
;; * `prefix-scorer` gives points for an exact prefix match. So if you search for `foo bar` then `foo collection` will
;; have a good prefix score (4/14: `foo ` matches), `Food trucks I love` will have a worse one (3/18), and
;; `top 10 foo bars` will be zero.
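;;
;; To make the arithmetic concrete, here is a minimal character-level sketch that works through the three examples
;; above. It is an illustration only: `prefix-score` is a hypothetical helper, not the actual `prefix-scorer`, which
;; may tokenize and normalize its input differently.
;;
;; <pre><code>
;; (require '[clojure.string :as str])
;;
;; (defn prefix-score
;;   "Hypothetical: the fraction of `result` covered by the longest prefix it shares with `query`."
;;   [query result]
;;   (let [q (str/lower-case query)
;;         r (str/lower-case result)
;;         ;; count the leading characters that agree, stopping at the first mismatch
;;         n (count (take-while true? (map = q r)))]
;;     (if (empty? r) 0 (/ n (count r)))))
;;
;; (prefix-score "foo bar" "foo collection")     ; 2/7: `foo ` covers 4 of 14 characters
;; (prefix-score "foo bar" "Food trucks I love") ; 1/6: `foo` covers 3 of 18 characters
;; (prefix-score "foo bar" "top 10 foo bars")    ; 0: the first characters already differ
;; </code></pre>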
;;
;;
;; These are all weighted: you can see that the exact-match scorer is responsible for 4/10 of the score, the consecutivity one is 2/10, etc.
;;
;; The second set of scorers is defined lower down,
;; [here](https://github.com/metabase/metabase/blob/8d5f5db02c84899a053e20468986050b2034a9a4/src/metabase/search/scoring.clj#L215-L222):
;;
;; <pre><code>
;; (defn weights-and-scores
;;   "Default weights and scores for a given result."
;;   [result]
;;   [{:weight 2 :score (pinned-score result) :name "pinned"}
;;    {:weight 2 :score (bookmarked-score result) :name "bookmarked"}
;;    {:weight 3/2 :score (recency-score result) :name "recency"}
;;    {:weight 1 :score (dashboard-count-score result) :name "dashboard"}
;;    {:weight 1/2 :score (model-score result) :name "model"}])
;; </code></pre>
;;
;; And there are two more for Enterprise
;; [here](https://github.com/metabase/metabase/blob/8d5f5db02c84899a053e20468986050b2034a9a4/enterprise/backend/src/metabase_enterprise/search/scoring.clj#L27-L33):
;;
;; <pre><code>
;; (premium-features/has-feature? :official-collections)
;; (conj {:weight 2
;; :score (official-collection-score result)
;; :name "official collection score"})
;; (premium-features/has-feature? :content-verification)
;; (conj {:weight 2
;; :score (verified-score result)
;; :name "verified"})))
;; </code></pre>
;;
;; These are easier to explain: you get points if the search result is pinned (yes or no) or bookmarked (yes or no),
;; for how recently it was updated (a sliding value between 1 (edited just now) and 0 (edited [180+
;; days](https://github.com/metabase/metabase/blob/8d5f5db02c84899a053e20468986050b2034a9a4/src/metabase/search/config.clj#L29-L32)
;; ago)), for how many dashboards it appears in (a sliding value between 0 (zero dashboards) and 1 ([50+
;; dashboards](https://github.com/metabase/metabase/blob/8d5f5db02c84899a053e20468986050b2034a9a4/src/metabase/search/config.clj#L34-L36))),
;; and for its type (`model-score`): the earlier a type appears in [this
;; list](https://github.com/metabase/metabase/blob/8d5f5db02c84899a053e20468986050b2034a9a4/src/metabase/search/config.clj#L55-L58)
;; the higher the score it gets:
;;
;; <code> ["dashboard" "metric" "segment" "indexed-entity" "card" "dataset" "collection" "table" "action" "database"]</code>
;;
;; On the EE side, we also give points if something's an official collection and if it's verified.
;;
;; Finally, what we actually search is defined in the search
;; config [here](https://github.com/metabase/metabase/blob/8d5f5db02c84899a053e20468986050b2034a9a4/src/metabase/search/config.clj#L73-L109),
;; but the short answer is "the name and, if there is one, the description". We used to search raw SQL queries for
;; cards, but that got turned off recently (but I've seen chat about turning it back on).
;;
;; ❦
;;
;; So, these 12 scorers are weighted and combined together, and the grand total affects search order. If this sounds a
;; little complicated…it is! It also means that it can be tricky to give a proper answer about why the search ranking
;; is "wrong". Maybe you search for `monthly revenue` and are looking for a card called `monthly revenue` and are mad
;; that a dashboard called `company stats` shows up first…but then it turns out that the dashboard's description is
;; `Stats that everyone should be aware of, such as our order count and monthly revenue.` and the dashboard happens to
;; be pinned, bookmarked, part of an official collection, verified, and edited a couple hours ago…whereas the card is
;; none of those things.
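;;
;; As a rough sketch of how weighted scores combine (the namespace docstring describes the result as a weighted
;; average), consider the following. `weighted-average` and the sample scores are hypothetical and only meant to show
;; the arithmetic; the real entry point is `score-and-result`.
;;
;; <pre><code>
;; (defn weighted-average
;;   "Hypothetical: sum of weight * score over all scorers, divided by the total weight."
;;   [scorers]
;;   (let [total-weight (reduce + (map :weight scorers))]
;;     (if (zero? total-weight)
;;       0
;;       (/ (reduce + (map (fn [{:keys [weight score]}] (* weight score)) scorers))
;;          total-weight))))
;;
;; ;; Made-up scores: pinned, edited about 90 days ago, and of the lowest-ranked type.
;; (weighted-average [{:name "pinned" :weight 2 :score 1}
;;                    {:name "recency" :weight 3/2 :score 1/2}
;;                    {:name "model" :weight 1/2 :score 0}])
;; ; (2 + 3/4 + 0) / 4, which is 11/16
;; </code></pre>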
;;
;; Also, be aware that as of October 2023 there's [a big epic under
;; way](https://github.com/metabase/metabase/issues/27982) to add filtering to search results, which should help
;; people find what they're looking for (and spare us from having to make the above algorithm better).
;;
;; <hr />

(ns metabase.search.scoring
"Computes a relevancy score for search results using the weighted average of various scorers. Scores are determined by
various ways of comparing the text of the search string and the item's title or description, as well as by
Metabase-specific features such as how many dashboards a card appears in or whether an item is pinned.
Get the score for a result with `score-and-result`, and efficiently get the most relevant results with
`top-results`.
Some of the scorers can be tweaked with configuration in [[metabase.search.config]]."
(:require
[clojure.string :as str]
[java-time.api :as t]