-
Notifications
You must be signed in to change notification settings - Fork 3.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
*: avoid sql -> c-dep dependency edges #30001
Comments
As discussed in cockroachdb#30001, many packages contain code that ideally is usable outside of the cockroach binary and build process. However, making that code buildable outside out usual build process requires ensuring that it does not depend on our make-built c-deps. This adds a (skipped) linter that inspects the transtive deps of some key packages (protos and sql so far). Release note: none.
30019: roachtest: for dryrun, skip initBinaries r=ridwanmsharif a=ridwanmsharif For a dryRun since nothing is actually being run, it is not necessary to have any of the binaries at all. I ran into this when just trying to list the roachtests but couldn't because of this issue. Can do so now, without any problems. cc @andreimatei for helping solve the issue. If this was a desired behaviour do let me know Release note (roachtest): changes behaviour of `roachtest run -n` to not check or initialise binaries as they are not required for a dry run 30029: *: linter to keep c-deps out of some packages r=dt a=dt As discussed in #30001, many packages contain code that ideally is usable outside of the cockroach binary and build process. However, making that code buildable outside out usual build process requires ensuring that it does not depend on our make-built c-deps. This adds a (skipped) linter that inspects the transtive deps of some key packages (protos and sql so far). Release note: none. 30030: build: revert to outputing the stress output to the build log r=andreimatei a=andreimatei My recent change to teamcity-stress.sh piped the stress output to test2json cause it sounded like a good idea, but I've then realized that in the process we've lost that output from the build log (relegating it only to artifacts/stress.log). I like the build log, so this patch puts the output back there. Release note: None 30032: distsql: reduce memory usage for subqueries exec r=jordanlewis a=arjunravinarayan Subqueries in distsql are executed, and their results stored in a RowContainer. The rows are then serialized into expressions and sent to the outer query execution. Previously, the expressions were constructed by scanning over the RowContainer, so twice as much memory was used. Now, we construct the expressions by popping rows one-by-one, so that we use less memory when there are many rows. Release note: None Co-authored-by: Ridwan Sharif <[email protected]> Co-authored-by: David Taylor <[email protected]> Co-authored-by: Andrei Matei <[email protected]> Co-authored-by: Arjun Narayan <[email protected]>
More work on cockroachdb#30001. This splits some pure-go MVCC encoding/decodeing logic out of stroage/engine into a new package storage/mvcc so it no longer shares a package any of the rocksdb interfacing code that lives in storage/engine. This is so that callers can import and use these functions without introducing our c-deps code to their dependency closure, which is an issue if callers might end up being used outside of the cockroach build environment or share packages with code that is. This is just a mechanical code move with a following rename to avoid repeating the package name in func/var prefixes, eg. MVCCKey -> mvcc.Key, and only includes MVCCKey as well as those functions that were in-use within sqlbase. Significant MVCC encoding/decoding code reamins that can be moved later as needed. Release note: none.
Working on cockroachdb#30001. This moves a couple key decoding helpers from engine, where they share a package with lots of c-deps, to enginepb. This leaves the old function signatures in place, but calling through to their moved implementations. This allows sqlbase to call these helpers without pulling in engine and its cdeps dependencies. Release note: none.
Working on cockroachdb#30001. This moves a couple key decoding helpers from engine, where they share a package with lots of c-deps, to enginepb. This leaves the old function signatures in place, but calling through to their moved implementations. This allows sqlbase to call these helpers without pulling in engine and its cdeps dependencies. Release note: none.
Working on cockroachdb#30001. This moves a couple key decoding helpers from engine, where they share a package with lots of c-deps, to enginepb. This leaves the old function signatures in place, but calling through to their moved implementations. This allows sqlbase to call these helpers without pulling in engine and its cdeps dependencies. Release note: none.
Working on cockroachdb#30001. This moves a couple key decoding helpers from engine, where they share a package with lots of c-deps, to enginepb. This leaves the old function signatures in place, but calling through to their moved implementations. This allows sqlbase to call these helpers without pulling in engine and its cdeps dependencies. Release note: none.
30082: engine: move a couple decoding helpers to enginepb r=nvanbenschoten a=dt Working on #30001. This moves a couple key decoding helpers from engine, where they share a package with lots of c-deps, to enginepb. This leaves the old function signatures in place, but calling through to their moved implementations. This allows sqlbase to call these helpers without pulling in engine and its cdeps dependencies. Release note: none. Smaller alternative to #30043. Co-authored-by: David Taylor <[email protected]>
I would also love to get the parser package to not depend on util/log or build. Just putting this on the wishlist. I may work on it at some point. This would make, for example, |
+1 I just made a project to be able to use https://github.com/lopezator/sqlfmt Works out of the box simply with a "go get" and has cross-platform precompiled binaries. |
31292: *: move RowFetcher and co to a new package, row r=jordanlewis a=jordanlewis This was purely mechanical renaming. The purpose is to lessen the width of the sqlbase package, which is imported in many places. The new package now depends on sqlbase, but far fewer packages consume it than sqlbase. Major changes: RowFetcher -> row.Fetcher RowUpdater -> row.Updater RowDeleter -> row.Deleter FK and cascade stuff -> row package The only unfortunate thing here is that the FK enums all begin with `row` now (e.g. `row.CheckDeletes`). It would be better if those got a different package name, like `fk`, but I'll leave that refactor for another time. Somewhat related to #30001. cc @dt, @benesch Release note: None Co-authored-by: Jordan Lewis <[email protected]>
I think this is the full list. Very manageable. |
fwiw, making |
Fair enough, but I feel like |
31317: storage: move TestingKnobs to storagebase r=jordanlewis a=jordanlewis ... to permit users of TestingKnobs not to have to import all of storage. Also, delete an unused TestingKnob that accepted Replica as a parameter. Updates #30001. Release note: None Co-authored-by: Jordan Lewis <[email protected]>
31322: storage: move protos to storagepb r=jordanlewis a=jordanlewis This is a mechanical change that moves the protobufs that used to live in storage to a new package, storagepb. The purpose of this change is to permit packages (such as sql) that need just protobuf definitions in storage to not have to depend on the whole storage package. First 2 commits are #31309. Updates #30001. Release note: None Co-authored-by: Jordan Lewis <[email protected]>
31312: storage: move MergeQueueEnabled to storagebase r=jordanlewis a=jordanlewis ... to avoid dependency edge from sql to storage. Updates #30001. Release note: None Co-authored-by: Jordan Lewis <[email protected]>
31325: sql: assert no dependencies on storage or CGo r=jordanlewis a=jordanlewis No longer necessary for sql to depend on either storage or CGo - let's keep it that way! All commits but last one from dependent PRs. Closes #30001. 31337: distsqlrun: fix tablereader microbenchmark r=jordanlewis a=jordanlewis It wasn't doing any work besides the initial fetch by accident. Oops! Numbers look much more realistic now. ``` name old time/op new time/op delta TableReader/rows=16-8 68.1µs ± 2% 77.7µs ± 1% +14.20% (p=0.000 n=10+10) TableReader/rows=256-8 139µs ± 3% 243µs ±11% +75.45% (p=0.000 n=10+10) TableReader/rows=4096-8 745µs ± 1% 2172µs ± 1% +191.62% (p=0.000 n=10+9) TableReader/rows=65536-8 9.94ms ± 2% 33.45ms ± 1% +236.60% (p=0.000 n=10+10) name old speed new speed delta TableReader/rows=16-8 3.76MB/s ± 2% 3.29MB/s ± 1% -12.44% (p=0.000 n=10+10) TableReader/rows=256-8 29.5MB/s ± 3% 16.9MB/s ±10% -42.89% (p=0.000 n=10+10) TableReader/rows=4096-8 88.0MB/s ± 1% 30.2MB/s ± 1% -65.71% (p=0.000 n=10+9) TableReader/rows=65536-8 106MB/s ± 2% 31MB/s ± 1% -70.29% (p=0.000 n=10+10) name old alloc/op new alloc/op delta TableReader/rows=16-8 9.60kB ± 2% 10.06kB ± 2% +4.78% (p=0.000 n=10+10) TableReader/rows=256-8 9.50kB ± 1% 17.93kB ± 2% +88.67% (p=0.000 n=9+9) TableReader/rows=4096-8 9.58kB ± 2% 142.82kB ± 3% +1390.41% (p=0.000 n=8+10) TableReader/rows=65536-8 20.8kB ±93% 2090.0kB ± 1% +9924.14% (p=0.000 n=10+9) name old allocs/op new allocs/op delta TableReader/rows=16-8 78.6 ± 1% 79.7 ± 1% +1.40% (p=0.001 n=10+10) TableReader/rows=256-8 79.0 ± 0% 80.0 ± 0% +1.27% (p=0.000 n=9+8) TableReader/rows=4096-8 79.0 ± 0% 83.7 ±12% +5.95% (p=0.000 n=8+10) TableReader/rows=65536-8 100 ±45% 453 ± 3% +354.81% (p=0.000 n=10+9) ``` Release note: None 31362: ui: Have the range debug page correctly handle missing lease times r=BramGruneir a=BramGruneir Before this change, the value was always assumed to be not-null and it was null would crash. It will now correctly handled the missing value and display a `no timestamp` warning. Fixes #31260. Release note (bug fix): The range debug page will now correctly handle cases in which there is no lease start or expiration time. data:image/s3,"s3://crabby-images/d7042/d70422b17245189279eb6cb9fb9285ffe1e5f5c6" alt="screen shot 2018-10-15 at 10 35 07" Co-authored-by: Jordan Lewis <[email protected]> Co-authored-by: Bram Gruneir <[email protected]>
Some of cockroach's code is likely useful outside of just the cockroach binary. Most obvious is some of our general purpose util packages, but additionally anyone building tools that interact with Cockroach or its data such as api responses, on-disk data, backups, etc may want to reuse our defined protobufs and structs as well as the helpers and other functions we've defined around them. Such external tools, both from first or third parties, are common in healthy, thriving OSS ecosystems, and we ideally should, where practical, make it as easy as possible to use our reusable helpers both in and outside the cockroach repo.
However a point of friction currently is that Cockroach uses
make
to build, including to run just-in-time code generation and to handle building c dependencies for linking by cgo. Non-make driven builds of the majority of the codebase simply do not work. Solving the generation problem has been previously addressed by the creation of thecockroach-gen
mirror of the repo, maintained with pre-generated code by CI.However cgo remains a problem, rendering any package with transitive dependencies on cgo unusable outside the cockroach binary. This is unlikely to change given the platform-specific nature of our c builds and how dependent they are on
make
. However we can minimize the number of our Go packages that become impossible to consume externally by avoiding depending, including transitively, on cgo.Current examples:
mvcc
encoding instorage/engine
is used in sql and elsewhere, but sincestorage/engine
also handles interfacing with rocksdb, pulling in pure-Go mvcc logic pulls in cgo.distsqlrun
uses interfaces that are defined inengine
and even creates stores that are backed by rocksdb, thus depending on in rocksdb directly.This will likely require care in splitting some of the low-level packages, defining interfaces and splitting cgo-backed implementations out into separate packages. In the of the temp-store example, it would also require some form of dependency injection as it is actually calling directly into a cgo-backed implementation.
Additionally all of this is bound to rot if not enforced by machines, as large dependency graphs can easily hide edges from human inspection, so we'll want to make a list of packages that a lint inspects, to ensure their transitive dependency closures remain free of cgo imports.
The text was updated successfully, but these errors were encountered: