Performance tuning tutorial #3378

Merged
merged 1 commit into from
Aug 13, 2018

Conversation

Contributor

@jseldess jseldess commented Jul 13, 2018

This PR adds a 2.0 performance tuning tutorial. It starts with a single-region deployment, focusing on common SQL techniques for getting faster reads and writes. It then expands into a multi-region deployment, focusing on table partitioning.

The PR also includes a simple Python client for read/write testing. It hopefully makes it easier to run a given statement multiple times and look at the average and/or cumulative latency.

For internal testers, I've added the corresponding roachprod commands as comments in the performance-tuning.md file and in a separate comment below.

Fixes #3160.

@cockroach-teamcity
Member

This change is Reviewable

@jseldess jseldess force-pushed the perf-tuning-single-dc branch 6 times, most recently from 8f07c9c to 9ae41bf Compare July 28, 2018 04:54
@jseldess jseldess force-pushed the perf-tuning-single-dc branch from 9ae41bf to 48ae323 Compare July 28, 2018 04:57
@jseldess jseldess force-pushed the perf-tuning-single-dc branch 3 times, most recently from 83f7d50 to 10d9ee3 Compare July 31, 2018 03:55
Contributor

@knz knz left a comment

I like this, it is very readable.

CockroachDB requires TCP communication on two ports:

- **26257** (`tcp:26257`) for inter-node communication (i.e., working as a cluster)
- **8080** (`tcp:8080`) for accessing the Admin UI
Contributor

maybe this needs to become 'web ui', here and below

Contributor Author

Done.


2. Note the internal IP address of each `n1-standard-4` instance. You'll need these addresses when starting the CockroachDB nodes.

3. SSH to each instance and [optimize the local SSD for write performance](https://cloud.google.com/compute/docs/disks/performance#optimize_local_ssd) (see the **Disable write cache flushing** section).
Contributor

the reader will see this and pause and think "wait, what?" (like I just did)

disabling write cache flushing means the data becomes less "safe" -- it disables the guarantee that cockroachdb expects from the underlying drive that flushed data is guaranteed to be written to persistent storage.

If you're 100% sure about this, you still need a callout underneath to reassure the reader that it is OK (and to explain why).

Contributor Author

Hmm, I assumed this was safe since it's a technique that we're featuring in our tpc-c benchmarking. @nvanbenschoten, @bdarnell, can either of you confirm that this is in fact safe and, if so, help me come up with language that clarifies that fact?

Contributor Author

Also, just to be clear, @knz, the reason I'm featuring this is, based on my testing, it significantly improves performance. @robert-s-lee is also impressed with the speedup it gives you.

Member

cc. @petermattis and @arjunravinarayan. I think they'll be able to give the most accurate answers to the following questions: "is it safe?" and "should we recommend it?".

Contributor

I believe it is not safe to do on cloud instances. It is what one would do to replicate the performance characteristics that on-prem capacitor-backed SSDs would give you, and this is why we have these instructions in our performance whitepaper. But I do not believe it is suitable guidance for a performance tuning guide in general.

Contributor Author

OK, thanks, @arjunravinarayan. If @petermattis agrees that this is not safe to do on cloud instances, I'll remove this and rerun all the numbers.

REFERENCES vehicles (city, id);
~~~

**Add a note about why we need vehicle_city and link to ticket. Basically, we can't have 2 foreign key constraints on the same column (city). So we duplicate city as vehicle_city and add a check constraint to ensure that they are identical. https://github.com/cockroachdb/cockroach/issues/23580**
Contributor

make the issue URL a link

Contributor Author

I added a note in the ### Schema section. PTAL.


With this information, we can visualize what's happening, assuming the request is sent to node 1 and ignoring non-involved ranges:

<img src="{{ 'images/v2.0/perf_tuning_join1.png' | relative_url }}" alt="Perf tuning concepts" style="max-width:100%" />
Contributor

The previous diagrams were OK to follow along; this one in contrast has me thoroughly confused: I do not understand the meaning of the arrows that jump from one "lease" box to another on node 2.

If you mean to explain that node 1 is forwarding the full table scan to every range in the "rides" table, that must be done by using:

  • 6 arrows from node 1 to node 2 (you may group them together because they actually flow along a single network link)
  • 1 arrow from node 1 to node 3
  • 6 arrows back from node 2 to the outside of node 1 (for the results)
  • 1 arrow back from node 3 to the outside of node 1

Then for the join there is no extra arrow needed because node 1 is already leaseholder.

There should not be any arrow from node 2 to itself, and neither from node 2 to the "lease" box of node 1.

Contributor Author

Thanks, @knz. Do you think this section would be sufficient without the diagram? I think they were important in the intro concepts, but I'm unsure whether they're as useful embedded in the tutorial.

Contributor

I think the diagram is not necessary here indeed.

Contributor Author

Removed it.

(21 rows)
~~~

This is a complex query plan, but the important thing to note is the full table scan of `rides@primary` above the `subquery`. This shows you that, after the subquery returns the IDs of the top 5 vehicles, CockroachDB scans the entire primary index to find the rows with `max(end_time)` for each `vehicle_id`, although you might expect CockroachDB to more efficiently use the secondary index on `vehicle_id`.
Contributor

Mention that we're working to lift this limitation in a future version.

Contributor Author

Done.

Contributor

knz commented Jul 31, 2018

Discussed offline:

  • because this guide uses a fixed schema there is no discussion of how schema design impacts performance. I do not know exactly how much ground we should cover here, but at least we must mention that new apps should be careful about which index keys they define for tables (either for primary keys or secondary indexes) because of the impact on contention. Perhaps link to the FAQ entries on this topic.

  • there is a section about "dropping unused secondary indexes" which is already doing a good job. Two useful additions perhaps:

    • add a quick note there that adding a foreign key constraint also adds secondary indexes implicitly, but these are not automatically removed when the FK relationship is dropped. So a user should be careful to remove both if they find that they no longer need an FK constraint.
    • the performance relationship between the number of indexes and the latency of mutations (insert/update/etc) should also be captured in a concise entry of the FAQ page, separately from the performance tuning guide. I think this is important because it comes up often during discussions on gitter/forum etc and the FAQ entry would be more searchable/discoverable than a section in the middle of the perf tuning guide.


With this information, we can visualize what's happening now, still assuming the request is sent to node 1 and ignoring non-involved ranges:

<img src="{{ 'images/v2.0/perf_tuning_join2.png' | relative_url }}" alt="Perf tuning concepts" style="max-width:100%" />
Contributor Author

Also removed this image.

Contributor

@petermattis petermattis left a comment

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained


v2.0/performance-tuning.md, line 141 at r1 (raw file):

Previously, jseldess (Jesse Seldess) wrote…

OK, thanks, @arjunravinarayan. If @petermattis agrees that this is not safe to do on cloud instances, I'll remove this and rerun all the numbers.

Ditto what @arjunravinarayan said. Note that Google Persistent SSD does not flush to the storage media (effectively acting like nobarrier is set), while Local SSD does. If running a high performance database on-prem you'd get hardware that had capacitor-backed or battery-backed SSDs. Future versions of cloud datacenters may have reliable enough batteries to allow recommending this as the default (and I've heard murmurings that some current Google datacenters have reliable batteries), but right now the recommendation should be that you have to know your hardware in order to specify nobarrier and the cloud providers do not provide enough transparency to do this safely.

Contributor Author

jseldess commented Aug 1, 2018


v2.0/performance-tuning.md, line 141 at r1 (raw file):

Previously, petermattis (Peter Mattis) wrote…

Ditto what @arjunravinarayan said. Note that Google Persistent SSD does not flush to the storage media (effectively acting like nobarrier is set), while Local SSD does. If running a high performance database on-prem you'd get hardware that had capacitor-backed or battery-backed SSDs. Future versions of cloud datacenters may have reliable enough batteries to allow recommending this as the default (and I've heard murmurings that some current Google datacenters have reliable batteries), but right now the recommendation should be that you have to know your hardware in order to specify nobarrier and the cloud providers do not provide enough transparency to do this safely.

OK. Thanks, all. I'll remove this.

@jseldess jseldess force-pushed the perf-tuning-single-dc branch 3 times, most recently from 2cfba8d to 9356ba4 Compare August 5, 2018 12:12
@jseldess jseldess changed the title [WIP] Performance tuning tutorial Performance tuning tutorial Aug 5, 2018
Contributor Author

jseldess commented Aug 5, 2018

Roachprod commands for single-region portion:

<!-- roachprod instructions for single-region deployment
1. Reserve 12 instances across 3 GCE zones: roachprod create <yourname>-tuning --geo --gce-zones us-east1-b,us-west1-a,us-west2-a --local-ssd -n 12
2. Put `cockroach` on all instances: roachprod run <yourname>-tuning "curl https://binaries.cockroachdb.com/cockroach-v2.0.4.linux-amd64.tgz | tar -xvz; mv cockroach-v2.0.4.linux-amd64/cockroach cockroach"
3. Start the cluster in us-east1-b: roachprod start <yourname>-tuning:1-3
4. You'll need the addresses of all instances later, so list and record them somewhere: roachprod list -d <yourname>-tuning
5. Import the Movr dataset:
   - SSH onto instance 4: roachprod run <yourname>-tuning:4
   - Run the SQL commands in Step 4 below.
6. Install the Python client:
   - Still on instance 4, run commands in Step 5 below.
7. Test/tune read performance:
   - Still on instance 4, run commands in Step 6.
8. Test/tune write performance:
   - Still on instance 4, run commands in Step 7.
-->

Roachprod commands for multi-region portion:

<!-- roachprod instructions for multi-region deployment
You created all instances up front, so no need to add more now.
1. Install the Python client on instance 8:
   - SSH to instance 8: roachprod run <yourname>-tuning:8
   - Run commands in Step 5 above.
2. Install the Python client on instance 12:
   - SSH onto instance 12: roachprod run <yourname>-tuning:12
   - Run commands in Step 5 above.
3. Check rebalancing:
   - SSH to instance 4, 8, or 12.
   - Run `SHOW EXPERIMENTAL_RANGES` from Step 11 below.
4. Test performance:
   - Run the SQL commands in Step 12 below. You'll need to SSH to instance 8 or 12 as suggested.
5. Partition the data:
   - SSH to any node and run the SQL in Step 13 below.
6. Check rebalancing after partitioning:
   - SSH to instance 4, 8, or 12.
   - Run `SHOW EXPERIMENTAL_RANGES` from Step 14 below.
7. Test performance after partitioning:
   - Run the SQL commands in Step 15 below. You'll need to SSH to instance 8 or 12 as suggested.
-->

@cockroachdb cockroachdb deleted a comment from cockroach-teamcity Aug 5, 2018
Contributor Author

@jseldess jseldess left a comment


_includes/v2.0/performance/tuning.py, line 11 at r2 (raw file):

Previously, bdarnell (Ben Darnell) wrote…

Add type=int here, and then you can remove the int() around args.repeat below.

Done.
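
For readers following along, here is a minimal sketch of what the `type=int` suggestion looks like with `argparse`. The flag names come from the `tuning.py` invocation quoted later in this thread; the defaults, help strings, and the `build_parser` helper are assumptions for illustration, not the script's actual code.

```python
import argparse


def build_parser():
    """Build a parser for the flags seen in this thread's tuning.py invocations."""
    parser = argparse.ArgumentParser(description="Run a SQL statement repeatedly")
    parser.add_argument("--host", required=True, help="address of a CockroachDB node")
    parser.add_argument("--statement", required=True, help="SQL statement to run")
    # type=int lets argparse do the conversion, so no int() wrapper is needed later.
    parser.add_argument("--repeat", type=int, default=1, help="number of executions")
    parser.add_argument("--times", action="store_true", help="print each latency")
    parser.add_argument("--cumulative", action="store_true", help="print total latency")
    return parser


# Parse an explicit argument list so the sketch runs without a real command line.
args = build_parser().parse_args(
    ["--host=10.142.0.42", "--statement=SELECT 1", "--repeat=100", "--times"]
)
for n in range(args.repeat):  # args.repeat is already an int
    pass
```

With `type=int`, a non-numeric `--repeat` value is rejected by `argparse` at parse time rather than failing later inside the loop.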


_includes/v2.0/performance/tuning.py, line 29 at r2 (raw file):

Previously, bdarnell (Ben Darnell) wrote…

Just use n here; you don't need a separate count variable.

Done.


_includes/v2.0/performance/tuning.py, line 39 at r2 (raw file):

Previously, bdarnell (Ben Darnell) wrote…

Do you really need to catch all exceptions here? I would just remove the try/except block, or if you really want to catch and ignore some exceptions, use a narrower except clause.

This is just in place to prevent exceptions for writes, where there is no colnames or rows to print, e.g.:

python tuning.py --host=10.142.0.42 --statement="INSERT INTO users VALUES (gen_random_uuid(), 'new york', 'Max Roach', '411 Drum Street', '173635282937347')" --repeat=100 --times --cumulative
Traceback (most recent call last):
  File "tuning.py", line 29, in <module>
    colnames = [desc[0] for desc in cur.description]
TypeError: 'NoneType' object is not iterable

Is there a better way?


_includes/v2.0/performance/tuning.py, line 41 at r2 (raw file):

Previously, bdarnell (Ben Darnell) wrote…

Since we're mostly dealing with very small values, the output would be more reasonable if this were times.append((end - start) * 1000) and change "seconds" to "milliseconds" everywhere.

Done.
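
A minimal sketch of the milliseconds change, assuming the script times each execution with `time.time()`. The `time_statement` helper and the stand-in workload are hypothetical; the real script would execute the SQL statement inside the loop.

```python
import time


def time_statement(run_once, repeat):
    """Call run_once() `repeat` times and return per-call latencies in milliseconds."""
    times = []
    for n in range(repeat):
        start = time.time()
        run_once()
        end = time.time()
        times.append((end - start) * 1000)  # seconds -> milliseconds
    return times


# Stand-in workload for illustration; not a database call.
latencies = time_statement(lambda: sum(range(1000)), repeat=5)
print("Average time (milliseconds): %.3f" % (sum(latencies) / len(latencies)))
print("Cumulative time (milliseconds): %.3f" % sum(latencies))
```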

Contributor

@bdarnell bdarnell left a comment


_includes/v2.0/performance/tuning.py, line 39 at r2 (raw file):

Previously, jseldess (Jesse Seldess) wrote…

This is just in place to prevent exceptions for writes, where there is no colnames or rows to print, e.g.:

python tuning.py --host=10.142.0.42 --statement="INSERT INTO users VALUES (gen_random_uuid(), 'new york', 'Max Roach', '411 Drum Street', '173635282937347')" --repeat=100 --times --cumulative
Traceback (most recent call last):
  File "tuning.py", line 29, in <module>
    colnames = [desc[0] for desc in cur.description]
TypeError: 'NoneType' object is not iterable

Is there a better way?

In that case you can guard it all like this:

if cur.description is not None:
    colnames = [desc[0] for desc in cur.description]
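
To make the guard concrete, here is a small self-contained sketch. It relies on the DB-API behavior that `cursor.description` is `None` for statements that return no rows, which is what the traceback above shows. `FakeCursor` and `format_result` are hypothetical names used only so the example runs without a database.

```python
def format_result(cur):
    """Return printable lines for a DB-API cursor, or [] when there is no result set."""
    lines = []
    # cur.description is None for statements that return no rows (e.g. INSERT).
    if cur.description is not None:
        colnames = [desc[0] for desc in cur.description]
        lines.append(colnames)
        lines.extend(list(row) for row in cur.fetchall())
    return lines


class FakeCursor(object):
    """Hypothetical stand-in for a database cursor, for illustration only."""

    def __init__(self, description, rows):
        self.description = description
        self._rows = rows

    def fetchall(self):
        return self._rows


write_cur = FakeCursor(None, [])  # mimics an INSERT
read_cur = FakeCursor([("id",), ("city",)], [(1, "new york")])  # mimics a SELECT
print(format_result(write_cur))
print(format_result(read_cur))
```

Unlike a bare `try`/`except`, the explicit check leaves genuine errors (bad SQL, dropped connections) visible to the user.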

@jseldess jseldess force-pushed the perf-tuning-single-dc branch from 6e2ce87 to ffba014 Compare August 7, 2018 19:14
Contributor Author

@jseldess jseldess left a comment


_includes/v2.0/performance/tuning.py, line 1 at r2 (raw file):

Previously, bdarnell (Ben Darnell) wrote…

Add #!/usr/bin/env python as the first line of this script. (or python3. I think this script will work as-is on python 3, and that's the version that is preinstalled on current ubuntu).

One specific benefit of python 3 for this script is the new statistics module which would make it easy to print the standard deviation and median (not just the mean) without installing more packages. The stddev would be especially useful for the partitioning examples.

I'm going with ubuntu 16.04, since that's what we use internally. Seems like it's still python2 on that image, so I'll stick with that for now.
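
For reference, if the script later moves to Python 3, the `statistics` suggestion could look roughly like this. The `summarize` helper is hypothetical and the latencies are made-up values, not measurements from the tutorial.

```python
import statistics


def summarize(times_ms):
    """Return mean, median, and sample standard deviation of latencies in ms."""
    return {
        "mean": statistics.mean(times_ms),
        "median": statistics.median(times_ms),
        # stdev needs at least two data points; report 0.0 for a single run.
        "stdev": statistics.stdev(times_ms) if len(times_ms) > 1 else 0.0,
    }


# Made-up latencies, for illustration only.
summary = summarize([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])
print(summary)
```

A large standard deviation relative to the mean would flag cases where some executions cross regions and others do not, which is the situation the partitioning examples care about.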


_includes/v2.0/performance/tuning.py, line 39 at r2 (raw file):

Previously, bdarnell (Ben Darnell) wrote…

In that case you can guard it all like this:

if cur.description is not None:
    colnames = [desc[0] for desc in cur.description]

Done.


v2.0/performance-tuning.md, line 30 at r2 (raw file):

Previously, bdarnell (Ben Darnell) wrote…

I don't see any mention of which image to use here. (although the use of apt below implies the use of something in the debian/ubuntu family instead of fedora/redhat). We should specify the base image here (presumably ubuntu 18.04)

Done. Again, going with ubuntu 16.04, since that's what we use internally.


v2.0/performance-tuning.md, line 31 at r2 (raw file):

Previously, bdarnell (Ben Darnell) wrote…

Instead of "read/write testing", I'd call this "For running the client application workload".

Done.


v2.0/performance-tuning.md, line 128 at r2 (raw file):

Previously, bdarnell (Ben Darnell) wrote…

s/admin/web/

Done.


v2.0/performance-tuning.md, line 169 at r2 (raw file):

Previously, bdarnell (Ben Darnell) wrote…

We know this will give a permissions error on the VM setup we're recommending, so just put sudo in the command already.

Done.


v2.0/performance-tuning.md, line 269 at r2 (raw file):

Previously, knz (kena) wrote…

ensure a consistent use of tabs and spaces in this example.

Done.


v2.0/performance-tuning.md, line 406 at r2 (raw file):

Previously, bdarnell (Ben Darnell) wrote…

If this is the only external package you need, it might be better to just install it with apt-get instead of pip (using pip without virtualenv is prone to trouble). It also saves a step: apt-get install python-psycopg2 instead of apt-get install python-pip && pip install psycopg2-binary.

Or, as noted above, ubuntu 18.04 doesn't install python 2 by default, so it would be installed here. We could use python 3 instead (which will download and install less stuff here) with apt-get install python3-psycopg2

Done.

Going with ubuntu 16.04 since that's what we use internally.


v2.0/performance-tuning.md, line 450 at r2 (raw file):

Previously, bdarnell (Ben Darnell) wrote…

Optional: if you add chmod +x tuning.py after downloading it, the invocations can just be ./tuning.py instead of python tuning.py. This lets you avoid repeating the version of python used.

Done.

@jseldess jseldess force-pushed the perf-tuning-single-dc branch from ffba014 to 66f14f8 Compare August 7, 2018 19:18
@robert-s-lee
Contributor

  • "A few notes about the schema" uses the term "compound key", while "composite key" is used at https://www.cockroachlabs.com/docs/v2.0/split-at.html#split-a-table-with-a-composite-primary-key to refer to the same concept. There is a nuance between composite key and compound key. Is this truly a compound key?

  • "The rides table contains both city and the seemingly redundant vehicle_city. This redundancy is necessary": leaving myself a note to double-check this.

  • Leaseholder: mention that this is a CockroachDB performance optimization of the Raft protocol.

  • On the diagram, having the same colors for the leaseholder and replicas took time to explain. https://docs.google.com/presentation/d/1Tiq3lNmU-kOtJuppSqhYh0sAps2Rcs2rl7uH7eZ-z0c/edit#slide=id.g37076dbdf1_0_608 has @Kuan's thoughts on having the leaseholder and replica be the same color but hollow, to make it easier to identify the leaseholder and highlight its importance.

  • The sequence diagram mentions the write but does not mention that CockroachDB waits for the write to reach disk. This is an important distinction, as some other databases don't necessarily wait; CockroachDB does not assume that multiple nodes holding the data in RAM is good enough for durability. CockroachDB offers strong durability, so maybe add a quick description. This will also help explain why a fast disk, as well as a fast network, is important to CockroachDB.

  • Assuming network and I/O are the major latency bottlenecks, 1 ms network hops x 2 for a read result in "Retrieving a single row based on the primary key will usually return in 2ms or less": explain why a response time of 2 ms is expected and how changes in network and I/O latency will have a big impact on performance.

  • The same for the write.
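
The read arithmetic in the comment above can be sketched as a back-of-envelope calculation. This is an illustrative model only: the `estimated_latency_ms` helper is hypothetical, the 1 ms hop latency is the assumption stated in the comment, and the disk-sync figures are placeholders rather than CockroachDB's actual read or write path.

```python
def estimated_latency_ms(network_hops, hop_ms, disk_syncs=0, sync_ms=0.0):
    """Back-of-envelope latency: network hops plus any synchronous disk writes."""
    return network_hops * hop_ms + disk_syncs * sync_ms


# Read from the comment above: 2 hops at 1 ms each.
read_ms = estimated_latency_ms(network_hops=2, hop_ms=1.0)
print(read_ms)

# Doubling the per-hop latency doubles the estimate.
slow_network_ms = estimated_latency_ms(network_hops=2, hop_ms=2.0)
print(slow_network_ms)

# Placeholder numbers showing how a synchronous disk write adds to the total.
with_sync_ms = estimated_latency_ms(network_hops=2, hop_ms=1.0, disk_syncs=1, sync_ms=0.5)
print(with_sync_ms)
```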

@jseldess jseldess force-pushed the perf-tuning-single-dc branch from 66f14f8 to 85c38b1 Compare August 9, 2018 17:11
Contributor Author

jseldess commented Aug 9, 2018

Thanks, @robert-s-lee. I've changed compound to composite (based on this wikipedia entry, I was just using the wrong term), and I've included more details about how the leaseholder mechanism bypasses Raft. I'm not sure how to proceed on the other points. We can talk about them in person.

@jseldess
Contributor Author

@robert-s-lee, I expanded "Important concepts" to cover more about the Raft log and how it plays into writes, and I added a brief emphasis of network and disk i/o as performance bottlenecks. I don't think we can go into greater details in this tutorial without making the material much harder to parse. But I opened a docs issue to document the read and write paths in more detail separately. PTAL.

@jseldess jseldess force-pushed the perf-tuning-single-dc branch from 2a85b3b to 0405709 Compare August 11, 2018 04:07
@jseldess jseldess force-pushed the perf-tuning-single-dc branch from 0405709 to 87c1f10 Compare August 11, 2018 06:32
Contributor

@sploiselle sploiselle left a comment

Holy moly, this is incredible! Such an epic undertaking handled so deftly.

I left a small number of small comments, though I have two larger/structural things I'd consider.

Modularization

This is an epic guide, which might be daunting for some people to delve all the way into. If there's a way in which we could break the content out into smaller, more targeted use cases, I wonder if it might be slightly easier to get people to engage with it?

One possible tack for this would be to convert the meat of the content into a number of includes, which you could then embed in both this guide, as well as some smaller ones. Given the structure of this guide, it seems like a possible treatment would be four smaller guides:

  • Optimize Writes/Reads for Single-Region Deployments (i.e. 2 guides; one for reads, one for writes)
  • Optimize Writes/Reads for Multi-Region Deployments (ditto)

That being said, this suggestion is more intuition and idle speculation.

Making Promises Up Front

You did so much work to gather this information and substantiate the claims we make. I'd love to see more of this info surfaced higher in the document and made more of a headline, e.g.

"Using the procedures outlined in this guide we improved:

  • Reads in a single region by X%
  • Writes in a single region by Y%
  • Reads in multiple regions by Z%
  • Writes in multiple regions by W%"

Obviously you'd need a caveat blah blah, but I think that expressing the value of this guidance with numbers could make it much more exciting.

That all being said, this is still such an impressive guide. Bravo.



images/v2.0/perf_tuning_movr_schema.png, line 0 at r3 (raw file):
I would suggest changing the order of id and city in these tables. That they're in the inverted order feels unintuitive.


v2.0/performance-tuning.md, line 35 at r3 (raw file):

### Schema

You'll use sample schema and data for Cockroach Labs' fictional vehicle-sharing company, [MovR](https://github.com/cockroachdb/movr):

Having a brief description of what this schema represents might be helpful for creating a mental model of what these things represent.


v2.0/performance-tuning.md, line 216 at r3 (raw file):

    ~~~

4. Start the [built-in SQL shell](use-the-built-in-sql-client.html), pointing it one of the CockroachDB nodes:

point it at one


v2.0/performance-tuning.md, line 344 at r3 (raw file):

    Referencing columns | Referenced columns
    --------------------|-------------------
    `vehicles.city/vehicles.owner_id` | `users.city/users.id`

nit: I was expecting these slashes to be commas


v2.0/performance-tuning.md, line 535 at r3 (raw file):

#### Filtering by a secondary index

To speed up this query, add a secondary index on `name`:

This feels like it ought to be run through cockroach instead of tuning.py; ditto for all of the DDL statements.


v2.0/performance-tuning.md, line 944 at r4 (raw file):

~~~

This tells us that the index is stored in 2 ranges, with the leaseholders for both of them on node 1. We already know that the leaseholder for the `users` table is on node 2.

It's unclear to me what "We already know that the leaseholder for the users table is on..." is referring to.


v2.0/performance-tuning.md, line 1360 at r4 (raw file):

-->

Given that Movr is active on both US coasts, you'll now scale the cluster into two new regions, us-west1-a and us-west2-a, each with 3 nodes and an extra instance for simulating regional client traffic.

us-west1-a & co. could maybe use some kind of consistent highlighting?

@jseldess jseldess force-pushed the perf-tuning-single-dc branch from 87c1f10 to 8b04a2b Compare August 13, 2018 16:12
Contributor Author

@jseldess jseldess left a comment

Thank you for the review, @sploiselle! I made all of the small changes, and I'll add the promises up front as you suggest soon. I also agree that this is way too big and overwhelming for most users. Since I've been working on this version for a long while already, I'll publish the epic now and revisit ways to make it more modular and digestible as a next step. I like your ideas, though.



images/v2.0/perf_tuning_movr_schema.png, line at r3 (raw file):

Previously, sploiselle (Sean Loiselle) wrote…

I would suggest changing the order of id and city in these tables. That they're in the inverted order feels unintuitive.

I agree, and I wish we could! Because we're partitioning on city later, in the multi-region phase, city needs to be the first column in the primary key.


v2.0/performance-tuning.md, line 35 at r3 (raw file):

Previously, sploiselle (Sean Loiselle) wrote…

Having a brief description of what this schema represents might be helpful for creating a mental model of what these things represent.

Done.


v2.0/performance-tuning.md, line 216 at r3 (raw file):

Previously, sploiselle (Sean Loiselle) wrote…

point it at one

Done.


v2.0/performance-tuning.md, line 535 at r3 (raw file):

Previously, sploiselle (Sean Loiselle) wrote…

This feels like it ought to be run through cockroach instead of tuning.py; ditto for all of the DDL statements.

Done.


v2.0/performance-tuning.md, line 944 at r4 (raw file):

Previously, sploiselle (Sean Loiselle) wrote…

It's unclear to me what "We already know that the leaseholder for the users table is on..." is referring to.

Yeah, it's a few paragraphs above. I've made this a bit more explicit, but it's probably still tricky to follow.


v2.0/performance-tuning.md, line 1360 at r4 (raw file):

Previously, sploiselle (Sean Loiselle) wrote…

us-west1-a & co. could maybe use some kind of consistent highlighting?

Done.

Contributor

@sploiselle sploiselle left a comment


images/v2.0/perf_tuning_movr_schema.png, line at r3 (raw file):

Previously, jseldess (Jesse Seldess) wrote…

I agree, and I wish we could! Because we're partitioning on city later, in the multi-region phase, city needs to be the first column in the primary key.

Would it be possible to just put city first in the diagram?

Contributor

tim-o commented Aug 13, 2018

@jseldess - this is really incredible. I didn't run into any edits or nits worth picking. I second Sean's thoughts on the epic scale. One thought while I was reading through it: maybe a series of blog posts? In any case, agreed that publishing now and modularizing later is a good approach.

@jseldess jseldess force-pushed the perf-tuning-single-dc branch from 8b04a2b to 6e3f04d Compare August 13, 2018 21:22
Contributor Author

@jseldess jseldess left a comment

TFTR, @tim-o. Blog posts are certainly an option. In any case, I need to work on making the thing more digestible.



images/v2.0/perf_tuning_movr_schema.png, line at r3 (raw file):

Previously, sploiselle (Sean Loiselle) wrote…

Would it be possible to just put city first in the diagram?

Yes. I'm out of time for today's release, though, so I'll make this a follow-up.

@jseldess jseldess merged commit d947b85 into master Aug 13, 2018
@jseldess jseldess deleted the perf-tuning-single-dc branch August 13, 2018 21:30