[#21963] YSQL: Leverage aws-clock-bound library to reduce read restart errors.

Summary:
### Motivation

Prior to this revision, the physical clock used a constant 500ms time window for the possible clock skew between any two nodes in the cluster. Because this bound is a constant, it has to account for the worst-case scenario and is therefore very conservative. This leads to an excessive number of read restart errors; see https://docs.yugabyte.com/preview/architecture/transactions/read-restart-error/.

A better approach handles the clock error dynamically by leveraging the AWS clockbound library. Since the clock error is several orders of magnitude lower than the conservative constant bound, we raise far fewer read restart errors. In fact, the read latency improves significantly for the SqlStaleReadDetector yb-sample-apps workload.
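
To make the effect of the window size concrete, here is a minimal Python sketch of the restart condition. It assumes a simplified model based on the read restart documentation linked above: a read at `read_time` must restart when it encounters a record whose commit timestamp falls in the ambiguous window `(read_time, read_time + max_skew]`. The function names, the sample timestamps, and the 1ms dynamic bound are hypothetical and are not taken from the YugabyteDB code.

```python
def needs_restart(read_time_us: int, commit_time_us: int, max_skew_us: int) -> bool:
    """Return True if the commit time lies in the ambiguous window.

    Simplified model: records committed in (read_time, read_time + max_skew]
    may or may not precede the read in real time, so the read must restart.
    """
    global_limit_us = read_time_us + max_skew_us
    return read_time_us < commit_time_us <= global_limit_us


def count_restarts(read_time_us: int, commit_times_us: list[int], max_skew_us: int) -> int:
    """Count the records that would force a restart for a single read."""
    return sum(needs_restart(read_time_us, c, max_skew_us) for c in commit_times_us)


# Hypothetical commit timestamps (microseconds) around a read at t = 1_000_000.
READ_TIME_US = 1_000_000
COMMITS_US = [999_000, 1_000_200, 1_050_000, 1_400_000, 1_600_000]

# Constant 500ms bound versus an assumed, illustrative 1ms dynamic bound.
fixed_window = count_restarts(READ_TIME_US, COMMITS_US, 500_000)
dynamic_window = count_restarts(READ_TIME_US, COMMITS_US, 1_000)
print(fixed_window, dynamic_window)
```

With these made-up inputs, the narrower window leaves fewer records inside the ambiguous range, which is the mechanism behind the reduction in restart errors.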

This revision improves clock precision. It also limits the impact of faulty clocks on the cluster since only those nodes that are out of sync crash.

### About Clockbound

As mentioned above, we use the clockbound library to retrieve the uncertainty intervals for timestamps. Clockbound works in a server-client architecture where a clock-bound-d daemon is registered as a systemd service. This daemon queries chronyd for timestamp-related information and publishes the clock accuracy information and clock synchronization status to shared memory. The clockbound client then computes the current timestamp uncertainty interval based on the information in the shared memory.

NOTE: chronyd does not have sufficient information when using PTP. In such cases, clockbound augments clock error with error information from special device files.
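
The client-side computation described above can be sketched roughly as follows. This is not the clockbound API: `SharedClockState`, its fields, and `uncertainty_interval` are hypothetical names, and the rule that the error bound grows with an assumed worst-case drift rate since the daemon last published is an assumption made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SharedClockState:
    """Hypothetical stand-in for what the daemon publishes to shared memory."""
    synchronized: bool      # clock synchronization status
    error_bound_ns: int     # clock error bound when the daemon last published
    published_at_ns: int    # local clock reading at that moment
    max_drift_ppm: int      # ASSUMPTION: worst-case drift rate since then


def uncertainty_interval(now_ns: int, state: SharedClockState) -> tuple[int, int]:
    """Return (earliest_ns, latest_ns) bracketing the true time."""
    if not state.synchronized:
        raise RuntimeError("clock unsynchronized")
    # Widen the published bound by the drift that may have accumulated
    # since the daemon last refreshed the shared memory segment.
    elapsed_ns = max(0, now_ns - state.published_at_ns)
    drift_ns = elapsed_ns * state.max_drift_ppm // 1_000_000
    bound_ns = state.error_bound_ns + drift_ns
    return now_ns - bound_ns, now_ns + bound_ns


# Illustrative values only: 200us published bound, read one second later.
state = SharedClockState(True, 200_000, 1_000_000_000, 50)
earliest_ns, latest_ns = uncertainty_interval(2_000_000_000, state)
print(earliest_ns, latest_ns)
```

The point of the sketch is that the interval is derived per call from published state, rather than from a fixed cluster-wide constant.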

### Configuration

Configuring clockbound is a two-step process.

1. Configure the system to set up precise timestamps.
2. Configure the database to use these precise timestamps.

#### System Configuration

```
# Only when a PHC (PTP hardware clock) is available:
sudo bash ./bin/configure_ptp.sh
# In all cases:
sudo bash ./bin/configure_clockbound.sh
```

#### Database Configuration

Set the gflag `time_source=clockbound` on both tserver and master.

#### yugabyted Configuration

Autodetects AWS clusters and recommends configuring clockbound.

Provides the `--enhance_time_sync_via_clockbound` flag in the `yugabyted start` command.

1. Prechecks for chrony and clockbound configuration.
2. Configures the database with `time_source=clockbound`.
3. Autodetects PTP and configures `clockbound_clock_error_estimate` to an appropriate value.

### Design

#### Clockbound Client

The clockbound client library is compiled and packaged in the third party library repo. This is a library written in Rust that is linked to tserver and accessed through its C interface.

#### Clockbound Clock

Uses the clockbound library to get the uncertainty intervals. See the comment in `clockbound_clock.cc` for more information.

#### Fault Tolerance

When its clock goes out of sync, the node crashes and, as a result, is temporarily removed from the Raft groups it belongs to. This prevents stale read anomalies. Crashing also prevents the node from killing other nodes in the cluster, since it no longer propagates extremely skewed timestamps.
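
The fail-fast behavior can be illustrated with a small Python sketch. It is only a model of the policy described above, not the tserver implementation: `read_clock` is a hypothetical callable returning `(synchronized, earliest_us, latest_us)`, and `hybrid_time_or_crash` is an invented name.

```python
import sys
from typing import Callable, Tuple

ClockReading = Tuple[bool, int, int]  # (synchronized, earliest_us, latest_us)


def hybrid_time_or_crash(read_clock: Callable[[], ClockReading]) -> Tuple[int, int]:
    """Return the uncertainty interval, or exit the process if unsynchronized.

    Exiting is deliberate in this model: a node that keeps serving with a
    bad clock could return stale reads or hand skewed timestamps to peers.
    """
    synchronized, earliest_us, latest_us = read_clock()
    if not synchronized:
        sys.exit("FATAL: clock unsynchronized; shutting down")
    return earliest_us, latest_us


# A healthy (hypothetical) clock reading passes through unchanged.
print(hybrid_time_or_crash(lambda: (True, 10, 20)))
```

Only the node with the faulty clock stops; its peers never see its timestamps.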

#### Utilities

Includes the following additional utilities:

1. configure_ptp.sh
   - Installs network driver compiled with PHC.
   - Configures chrony to use PHC as refclock.
2. configure_clockbound.sh
   - Sets up chrony to give accurate timestamp uncertainty intervals.
   - Sets up the clockbound agent.
   - Sets up permissions.
3. clockbound_dump
   - Dumps the result of the clockbound_now client-side API.
   - Useful for computing clock error in external applications such as YBA.

Jira: DB-10879

Test Plan:
Jenkins: urgent, compile only

### Quick Benchmark (Not statistically significant)

Ran the SqlStaleReadDetector workload for 5 minutes and measured the number of restart read requests and the read latency per operation. The workload:

1. Increments random counters in write threads.
2. Aggregates the counter values in the read thread.

| Measurement           | WallClock | NtpClock | ClockboundClock | EST_ERROR=0 | NTP/PHC | PTP/PHC |
|-----------------------|-----------|----------|-----------------|-------------|---------|---------|
| Restart Read Requests | ~5k       | ~380     | ~70             | ~36         | ~5      | ~5      |
| Latency (ms/op)       | ~430      | ~150     | ~120            | ~105        | ~140*   | ~150*   |

The latencies are measured on the client side.

| Column | Description |
|--------|-------------|
| **WallClock** | Current clock implementation. |
| **ClockboundClock** | Proposed clock implementation, compatible with the wall clock. |
| **EST_ERROR=0** | Uses now=earliest and global_limit=latest, where the reference clock lies in the interval [earliest, latest]. |
| **NTP/PHC** | Same, but with the database running in the US N. Virginia region, where PHC is available. |
| **PTP/PHC** | Same, but using PTP for timestamps. |

*Higher latency is expected with PHC since the client is present in Oregon and the database is running in N. Virginia.

### Other benchmarks

Developed a few realistic apps in yb-sample-apps.

1. SqlEventCounter
2. SqlBankTransfers
3. SqlWarehouseStock
4. SqlMessageQueue
5. SqlConsistentHashing

They all demonstrate a reduction of several orders of magnitude in read restart errors, reinforcing the value of using AWS Time Sync Service and clockbound.

### Failure Scenarios

1. When clockbound is not set up and the user configures `time_source=clockbound`, the database fails to start with an error in the tserver.err log.

```
F0826 17:47:53.453330  4432 hybrid_clock.cc:157] Couldn't get the current time: Clock unsynchronized. Status: IO error (yb/util/clockbound_time.cc:145): clockbound API failed with error: No such file or directory, and detail: open
...
```

2. When SELinux permissions are not set correctly for clockbound to access the chronyd socket, `systemctl status` shows an error.

```
Aug 26 17:55:57 ip-10-9-10-243.us-west-2.compute.internal clockbound[32122]: 2024-08-26T17:55:57.318518Z ERROR ThreadId(02) /root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/clock-bound-d-1.0.0/src/chrony_poller.rs:73: No reply from chronyd. Is it running? Error: Os { code: 11, kind: WouldBlock, message: "Resource temporarily unavailable" }
```

Backport-through: 2024.2

Reviewers: sergei, mbautin, pjain

Reviewed By: sergei, mbautin, pjain

Subscribers: svc_phabricator, mbautin, sergei, rthallam, smishra, yql, ybase

Differential Revision: https://phorge.dev.yugabyte.com/D37365
pao214 committed Oct 9, 2024
1 parent f51e54d commit 28f27ee
Showing 22 changed files with 1,653 additions and 13 deletions.
7 changes: 6 additions & 1 deletion CMakeLists.txt
@@ -627,7 +627,7 @@ endif()
 function(ADD_THIRDPARTY_LIB LIB_NAME)
 set(options)
 set(one_value_args SHARED_LIB STATIC_LIB)
-set(multi_value_args DEPS)
+set(multi_value_args DEPS INCLUDE_DIRS)
 cmake_parse_arguments(ARG "${options}" "${one_value_args}" "${multi_value_args}" ${ARGN})
 if(ARG_UNPARSED_ARGUMENTS)
 message(SEND_ERROR "Error: unrecognized arguments: ${ARG_UNPARSED_ARGUMENTS}")
@@ -653,6 +653,11 @@ function(ADD_THIRDPARTY_LIB LIB_NAME)
 PROPERTIES IMPORTED_LINK_INTERFACE_LIBRARIES "${ARG_DEPS}")
 endif()

+if (ARG_INCLUDE_DIRS)
+target_include_directories(${LIB_NAME}
+SYSTEM INTERFACE "${ARG_INCLUDE_DIRS}")
+endif()
+
 # Set up an "exported variant" for this thirdparty library (see "Visibility"
 # above). It's the same as the real target, just with an "_exported" suffix.
 # We prefer the static archive if it exists (as it's akin to an "internal"
