[PLAT-15707] Updated ntpq parsing to assume milliseconds, not seconds
Summary:
ntpq -p reports the offset in milliseconds, not seconds, so the parsing was updated accordingly.

In addition, the awk check now looks for the line starting with '*', since that marks the active server.
If there is no active server, an error is returned.
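
The parsing rule described above can be sketched in Python as follows. This is a minimal illustration, not the code from this commit: the sample output, host names, and the helper name `active_peer_offset_ms` are assumptions made up for the example.

```python
# Illustrative `ntpq -p` output; the hosts and numbers are made up.
SAMPLE_NTPQ_OUTPUT = """\
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
+ntp1.example.com 192.0.2.10       2 u   33   64  377    0.512    0.873   0.101
*ntp2.example.com 192.0.2.20       2 u   12   64  377    0.430   -1.254   0.087
"""


def active_peer_offset_ms(ntpq_output):
    """Return the offset (in ms) of the peer marked '*', or None if there is none.

    Hypothetical helper: it mirrors the awk filter `$1 ~ "^*" {print $9}`.
    """
    for line in ntpq_output.splitlines():
        fields = line.split()
        # The first field of the active peer starts with '*'; the ninth
        # whitespace-separated field is the offset column, already in ms.
        if len(fields) >= 9 and fields[0].startswith("*"):
            return float(fields[8])
    return None  # no active server selected yet


print(active_peer_offset_ms(SAMPLE_NTPQ_OUTPUT))
```

Selecting by the '*' marker rather than by a fixed row number avoids reading the offset of a peer that is listed first but is not the one currently used for synchronization.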

Test Plan:
Deployed a universe and used ntp. Validated that health checks passed with no time drift and
failed with a drift.

Reviewers: muthu, nsingh

Reviewed By: muthu

Subscribers: yugaware

Differential Revision: https://phorge.dev.yugabyte.com/D39150
shubin-yb committed Oct 21, 2024
1 parent fc53c93 commit 5f6afb7
Showing 2 changed files with 12 additions and 6 deletions.
@@ -62,9 +62,15 @@ check_clock_sync_chrony() {

check_clock_sync_ntpd() {
#local skew=$(ntpq -c "rv 0 clock" | awk -F'[=,]' '/offset/ {print $3}')
-local skew=$(ntpq -p | awk 'NR==3 {print $9}')
+local skew=$(ntpq -p | awk '$1 ~ "^*" {print $9}')
+local acceptable_skew_ms=$(python -c 'print('${acceptable_clock_skew_sec}' * 1000)')
+
+if [[ -z "$skew" ]]; then
+echo "ntpd is not initialized"
+return 1
+fi

-if awk 'BEGIN{exit !('"$skew"' < '"$acceptable_clock_skew_sec"')}'; then
+if awk 'BEGIN{exit !('"$skew"' < '"$acceptable_skew_ms"')}'; then
echo "Clock skew is within acceptable limits: $skew ms"
return 0
else
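
The unit handling in this hunk can be sketched in Python: the configured limit is in seconds, the measured offset is in milliseconds, so the limit is scaled by 1000 before comparing. The function name and the 0.5 s limit are hypothetical. Note one deliberate difference: the shell check compares the signed offset, whereas this sketch compares its magnitude, which is an assumption about the intent rather than what the commit does.

```python
def skew_within_limit(offset_ms, acceptable_clock_skew_sec):
    """Return True if the offset magnitude is below the configured limit.

    Hypothetical helper. `offset_ms` is in milliseconds (as reported by
    `ntpq -p`); `acceptable_clock_skew_sec` is in seconds.
    """
    acceptable_skew_ms = acceptable_clock_skew_sec * 1000  # seconds -> ms
    # Assumption: use abs() so a large negative offset also fails the check.
    return abs(offset_ms) < acceptable_skew_ms


# Example with an assumed limit of 0.5 s (500 ms).
print(skew_within_limit(0.873, 0.5))    # small positive offset
print(skew_within_limit(-1254.0, 0.5))  # large negative offset
```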
8 changes: 4 additions & 4 deletions managed/src/main/resources/health/node_health.py.template
@@ -1835,7 +1835,7 @@ class NodeChecker():

service_error = service_status == 0
return e.fill_and_return_entry(
-["%s ms" % drift_ms, "ntp service is%s running" % "not" if service_error else ""],
+["%s ms" % drift_ms, "ntp service is%s running" % " not" if service_error else ""],
has_error=service_error, metrics=metrics)

def check_process_stats(self, process_name):
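
One detail worth noting when reading the status-message line in this hunk: in Python, `%` binds more tightly than a conditional expression, so the expression appears to evaluate to an empty string when `service_error` is false rather than to "ntp service is running". The sketch below isolates that behavior; both function names are hypothetical, and the parenthesized variant is only a possible alternative, not part of this commit.

```python
def status_as_written(service_error):
    # Parsed as: ("ntp service is%s running" % " not") if service_error else ""
    return "ntp service is%s running" % " not" if service_error else ""


def status_parenthesized(service_error):
    # Parentheses make the conditional choose only the inserted fragment.
    return "ntp service is%s running" % (" not" if service_error else "")


print(repr(status_as_written(True)))
print(repr(status_as_written(False)))
print(repr(status_parenthesized(False)))
```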
@@ -2137,9 +2137,9 @@ def _ntp_get_clock_drift_ms():
ntp_out = check_output("systemctl status ntp.service", env)
ntpd_out = check_output("systemctl status ntpd.service", env)
if "Active: active" in ntp_out or "Active: active" in ntpd_out:
-out = check_output("ntpq -p | awk 'NR==3 {print $9}'", env)
-if "Error" not in out:
-return int(float(out)*1000) # Convert seconds to milliseconds
+out = check_output("ntpq -p | awk '$1 ~ \"^*\" {print $9}'", env)
+if "Error" not in out and out.strip() != "":
+return int(float(out)) # ntpq -p offset is already in milliseconds
return "Failed to get clock drift from ntp(d)"

def _timesyncd_get_clock_drift_ms():