sql-server: Add behavior: auto_gc_behavior: enable. #8849

Merged
merged 20 commits into from
Feb 24, 2025

Conversation

reltuk
Contributor

@reltuk reltuk commented Feb 11, 2025

When Auto GC is enabled, the running sql-server will periodically collect a Dolt database that is growing in size. This behavior is currently experimental. Tuning the behavior around how often to collect is ongoing work.

@reltuk reltuk marked this pull request as ready for review February 11, 2025 03:55
@coffeegoddd
Contributor

@reltuk DOLT

comparing_percentages
100.000000 to 100.000000
version result total
dc4b94d ok 5937457
version total_tests
dc4b94d 5937457
correctness_percentage
100.0

@coffeegoddd
Contributor

@reltuk DOLT

comparing_percentages
100.000000 to 100.000000
version result total
9b53d20 ok 5937457
version total_tests
9b53d20 5937457
correctness_percentage
100.0

@coffeegoddd
Contributor

@reltuk DOLT

comparing_percentages
100.000000 to 100.000000
version result total
62e5032 ok 5937457
version total_tests
62e5032 5937457
correctness_percentage
100.0

@coffeegoddd
Contributor

@reltuk DOLT

comparing_percentages
100.000000 to 100.000000
version result total
f509d98 ok 5937457
version total_tests
f509d98 5937457
correctness_percentage
100.0

@coffeegoddd
Contributor

@coffeegoddd DOLT

comparing_percentages
100.000000 to 100.000000
version result total
10dfcec ok 5937457
version total_tests
10dfcec 5937457
correctness_percentage
100.0

@reltuk reltuk requested a review from zachmu February 12, 2025 22:04
Member

@zachmu zachmu left a comment

LGTM

// delivers when no GC is currently running.
done chan struct{}
// It simplifies the logic and efficiency of the
// implementation a bit to have a
Member

have a what?

type autoGCCommitHook struct {
c *AutoGCController
name string
// Always non-nil, if this channel delivers this channel
Member

Bit of a run-on

@coffeegoddd
Contributor

@reltuk DOLT

comparing_percentages
100.000000 to 100.000000
version result total
7c9c808 ok 5937457
version total_tests
7c9c808 5937457
correctness_percentage
100.0

@coffeegoddd
Contributor

@reltuk DOLT

comparing_percentages
100.000000 to 100.000000
version result total
1e20b67 ok 5937457
version total_tests
1e20b67 5937457
correctness_percentage
100.0

…test for auto-gc occurring on a standby replica.
…cks, plus gets triggered by a commit hook.

This helps handle the standby replica case, where commits come in through
remotesrv directly into the ChunkStore, and not through the datas.Database.
@coffeegoddd
Contributor

@reltuk DOLT

comparing_percentages
100.000000 to 100.000000
version result total
4ce49e6 ok 5937457
version total_tests
4ce49e6 5937457
correctness_percentage
100.0

…mount of garbage produced by replication test to improve speed.
@coffeegoddd
Contributor

@reltuk DOLT

comparing_percentages
100.000000 to 100.000000
version result total
837ae43 ok 5937457
version total_tests
837ae43 5937457
correctness_percentage
100.0

@coffeegoddd
Contributor

@reltuk DOLT

comparing_percentages
100.000000 to 100.000000
version result total
36b39df ok 5937457
version total_tests
36b39df 5937457
correctness_percentage
100.0

@coffeegoddd
Contributor

@reltuk DOLT

comparing_percentages
100.000000 to 100.000000
version result total
df99169 ok 5937457
version total_tests
df99169 5937457
correctness_percentage
100.0

@coffeegoddd
Contributor

@coffeegoddd DOLT

comparing_percentages
100.000000 to 100.000000
version result total
7f1e9dd ok 5937457
version total_tests
7f1e9dd 5937457
correctness_percentage
100.0

Member

@zachmu zachmu left a comment

LGTM, just a couple nits

}, nil
}
} else {
totalSz, err := cs.(chunks.TableFileStore).Size(ctx)
Member

Worth adding an ok check for this type assertion, rather than panic?

if h.lastSz == nil {
h.lastSz = &sz
}
const size_128mb = (1 << 27)
Member

Threshold values probably should be defined as constants.

Might also consider making these settable via environment vars. A reasonable testing strategy might be running benchmarks / other heavy IO processes multithreaded with a quite low threshold and making sure perf is reasonable and no deadlocks happen.

Contributor Author

Anything that needs to be tuneable is gonna be tuneable via config.yaml eventually, but this first pass just has them hard coded...

Same for the timer on the periodic check.

return nil
}

const checkInterval = 100 * time.Millisecond
Member

This seems really aggressive. Any reason to be this frequent by default?

Also, make configurable via env?

@coffeegoddd
Contributor

@reltuk DOLT

comparing_percentages
100.000000 to 100.000000
version result total
160b434 ok 5937457
version total_tests
160b434 5937457
correctness_percentage
100.0

@reltuk reltuk merged commit b557065 into main Feb 24, 2025
20 of 21 checks passed

@coffeegoddd DOLT

| test_name | detail | row_cnt | sorted | mysql_time | sql_mult | cli_mult |
| --- | --- | --- | --- | --- | --- | --- |
| batching | LOAD DATA | 10000 | 1 | 0.07 | 1.14 | |
| batching | batch sql | 10000 | 1 | 0.08 | 1.5 | |
| batching | by line sql | 10000 | 1 | 0.09 | 1.33 | |
| blob | 1 blob | 200000 | 1 | 0.88 | 4 | 4.74 |
| blob | 2 blobs | 200000 | 1 | 0.9 | 4.34 | 4.73 |
| blob | no blob | 200000 | 1 | 0.91 | 2.45 | 2.91 |
| col type | datetime | 200000 | 1 | 0.85 | 2.39 | 2.75 |
| col type | varchar | 200000 | 1 | 0.69 | 3.65 | 3.93 |
| config width | 2 cols | 200000 | 1 | 0.8 | 2.56 | 2.91 |
| config width | 32 cols | 200000 | 1 | 1.9 | 2.09 | 2.8 |
| config width | 8 cols | 200000 | 1 | 1.02 | 2.33 | 2.67 |
| pk type | float | 200000 | 1 | 2.76 | 0.77 | 0.86 |
| pk type | int | 200000 | 1 | 0.83 | 2.45 | 2.78 |
| pk type | varchar | 200000 | 1 | 1.51 | 1.75 | 1.83 |
| row count | 1.6mm | 1600000 | 1 | 5.77 | 2.97 | 2.94 |
| row count | 400k | 400000 | 1 | 1.52 | 2.73 | 2.82 |
| row count | 800k | 800000 | 1 | 2.91 | 2.91 | 2.97 |
| secondary index | four index | 200000 | 1 | 3.54 | 1.45 | 1.33 |
| secondary index | no secondary | 200000 | 1 | 0.94 | 2.39 | 2.82 |
| secondary index | one index | 200000 | 1 | 1.16 | 2.47 | 2.53 |
| secondary index | two index | 200000 | 1 | 2.02 | 1.79 | 1.86 |
| sorting | shuffled 1mm | 1000000 | 0 | 4.93 | 2.95 | 2.74 |
| sorting | sorted 1mm | 1000000 | 1 | 4.97 | 2.91 | 2.72 |


@coffeegoddd DOLT

| name | detail | mean_mult |
| --- | --- | --- |
| dolt_blame_basic | system table | 1.27 |
| dolt_blame_commit_filter | system table | 3.05 |
| dolt_commit_ancestors_commit_filter | system table | 0.63 |
| dolt_commits_commit_filter | system table | 1 |
| dolt_diff_log_join_from_commit | system table | 2.84 |
| dolt_diff_log_join_to_commit | system table | 2.8 |
| dolt_diff_table_from_commit_filter | system table | 1.18 |
| dolt_diff_table_to_commit_filter | system table | 1.16 |
| dolt_diffs_commit_filter | system table | 1.03 |
| dolt_history_commit_filter | system table | 1.36 |
| dolt_log_commit_filter | system table | 1.16 |


@coffeegoddd DOLT

| name | add_cnt | delete_cnt | update_cnt | latency |
| --- | --- | --- | --- | --- |
| adds_only | 60000 | 0 | 0 | 1.12 |
| adds_updates_deletes | 60000 | 60000 | 60000 | 4.63 |
| deletes_only | 0 | 60000 | 0 | 2.5 |
| updates_only | 0 | 0 | 60000 | 3.12 |

3 participants