ysql_bench: modify init phase to use multiple smaller transactions #3229

Closed
kmuthukk opened this issue Dec 27, 2019 · 1 comment
Labels
area/ysql Yugabyte SQL (YSQL) good first issue This is a good issue to start contributing!

Comments

@kmuthukk
Collaborator

Tried this on 2.0.8:

The initialize (data seeding) phase of ysql_bench seems to do the entire load in one large transaction. With a larger scale factor (e.g. the -s 20 option, which loads 2M rows), on a local cluster created with yb-ctl, we hit the following error:

$ ./postgres/bin/ysql_bench -h 127.0.0.1 -p 5433 -i -s 20 yugabyte
dropping old tables...
creating tables (with primary keys)...
generating data...
100000 of 2000000 tuples (5%) done (elapsed 0.02 s, remaining 0.36 s)
200000 of 2000000 tuples (10%) done (elapsed 0.04 s, remaining 0.36 s)
300000 of 2000000 tuples (15%) done (elapsed 1.64 s, remaining 9.32 s)
400000 of 2000000 tuples (20%) done (elapsed 3.18 s, remaining 12.72 s)
500000 of 2000000 tuples (25%) done (elapsed 4.70 s, remaining 14.11 s)
600000 of 2000000 tuples (30%) done (elapsed 6.23 s, remaining 14.55 s)
700000 of 2000000 tuples (35%) done (elapsed 7.03 s, remaining 13.05 s)
800000 of 2000000 tuples (40%) done (elapsed 8.55 s, remaining 12.82 s)
900000 of 2000000 tuples (45%) done (elapsed 10.07 s, remaining 12.31 s)
1000000 of 2000000 tuples (50%) done (elapsed 11.59 s, remaining 11.59 s)
1100000 of 2000000 tuples (55%) done (elapsed 13.11 s, remaining 10.72 s)
1200000 of 2000000 tuples (60%) done (elapsed 14.53 s, remaining 9.69 s)
1300000 of 2000000 tuples (65%) done (elapsed 15.91 s, remaining 8.57 s)
1400000 of 2000000 tuples (70%) done (elapsed 17.30 s, remaining 7.41 s)
1500000 of 2000000 tuples (75%) done (elapsed 19.34 s, remaining 6.45 s)
1600000 of 2000000 tuples (80%) done (elapsed 20.70 s, remaining 5.18 s)
1700000 of 2000000 tuples (85%) done (elapsed 22.09 s, remaining 3.90 s)
1800000 of 2000000 tuples (90%) done (elapsed 23.46 s, remaining 2.61 s)
1900000 of 2000000 tuples (95%) done (elapsed 24.83 s, remaining 1.31 s)
2000000 of 2000000 tuples (100%) done (elapsed 26.20 s, remaining 0.00 s)
server closed the connection unexpectedly
        This probably means the server terminated abnormally
        before or while processing the request.
PQendcopy failed

The ysql_bench data load phase seems to be using the transactional COPY command.

centos 18473 47.0 7.9 1574936 574040 ? Rsl 19:59 0:06 postgres: yugabyte yugabyte 127.0.0.1(51440) COPY

The log shows this error (note the same PID as in the line above):

2019-12-27 20:00:54.013 UTC [15466] LOG:  server process (PID 18473) was terminated by signal 9: Killed
2019-12-27 20:00:54.013 UTC [15466] DETAIL:  Failed process was running: copy ysql_bench_accounts from stdin

On nodes with more resources the operation can succeed, but even then the cluster stays busy for a while after the commit step (that is, after control has returned to the client), because the provisional records for the large transaction are finalized in the background.
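To make the size of the problem concrete, here is a minimal sketch of how the same load could be cut into bounded pieces. It assumes 100,000 account rows per scale unit (consistent with -s 20 producing 2,000,000 tuples in the log above). The helper name `batch_ranges` and the batch size of 1000 are illustrative assumptions, not anything taken from ysql_bench itself.

```python
# Assumption: 100,000 rows per scale unit, inferred from the log above
# (-s 20 -> 2,000,000 tuples).
ROWS_PER_SCALE_UNIT = 100_000


def batch_ranges(scale, batch_size):
    """Yield (first_id, last_id) row-id ranges, one per would-be transaction.

    Hypothetical helper for illustration; not part of ysql_bench.
    """
    total = scale * ROWS_PER_SCALE_UNIT
    for start in range(1, total + 1, batch_size):
        # The last range is clipped so it never runs past the total row count.
        yield start, min(start + batch_size - 1, total)


# batch_size=1000 is an arbitrary illustrative value.
ranges = list(batch_ranges(scale=20, batch_size=1000))
print(len(ranges), ranges[0], ranges[-1])
```

With these assumed values, the single 2M-row transaction becomes 2000 transactions of at most 1000 rows each, so the amount of uncommitted (provisional) data outstanding at any moment is bounded by the batch size rather than by the scale factor.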

kmuthukk added the area/ysql and good first issue labels on Dec 27, 2019
@kmuthukk
Collaborator Author

Tried on 2.0.9. The behavior is much the same.

The sudo perf top view during the commit phase looked something like this:

[Screenshot: Screen Shot 2019-12-27 at 12 14 52 PM]

psudheer21 added a commit that referenced this issue Jan 23, 2020
…bench. (#3371)

* Adding batching support for insertions into the table in ysql_bench.

Summary:
Currently, with large scale factors, we have trouble inserting data
into the table. This is because all of the inserts happen in one large
transaction.

This change persists the rows in much smaller transactions, with the
batch size configured using the 'batch-size' argument.

Reviewers: Kannan, Mihnea, Neha
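The batching idea in this summary can be sketched as follows. This is not the actual change (ysql_bench is a C program); it is a minimal Python illustration that uses an in-memory sqlite3 database purely as a runnable stand-in for a YSQL connection. The table `accounts_demo`, its columns, and the function `load_in_batches` are hypothetical names, and the `batch_size` parameter only mirrors the role of the 'batch-size' argument mentioned above.

```python
import sqlite3


def load_in_batches(conn, total_rows, batch_size):
    """Insert rows 1..total_rows, committing after every batch_size rows.

    Returns the number of transactions committed. Hypothetical helper;
    the real ysql_bench change is implemented in C.
    """
    commits = 0
    cur = conn.cursor()
    for start in range(1, total_rows + 1, batch_size):
        end = min(start + batch_size - 1, total_rows)
        rows = [(aid, 0) for aid in range(start, end + 1)]
        cur.executemany(
            "INSERT INTO accounts_demo (aid, abalance) VALUES (?, ?)", rows
        )
        # One commit per batch keeps each transaction small.
        conn.commit()
        commits += 1
    return commits


# sqlite3 is only a stand-in so the sketch runs without a cluster.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts_demo (aid INTEGER PRIMARY KEY, abalance INTEGER)"
)
commits = load_in_batches(conn, total_rows=10_000, batch_size=1_000)
count = conn.execute("SELECT count(*) FROM accounts_demo").fetchone()[0]
print(commits, count)
```

The trade-off in this style of loading is that a failure partway through leaves the earlier batches committed, so the load is no longer all-or-nothing; for a benchmark seeding step that is usually acceptable, since the tables can simply be dropped and reloaded.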

* Removing truncation of tables and insertion into ysql_bench_branches and
ysql_bench_tellers from the transaction.

Summary:
Truncate is not yet transactional in Yugabyte, so the truncation was
moved out of the transaction block.
Simple single-row insertions are faster, so the insertions into the
two tables should be faster when they are not part of the transaction.

Reviewers: Kannan, Mihnea, Neha