Enable sse2 for CSV parsing. #2977
&& *next_pos != delimiter && *next_pos != '\r' && *next_pos != '\n') /// NOTE You can make a SIMD version.
++next_pos;

[&]() {
Why a lambda? A plain code block {...} is OK.
There is an early return.
Ok
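To illustrate the point about the early return, here is a minimal sketch, not the code from this PR. The function name `findComma` and the 4-byte chunking are hypothetical and only stand in for the SIMD fast path. Inside an immediately invoked lambda, `return` leaves just the fast path, even from nested loops, and the scalar tail after it still runs. A plain `{...}` block would need a `goto` or a flag to do the same.

```cpp
#include <cstddef>

// Hypothetical helper: returns a pointer to the first ',' in [pos, end), or end.
inline const char * findComma(const char * pos, const char * end)
{
    // Fast path wrapped in an immediately invoked lambda.
    // `return` here exits only the lambda, not findComma.
    [&]()
    {
        for (; pos + 4 <= end; pos += 4)
            for (std::size_t i = 0; i < 4; ++i)
                if (pos[i] == ',')
                {
                    pos += i;
                    return; // early return out of both loops
                }
    }();

    // Scalar tail: always runs, and stops immediately if the fast path found a match.
    while (pos < end && *pos != ',')
        ++pos;
    return pos;
}
```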
[&]() {
#if __SSE2__
auto rc = _mm_set1_epi8('\r');
Shouldn't we include the corresponding ...intrin.h?
Yeah.
That's really cool 👍
dbms/src/IO/ReadHelpers.cpp
Outdated
__v2du a = reinterpret_cast<__v2du>(_mm_cmpeq_epi8(bytes, rc));
__v2du b = reinterpret_cast<__v2du>(_mm_cmpeq_epi8(bytes, nc));
__v2du c = reinterpret_cast<__v2du>(_mm_cmpeq_epi8(bytes, dc));
__m128i eq = reinterpret_cast<__m128i>(a | b | c);
Ok. But I don't understand how this is better than writing two _mm_or_si128 calls instead. Isn't __v2du less portable or less documented?
Hmm, lemme check the output assembly. gcc folks haven't answered back yet.
They're exactly the same. I'll change to _mm_or_si128
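For reference, a sketch of what the `_mm_or_si128` variant could look like. This is an assumption-laden illustration, not the final code of this PR: the function name `findCsvFieldEnd` is hypothetical, the unaligned load and the `__builtin_ctz` call (a GCC/Clang builtin) are my choices, and `<emmintrin.h>` is the standard SSE2 header.

```cpp
#include <cstddef>
#if defined(__SSE2__)
#include <emmintrin.h> // SSE2 intrinsics
#endif

// Hypothetical helper: returns a pointer to the first delimiter, '\r' or '\n'
// in [pos, end), or end if none is found.
inline const char * findCsvFieldEnd(const char * pos, const char * end, char delimiter)
{
#if defined(__SSE2__)
    const __m128i rc = _mm_set1_epi8('\r');
    const __m128i nc = _mm_set1_epi8('\n');
    const __m128i dc = _mm_set1_epi8(delimiter);

    // Check 16 bytes per iteration while a full chunk is available.
    for (; pos + 16 <= end; pos += 16)
    {
        const __m128i bytes = _mm_loadu_si128(reinterpret_cast<const __m128i *>(pos));
        // Combine the three byte-wise comparisons with two _mm_or_si128 calls.
        const __m128i eq = _mm_or_si128(
            _mm_or_si128(_mm_cmpeq_epi8(bytes, rc), _mm_cmpeq_epi8(bytes, nc)),
            _mm_cmpeq_epi8(bytes, dc));
        // One bit per byte; the lowest set bit is the first matching position.
        const int mask = _mm_movemask_epi8(eq);
        if (mask)
            return pos + __builtin_ctz(static_cast<unsigned>(mask));
    }
#endif
    // Scalar fallback for the tail, or for builds without SSE2.
    while (pos < end && *pos != delimiter && *pos != '\r' && *pos != '\n')
        ++pos;
    return pos;
}
```

Only documented intrinsics on `__m128i` are used here, so no casts to GCC vector types such as `__v2du` are needed.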
&& *next_pos != delimiter && *next_pos != '\r' && *next_pos != '\n') /// NOTE You can make a SIMD version.
++next_pos;

[&]() {
Ok
Testing data

```
select 'aaaaaaaa,bbbbbbbb,cccccccc,dddddddd,eeeeeeee,ffffffff,gggg,hhh' from numbers(3000000) into outfile '/tmp/test.csv'
```

Testing command

```
echo "select count() from file('/tmp/test.csv', CSV, 'a String, b String, c String, d String, e String, f String, g String, h String') where not ignore(e)" | clickhouse-benchmark
```

Before

```
QPS: 1.317, RPS: 3949749.687, MiB/s: 478.380, result RPS: 1.317, result MiB/s: 0.000.

0.000%      0.704 sec.
10.000%     0.712 sec.
20.000%     0.718 sec.
30.000%     0.726 sec.
40.000%     0.739 sec.
50.000%     0.754 sec.
60.000%     0.770 sec.
70.000%     0.788 sec.
80.000%     0.798 sec.
90.000%     0.815 sec.
95.000%     0.826 sec.
99.000%     0.850 sec.
99.900%     0.857 sec.
99.990%     0.858 sec.
```

After

```
QPS: 1.533, RPS: 4598308.336, MiB/s: 556.932, result RPS: 1.533, result MiB/s: 0.000.

0.000%      0.626 sec.
10.000%     0.635 sec.
20.000%     0.639 sec.
30.000%     0.642 sec.
40.000%     0.643 sec.
50.000%     0.645 sec.
60.000%     0.649 sec.
70.000%     0.652 sec.
80.000%     0.658 sec.
90.000%     0.682 sec.
95.000%     0.710 sec.
99.000%     0.727 sec.
99.900%     0.733 sec.
99.990%     0.734 sec.
```
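To put the Before and After figures side by side, here is a small sketch of the arithmetic. The constants are copied from the benchmark output above; the function names are hypothetical. On these numbers the change works out to roughly 16% more queries per second and roughly 14% lower median latency.

```cpp
// Figures copied from the Before/After benchmark output above.
constexpr double qps_before = 1.317;
constexpr double qps_after = 1.533;
constexpr double median_before_sec = 0.754; // 50.000% line, Before
constexpr double median_after_sec = 0.645;  // 50.000% line, After

// Relative throughput gain: 0.10 would mean 10% more queries per second.
constexpr double throughputGain()
{
    return qps_after / qps_before - 1.0;
}

// Relative reduction of the median query time: 0.10 would mean 10% faster.
constexpr double medianLatencyReduction()
{
    return 1.0 - median_after_sec / median_before_sec;
}
```

These are single-run numbers on one synthetic file, so the ratios are indicative only.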
I hereby agree to the terms of the CLA available at: https://yandex.ru/legal/cla/?lang=en