-
Notifications
You must be signed in to change notification settings - Fork 1.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
<execution>: Parallelize more algorithms #7
Comments
Should Here's how I try to compensate lack of #include <vector>
#include <execution>
std::vector<int> v_a(16);
std::vector<int> v_b(16);
std::vector<int> v_c(16);
void sum(int* c, int* a, int* b)
{
std::generate_n(std::execution::unseq, c, 16, [=]() mutable {
return *(a++) + *(b++);
});
}
int main()
{
sum(v_c.data(), v_a.data(), v_b.data());
} |
Your sample program is a great demonstration of why |
Why does execution policy parameter even exist for them? |
Because the 'design' process for the parallel algorithms was 'if nobody can come up with a reason why it can't be parallelized, without looking at real implementations, in 10 minutes, it gets a parallel one'. Note that |
I'm trying to make this work without #include <vector>
std::vector<int> v_a(16);
std::vector<int> v_b(16);
std::vector<int> v_c(16);
void sum(int c[], int a[], int b[])
{
#pragma loop(ivdep)
for (int i = 0; i < 16; i++)
{
c[i] = a[i] + b[i];
}
}
int main()
{
sum(v_c.data(), v_a.data(), v_b.data());
} Note that without a pragma the compilers still emits SIMD version, but goes to it conditionally at runtime. Now that this is useless: void sum(int * c, int* a, int * b)
{
std::transform(std::execution::unseq, a, a + 16, b, c, std::plus{});
} As well as this: void sum(int* c, int* a, int* b)
{
std::for_each(std::execution::unseq, c, c + 16, [=](int& v) {
v = a[&v - c] + b[&v - c];
});
} How else I can say it? |
I don't believe we have a way to say that without a pragma at this time. It's really on the optimizer team if they want to consume the unseq signal and they have declined to do so as of yet. |
Some strategies that can be used:
Just call parallel
transform
:replace_copy
replace_copy_if
Scans:
copy_if
partition_copy
remove_copy
remove_copy_if
unique
unique_copy
Same as serial
nth_element
but call the parallel partition op for large N:nth_element
Predicate tests (like
all_of
):lexicographical_compare
Summary statistics (like
find
/find_end
):min_element
max_element
minmax_element
Divide and conquer:
inplace_merge
stable_partition
Divide range1 into chunks, binary search to find matching range2 chunks, scan:
merge
set_symmetric_difference
set_union
Other:
includes
The text was updated successfully, but these errors were encountered: