Introduction of release notes and applied resolved comments in #2052 #2068
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -8,6 +8,68 @@ The Intel® oneAPI DPC++ Library (oneDPL) accompanies the Intel® oneAPI DPC++/C | |
and provides high-productivity APIs aimed to minimize programming efforts of C++ developers | ||
creating efficient heterogeneous applications. | ||
|
||
New in 2022.8.0 | ||
=============== | ||
|
||
New Features | ||
------------ | ||
- Added host policy support for ``histogram`` algorithms. | ||
- Added support for an undersized output range in the range-based ``merge`` algorithm. | ||
- Improved performance of the ``merge`` and sorting algorithms | ||
(``sort``, ``stable_sort``, ``sort_by_key``, ``stable_sort_by_key``) that rely on Merge sort [#fnote1]_, | ||
with device policies for large data sizes. | ||
- Improved performance of ``copy``, ``fill``, ``for_each``, ``replace``, ``reverse``, ``rotate``, ``transform`` and 30+ | ||
other algorithms with device policies on GPUs. | ||
- Improved oneDPL use with SYCL implementations other than Intel® oneAPI DPC++/C++ Compiler. | ||
|
||
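To make the undersized-output case for ``merge`` concrete, here is a minimal sketch in standard C++. It assumes that merging into an output that is too small simply stops once the output is full. The helper ``merge_bounded`` is hypothetical and is not part of the oneDPL API; it only illustrates the assumed semantics.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper, not oneDPL API: merges two sorted inputs but writes
// at most `capacity` elements, mimicking an undersized output range.
inline std::vector<int> merge_bounded(const std::vector<int>& a,
                                      const std::vector<int>& b,
                                      std::size_t capacity)
{
    std::vector<int> out;
    out.reserve(capacity);
    std::size_t i = 0, j = 0;
    // Stop as soon as the output is full or both inputs are exhausted.
    while (out.size() < capacity && (i < a.size() || j < b.size()))
    {
        // Take from `b` only when its element is strictly smaller,
        // so equal elements from `a` come first (stable merge).
        if (j == b.size() || (i < a.size() && !(b[j] < a[i])))
            out.push_back(a[i++]);
        else
            out.push_back(b[j++]);
    }
    return out;
}
```

Calling ``merge_bounded({1, 4, 7}, {2, 3, 9}, 4)`` keeps only the four smallest merged elements; the remaining input elements are left unconsumed.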
|
||
Fixed Issues | ||
------------ | ||
- Fixed an issue with ``drop_view`` in the experimental range-based API. | ||
- Fixed compilation errors in ``find_if`` and ``find_if_not`` with device policies where the user-provided predicate is | ||
device copyable but not trivially copyable. | ||
- Fixed incorrect results or synchronous SYCL exceptions for several algorithms when compiled with ``-O0`` and executed | ||
on a GPU device. | ||
- Fixed an issue preventing inclusion of the ``<numeric>`` header after ``<execution>`` and ``<algorithm>`` headers. | ||
- Fixed several issues in the ``sort``, ``stable_sort``, ``sort_by_key`` and ``stable_sort_by_key`` algorithms. The fixes: | ||
|
||
* Allow the use of non-trivially-copyable comparators. | ||
* Eliminate duplicate kernel names. | ||
* Resolve incorrect results on devices with sub-group sizes smaller than four. | ||
* Resolve synchronization errors that were seen on Intel® Arc™ B-series GPU devices. | ||
|
||
Known Issues and Limitations | ||
---------------------------- | ||
New in This Release | ||
^^^^^^^^^^^^^^^^^^^ | ||
- Incorrect results may be observed when calling ``sort`` with a device policy on Intel® Arc™ graphics 140V with data | ||
sizes of 4-8 million elements. | ||
- ``sort``, ``stable_sort``, ``sort_by_key`` and ``stable_sort_by_key`` algorithms fail to compile | ||
when using Clang 17 and earlier versions, as well as compilers based on these versions, | ||
such as Intel® oneAPI DPC++/C++ Compiler 2023.2.0. | ||
- When compiling code that uses device policies with the open source oneAPI DPC++ Compiler (clang++ driver), | ||
synchronous SYCL runtime exceptions regarding unfound kernels may be encountered unless an optimization flag is | ||
specified (for example, ``-O1``) as opposed to relying on the compiler's default optimization level. | ||
|
||
Existing Issues | ||
^^^^^^^^^^^^^^^ | ||
See oneDPL Guide for other `restrictions and known limitations`_. | ||
|
||
- ``histogram`` algorithm requires the output value type to be an integral type no larger than four bytes | ||
when used with an FPGA policy. | ||
- ``histogram`` may provide incorrect results with device policies in a program built with ``-O0`` option. | ||
- Compilation issues may be encountered when passing zip iterators to ``exclusive_scan_by_segment`` on Windows. | ||
- For ``transform_exclusive_scan`` and ``exclusive_scan`` to run in-place (that is, with the same data | ||
used for both input and destination) and with an execution policy of ``unseq`` or ``par_unseq``, | ||
it is required that the provided input and destination iterators are equality comparable. | ||
Furthermore, the equality comparison of the input and destination iterator must evaluate to true. | ||
If these conditions are not met, the result of these algorithm calls is undefined. | ||
- Incorrect results may be produced by ``exclusive_scan``, ``inclusive_scan``, ``transform_exclusive_scan``, | ||
``transform_inclusive_scan``, ``exclusive_scan_by_segment``, ``inclusive_scan_by_segment``, ``reduce_by_segment`` | ||
with ``unseq`` or ``par_unseq`` policy when compiled by Intel® oneAPI DPC++/C++ Compiler | ||
with ``-fiopenmp``, ``-fiopenmp-simd``, ``-qopenmp``, ``-qopenmp-simd`` options on Linux. | ||
To avoid the issue, pass ``-fopenmp`` or ``-fopenmp-simd`` option instead. | ||
|
||
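The in-place condition for ``exclusive_scan`` above can be sketched as follows. This is a minimal illustration that uses the serial ``std::exclusive_scan`` from standard C++17 so that it builds without oneDPL; with oneDPL the same pair of iterators would be passed together with an ``unseq`` or ``par_unseq`` policy. The helper name ``scan_in_place`` is hypothetical.

```cpp
#include <cassert>
#include <numeric>
#include <vector>

// Hypothetical helper: runs an exclusive scan whose destination is the
// input itself, and checks the condition described in the note above.
inline std::vector<int> scan_in_place(std::vector<int> data)
{
    auto first = data.begin();
    auto result = data.begin();
    // Input and destination iterators have the same type, so they are
    // equality comparable, and they compare equal: a valid in-place call.
    assert(first == result);
    std::exclusive_scan(first, data.end(), result, 0);
    return data;
}
```

For the input ``{1, 2, 3, 4}`` the returned vector is ``{0, 1, 3, 6}``. Passing input and destination iterators of different, non-comparable types over the same storage is the situation the note above describes as undefined for the ``unseq`` and ``par_unseq`` policies.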
New in 2022.7.0 | ||
=============== | ||
|
||
|