Merge develop into CK_upstream (Please don't squash when merging) #1165
* fix typo in ctest is_anabled->is_enabled * Extend timeout for retry stage w/ historical data Co-authored-by: Jun Liu <[email protected]>
* Remove the num_cu and device name from the db classes * Fix db installation and remove miopen.db * pdb fallback if cu count no match * Enable fallback for missing perf info * handle empty handle * handle string exceptions, add cu count to files. * convert function to lambda * convert function to lambda * remove unsupported embed target * update embedded db logic * guard miopen_data include * Remove unable * fix test verification error due to fft * address review comments: reduce logging verbosity * update perf data * vacuum files and remove old db file * fix cmake prompt help
…ESTS] Support for GFX90A_ENABLED/DISABLED. (#1110) * remove test_conv2d from skip test for gfx90a * add gfx90a to blacklist of conv_winoRxS_f2x3 * skip test_conv2d for gfx90a * use develop's ctest file * remove fix in solver. working for W/A in ctest * add W/A in winoRxS f2x3 solver
* [MLIR] Fixing WRW invoker by setting zero to weights * Set dw tensor to zero for only fp32 case
* Update rocm version to 4.3 * address return brace init list tidy check * disable navi timeout Co-authored-by: Jun Liu <[email protected]>
#1108) * add solver ConvIgemmFwdV6r1DlopsNchwKcyxNkhw; rename static ck source files * make inner product compatible on gfx900 * Update src/include/miopen/solver/ck_utility_common.hpp * compiler parameter use stream * use int instead of index_t in kernel wrapper * DynamicBuffer, StaticBuffer, amd_buffer_load support customized value for invalid element * Add dynamic generic reduction kernel layer (kernel wrappers, kernel implementations and utilities) * Some updates to dynamic composable kernel facility for the need of dynamic generic reduction * Update to generic reduction C++ host interface layer to support dynamic generic reduction * Update to remove tidy complaints in host interface layer * Change the unary operator form from void op(T &x) to T op(T x) * Update to pass single workspace pointer for all kernels (fix for OpenCL backend) * Use cppcheck-suppress to prevent some strange warnings * Re-use operator [] and () for DynamicBuffer and update to depending codes * Remove useless codes in first call threadwise/warpwise/blockwise kernel wrappers * [performance] Remove un-needed local buffer initialization Co-authored-by: Chao Liu <[email protected]> Co-authored-by: JD <[email protected]>
* make IsTunable function available from anysolver git-subtree-dir: fin git-subtree-split: 684ce5b0fc5853cb2b23213e94f4ea3ef3821745
…default for HIP backend && ROCm 4.3 (#1125) * disable dynamic reduction by default for HIP backend && ROCm 4.3 * [CI] Upgrade ROCm to 4.3.1 Co-authored-by: Jun Liu <[email protected]>
…efault for HIP backend && ROCm 4.3 (#1127)
* add gfx90a asm igemm config part * add asm igemm kernels for gfx90a * debug version * remove debug code * enable fwd and bwd asm igemm for nhwc layout * rename symbol * fix bug in fwd kernel * enable gfx90a test for asm igemm nhwc * fix error in cmake function * use GFX90a instead of GFX90A * WA 1093 disable test_conv2d * remove test_conv2d from skip test for gfx90a * re-open test_conv2d for ctest * add gfx90a to blacklist of conv_winoRxS_f2x3 * skip test_conv2d for gfx90a * fix typo Co-authored-by: Jun Liu <[email protected]>
* Fwd spatial multi-kernel bnorm solver * Scope reduction * Fixed solver ids order after merges Co-authored-by: Vasilii Filippov <[email protected]>
* Reduce number of dropout test case * define dropout single ctest
…=4x, 2x (#1114) * add support for several NHWC bwd ssd config, when k=4x, 2x * add ctest
* Implement tuning support for ConvMlirIgemmFwd * Bump MLIR commit to latest Co-authored-by: Jun Liu <[email protected]>
* Implemented the 3rd solver * format Co-authored-by: Vasilii Filippov <[email protected]>
…, when k=4x, 2x (#1136)
…n error in small cyx case (#1137) * add gfx908 fwd kernel for fix small cyx * add gfx90a fwd kernel for fix small cyx * add ctest for the failed case
…e on ROCM 4.3 (#1131) * Fix the calculation of ws_buf2_bytes_offset for dynamic reduction in src/reducetensor.cpp * Just remove IsDynamicReductionEnabled() * Tiny fix in ReduceTensorDescriptor::GetWorkspaceSize() * Update to the calculation of ws_buf2_bytes_offset
* Implement tuning support for ConvMlirIgemmFwdXdlops * Narrowing KPACK size tuning range to 4/8
* disable all asm kernels if xnack enabled * remove test_find_db, test_main, test_immed_conv2d from skip tests for gfx90a * fix clang-tidy Co-authored-by: Jun Liu <[email protected]>
* Fixed missing checkNumericsOutput in bnorm forward Co-authored-by: Vasilii Filippov <[email protected]> Co-authored-by: Jun Liu <[email protected]>
* ci-add-parameter-gpu-type(01) Revert "[TESTS][Navi21] Move Navi21 tests for release schedule (#1135)" This reverts commit 8b2f260. * Remove redundant DISABLE_ALL_STAGES parameter * ci-add-parameter-gpu-type(03) Refactor SMOKE_MIOPENTENSILE_LATEST * ci-add-parameter-gpu-type(04) Formatting * ci-add-parameter-gpu-type(05) Finally remove redundant DISABLE_ALL_STAGES * ci-add-parameter-gpu-type(06) Remove unused BUILD_PACKAGES * ci-add-parameter-gpu-type(07) Add all TARGET params. Support selection of TARGET_NOGPU * ci-add-parameter-gpu-type(08) Support selection of VEGA targets * ci-add-parameter-gpu-type(09) Support selection of TARGET_GFX908 * ci-add-parameter-gpu-type(10) Support selection of TARGET_GFX90A * ci-add-parameter-gpu-type(11) Support selection of TARGET_NAVI21 * ci-add-parameter-gpu-type(12) Re-implement [TESTS][Navi21] Move Navi21 tests for release schedule (#1135) * ci-add-parameter-gpu-type(13) Remove WORKAROUND_iGemm_936 from FP32 tests (resolve https://github.com/ROCmSoftwarePlatform/MIOpen/pull/147/files#r706276953)
* explicitly disable asm solvers for gfx90a * revert unnecessary gfx90a restrictions * fix ConvBinWinogradRxS formatting
* Bnorm backward solver core * Added return and CheckNumerics to the case when invoker was found. Co-authored-by: Vasilii Filippov <[email protected]>
* tuning updates for 4.4 * [TESTS][Vega][WORKAROUND] disable test_conv_embed_db on Vega Co-authored-by: Jehandad Khan <[email protected]> Co-authored-by: Artem Tamazov <[email protected]> Co-authored-by: Jun Liu <[email protected]>
…ult perf_config (#1159) * [MLIR] Add heuristic init request to represent the default perf_config * Refactor bwd with default perf_config * Refactor fwd xdlops with default perf_config * Refactor bwd xdlops with default perf_config * Address review feedbacks
@asroy this PR is more of a procedure to keep consistency and history. Please be careful not to select the "default" merge option; pick one from the drop-down instead. The first one would be okay?
@asroy The BTW I do not see any special need to create PRs for updates of |
…n' before entering agent. Syntax fixes/workarounds. (#1162) * Squashed commit of the following: commit a062e5f Author: Artem Tamazov <[email protected]> Date: Sat Sep 18 00:15:17 2021 +0300 [CI] 3rd fix attempt commit 5a6ab2d Author: Artem Tamazov <[email protected]> Date: Fri Sep 17 18:46:00 2021 +0300 [CI] More syntax fixes commit 7546281 Author: Artem Tamazov <[email protected]> Date: Fri Sep 17 18:29:58 2021 +0300 [CI] Fix syntax error commit c4aa858 Author: Artem Tamazov <[email protected]> Date: Fri Sep 17 18:12:13 2021 +0300 [CI] Fix: evaluate TARGET param in 'when' before entering agent. commit 43ee6c4 Merge: 7bf603a 7177b7c Author: Artem Tamazov <[email protected]> Date: Thu Sep 16 00:26:16 2021 +0300 Merge branch 'develop' into wip-rocmtest-trial-vega10 # RESOLVED Conflicts: # Jenkinsfile commit 7bf603a Author: Artem Tamazov <[email protected]> Date: Wed Jun 30 21:37:56 2021 +0300 Remove gfx908 from Smoke MLIR commit abe0930 Merge: ea52135 afbcf4c Author: Artem Tamazov <[email protected]> Date: Wed Jun 30 21:20:33 2021 +0300 Merge branch 'develop' into wip-rocmtest-trial-vega10 commit ea52135 Merge: 691f4cf 7e51452 Author: Artem Tamazov <[email protected]> Date: Wed Jun 30 21:19:34 2021 +0300 Merge branch 'develop' into wip-rocmtest-trial-vega10 # RESOLVED Conflicts: # Jenkinsfile commit 691f4cf Merge: 0523a04 856055f Author: Artem Tamazov <[email protected]> Date: Fri Apr 30 01:48:05 2021 +0300 Merge branch 'develop' into wip-rocmtest-trial-vega10 # RESOLVED Conflicts: # Jenkinsfile commit 0523a04 Author: Artem Tamazov <[email protected]> Date: Mon Apr 26 17:38:17 2021 +0300 Disable SMOKE_MIOPENTENSILE_LATEST (Vega20, gfx908) commit eb5a1f3 Merge: 4d84648 312cc22 Author: Artem Tamazov <[email protected]> Date: Mon Apr 26 17:34:19 2021 +0300 Merge branch 'develop' into wip-rocmtest-trial-vega10 commit 4d84648 Merge: 05c96e7 00d5754 Author: Artem Tamazov <[email protected]> Date: Wed Apr 21 16:36:53 2021 +0300 Merge branch 'develop' into 
wip-rocmtest-trial-vega10 # RESOLVED Conflicts: # Jenkinsfile commit 05c96e7 Merge: 9a642d4 6dc32fe Author: Artem Tamazov <[email protected]> Date: Wed Apr 14 23:35:56 2021 +0300 Merge branch 'develop' into wip-rocmtest-trial-vega10 # RESOLVED Conflicts: # Jenkinsfile commit 9a642d4 Author: omkar kakarparthi <[email protected]> Date: Sat Apr 10 15:56:22 2021 -0500 adding back rocm-3.7 commit 4da4fcc Author: omkar kakarparthi <[email protected]> Date: Sat Apr 10 01:06:47 2021 -0500 adidng rocm-4.1 commit 7ac2301 Merge: 1a57bab 1825f34 Author: Artem Tamazov <[email protected]> Date: Thu Apr 8 00:08:48 2021 +0300 Merge branch 'develop' into wip-rocmtest-trial-vega10 # RESOLVED Conflicts: # Jenkinsfile commit 1a57bab Author: Artem Tamazov <[email protected]> Date: Mon Mar 29 23:26:29 2021 +0300 Targeted to rocmtest-trial nodes. Removed all tests except intended for vega/vega10. * [CI] HOTFIX: Add docker user to the `render` group for Ubuntu 20.04 * [CI] Add render group to the dockerfile
646fcc268 Merge pull request #47 from ROCmSoftwarePlatform/develop 6014185ac [Bug Fix] GridwiseGemm_bk0mk1_bk0nk1_mn_xdlops_v2r4 loop issue (#44) 3e9113707 Merge pull request #46 from ROCmSoftwarePlatform/miopen_downstream_all 211dae822 Merge branch 'develop' into miopen_downstream_all 5890e3007 [Composable Kernel] update develop branch code to ck_upstream d5297abae fix bug in gridwise gemm xdlops v2r3 (#45) 38a90b6ed Merge pull request #43 from ROCmSoftwarePlatform/develop c3018794b bug fix (#39) fd49ff808 add nchw atomic , nhwc and nhwc atomic method for backward weight (#30) b2dc55f82 [MIOpen Downstream] Fix Reduction Kernel (#34) b3e8d57d5 Tweak GEMM kernel (#38) 846f462bd Add VectorType support into StaticBuffer (#27) dfb80c4e3 [Enhancements] Several bugfixes and refactoring of dynamic generic reduction (#1156) 8557901d0 Merge pull request #1165 from ROCmSoftwarePlatform/develop f305bebdc Merge pull request #31 from ROCmSoftwarePlatform/miopen_downstream-dynamic_reduction_pr b725e3fc8 Merge remote-tracking branch 'origin/develop' into miopen_downstream-dynamic_reduction_pr 88833bd9a Merge pull request #32 from ROCmSoftwarePlatform/develop df0d68106 :Merge remote-tracking branch 'origin/develop' into CK_upstream f3acd2510 Add a version of Merge transform that use integerdivision and mod (#25) 19613902b GEMM driver and kernel (#29) 627d8ef35 Backward weight v4r4r2 with xdlops (#18) 10bb81106 Misc fixes (#24) 9e80cdceb [SWDEV-281541][MSRCHA-100] Implementation of Dynamic Generic Reduction (#1108) a7a758d8c GlobalAtomicAdd for fp32/int32 (#23) 9d3f634a3 Xdlops refactor fix (#22) c6f26bb48 magic division use __umulhi() (#19) 6fe3627a9 Composable kernel init integration v3 (#1097) a2ad6d353 refactor dynamic xdlops iGemm (#13) ba6f79a75 Added host_conv_wrw for verification (#15) git-subtree-dir: src/composable_kernel git-subtree-split: 646fcc268ede841a16cdaafb68aa64803d8390e1
CK_upstream is the branch for sync between MIOpen and CK. Please don't squash when merging develop into CK_upstream, to keep the commit history simple.
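Two of the commit messages listed in this PR end with `git-subtree-dir:` and `git-subtree-split:` trailers, which record the subtree directory and the upstream commit it was synced from. As a rough illustration of the metadata carried in those messages, here is a minimal Python sketch that extracts the trailers from a message. The function name `parse_subtree_trailers` and the regular expression are assumptions made for this example, not part of MIOpen or of git itself.

```python
import re

# Matches the two trailer keys that appear in the commit messages quoted above.
# The pattern is an assumption for illustration; git's own parsing may differ.
TRAILER_RE = re.compile(r"git-subtree-(dir|split):\s*(\S+)")


def parse_subtree_trailers(message: str) -> dict:
    """Return the git-subtree trailers found in a commit message (hypothetical helper)."""
    return {f"git-subtree-{key}": value for key, value in TRAILER_RE.findall(message)}


# Commit message text taken from this PR's commit list.
msg = (
    "* make IsTunable function available from anysolver "
    "git-subtree-dir: fin "
    "git-subtree-split: 684ce5b0fc5853cb2b23213e94f4ea3ef3821745"
)

trailers = parse_subtree_trailers(msg)
print(trailers["git-subtree-dir"])
print(trailers["git-subtree-split"])
```

A squash merge replaces the individual commits with a single new commit, so per-commit messages like these would no longer exist as separate entries in the CK_upstream history.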