This repository has been archived by the owner on Nov 17, 2023. It is now read-only.

Use multi-tensor sumSQ in clip_global_norm #17652

Merged

MoisesHer
Contributor

Description

Use the multi-tensor sum of squares in gluon's clip_global_norm.
Instead of computing the sum of squares of each input array sequentially, one array at a time, compute them for all arrays in parallel with a multi-tensor operator.
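
For reference, the quantity being computed can be written out as follows. This uses the standard definition of a global 2-norm; the symbols $a_i$ (the $i$-th input array), $a_{i,j}$ (its $j$-th element), and $n$ (the number of arrays) are notation introduced here and do not appear in the PR.

```latex
\[
  \lVert a \rVert_{\text{global}}
    = \sqrt{\sum_{i=1}^{n} s_i},
  \qquad
  s_i = \sum_{j} a_{i,j}^{2}
\]
```

The per-array terms $s_i$ are independent of each other, which is what makes it possible to compute them together instead of one after another.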

Checklist

Essentials

Please feel free to remove inapplicable items for your PR.

  • Changes are complete (i.e. I finished coding on this PR)
  • All changes have test coverage
  • Code is well-documented
  • To the best of my knowledge, examples are either not affected by this change, or have been fixed to be compatible with this change

Changes

  • gluon: clip_global_norm now uses the mxnet.nd.multi_sum_sq op to compute the sum of squares of several arrays in parallel
  • The test tests/python/gpu/test_gluon_gpu:test_global_norm_clip_multi_device was extended to use 2 arrays on gpu(0) and 2 arrays on cpu(0), so that the multi-tensor sum of squares is tested on each of these contexts
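
The change listed above can be sketched roughly as follows. This is a pure-Python illustration, not the code merged in this PR: `multi_sum_sq` here is a hypothetical stand-in for the real operator, contexts are plain strings, arrays are Python lists, and the `eps` term and the returned norm are assumptions made for the sketch.

```python
import math
from collections import defaultdict


def multi_sum_sq(arrays):
    """Hypothetical stand-in for a multi-tensor op.

    A single call returns the sum of squares of every array passed in.
    """
    return [sum(x * x for x in arr) for arr in arrays]


def clip_global_norm(arrays, contexts, max_norm, eps=1e-8):
    """Rescale arrays in place so their global 2-norm is at most max_norm.

    eps is an assumed small constant that guards against division by zero.
    """
    # Group arrays by context so that each context needs only one
    # multi-tensor call instead of one call per array.
    groups = defaultdict(list)
    for arr, ctx in zip(arrays, contexts):
        groups[ctx].append(arr)

    # One sum-of-squares call per context, accumulated into a single total.
    total_sq = 0.0
    for group in groups.values():
        total_sq += sum(multi_sum_sq(group))

    total_norm = math.sqrt(total_sq)

    # Only shrink the arrays; never scale them up.
    scale = min(1.0, max_norm / (total_norm + eps))
    if scale < 1.0:
        for arr in arrays:
            for i in range(len(arr)):
                arr[i] *= scale
    return total_norm


# Two arrays on each of two contexts, mirroring the layout of the extended test.
grads = [[3.0, 0.0], [0.0, 4.0], [0.0], [0.0]]
ctxs = ["gpu0", "gpu0", "cpu0", "cpu0"]
norm = clip_global_norm(grads, ctxs, max_norm=1.0)
```

Grouping by context before the reduction keeps the number of sum-of-squares calls proportional to the number of contexts rather than the number of arrays.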

@MoisesHer MoisesHer requested a review from szha as a code owner February 21, 2020 10:56
@eric-haibin-lin eric-haibin-lin merged commit b133899 into apache:master Mar 23, 2020
anirudh2290 added a commit to anirudh2290/mxnet that referenced this pull request Mar 27, 2020
* 'master' of https://github.com/apache/incubator-mxnet: (192 commits)
  * impl - FFI for np einsum (apache#17869)
  [Numpy] FFI for diag/diagonal/diag_indices_from (apache#17789)
  [Numpy] Kron operator (apache#17323)
  cmake: Set DMLC_LOG_FATAL_THROW only for building mxnet and not for tvm (apache#17878)
  Add simplified HybridBlock.forward without F (apache#17530)
  Use FP32 copy of weights for norm (multitensor LAMB optimizer) (apache#17700)
  Use multi-tensor sumSQ in clip_global_norm (apache#17652)
  [Numpy] Add op fmax, fmin, fmod (apache#17567)
  Adding sparse support to MXTensor for custom operators (apache#17569)
  Update 3rdparty/mkldnn to v1.2.2 (apache#17313)
  Dynamic subgraph compile support (apache#17623)
  Refactor cpp-package CMakeLists.txt & add missing inference/imagenet_inference (apache#17835)
  staticbuild: Fix potential user-assisted execution of arbitrary code  (apache#17860)
  * FFI for np.argmax and np.argmin (apache#17843)
  ffi for roll/rot90 (apache#17861)
  Skip test_multi_worker_dataloader_release_pool on OS X (apache#17797)
  add ffi for full_like, binary (apache#17811)
  HybridBlock.export() to return created filenames (apache#17758)
  Fix SoftReLU fused operator numerical stability (apache#17849)
  CI: Test clang10 cpu & gpu builds with -WError (apache#17830)
  ...
MoisesHer added a commit to MoisesHer/incubator-mxnet that referenced this pull request Apr 10, 2020
* Use multi-tensor sumSQ in clip_global_norm

* fix pylint
anirudh2290 pushed a commit to anirudh2290/mxnet that referenced this pull request May 29, 2020
* Use multi-tensor sumSQ in clip_global_norm

* fix pylint
3 participants