NameError: name 'DeepSpeedCPUAdam' is not defined #488

Closed
lipiji opened this issue Oct 27, 2020 · 3 comments

Comments


lipiji commented Oct 27, 2020

I have installed cpu-adam, but I still hit the following issue:

Successfully installed deepspeed-0.3.0+d720fdb
Removed build tracker: '/tmp/pip-req-tracker-iqs3vipd'
[SUCCESS] deepspeed successfully imported.
[INFO] torch install path: ['/dockerdata/xx/anaconda3/lib/python3.7/site-packages/torch']
[INFO] torch version: 1.5.1+cu101, torch.cuda: 10.1
[INFO] deepspeed install path: ['/dockerdata/xx/anaconda3/lib/python3.7/site-packages/deepspeed']
[INFO] deepspeed info: 0.3.0+d720fdb, d720fdb, master
[SUCCESS] apex extensions successfully installed
[INFO] using new-style apex
[SUCCESS] fused lamb successfully installed.
[SUCCESS] transformer kernels successfully installed.
[WARNING] sparse attention is NOT installed.
[SUCCESS] cpu-adam (used by ZeRO-offload) successfully installed.
Installation is successful

=======

And when I run this script:
sh ./scripts/ds_zero-offload_10B_pretrain_gpt2_model_parallel.sh

Adam Optimizer #0 is created with AVX512 arithmetic capability.
Optimizer = DeepSpeedCPUAdam
Checking ZeRO support for optimizer=DeepSpeedCPUAdam type=<class 'deepspeed.ops.adam.cpu_adam.DeepSpeedCPUAdam'>
[2020-10-28 00:06:32,670] [INFO] [engine.py:613:_configure_zero_optimizer] Creating fp16 ZeRO stage 2 optimizer
...python3.7/site-packages/deepspeed/runtime/zero/stage2.py", line 161, in __init__
    and type(init_optimizer) == DeepSpeedCPUAdam)
NameError: name 'DeepSpeedCPUAdam' is not defined
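
Note that the install report and the traceback are consistent with each other: the cpu-adam extension builds and the class exists (its full path even appears in the "Checking ZeRO support" line), but deepspeed/runtime/zero/stage2.py compares against the name DeepSpeedCPUAdam without that module ever importing it, so the name lookup fails at line 161. A minimal sketch of the failure mode (a hypothetical stand-in, not DeepSpeed's verbatim source):

```python
# Sketch of why the NameError fires even though cpu-adam installed cleanly.
# This module mimics the check in deepspeed/runtime/zero/stage2.py but, like
# the buggy module, never imports DeepSpeedCPUAdam.

import torch

def check_cpu_offload(init_optimizer, cpu_offload):
    return (cpu_offload
            and type(init_optimizer) == DeepSpeedCPUAdam)  # NameError raised here

opt = torch.optim.Adam([torch.nn.Parameter(torch.zeros(1))])
check_cpu_offload(opt, cpu_offload=True)
# NameError: name 'DeepSpeedCPUAdam' is not defined

# The class itself is importable (its path appears in the log above), so the
# obvious fix is a one-line import in the offending module:
#   from deepspeed.ops.adam.cpu_adam import DeepSpeedCPUAdam
```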

lipiji (Author) commented Oct 28, 2020

lipiji closed this as completed Oct 28, 2020
lipiji reopened this Oct 28, 2020
jeffra (Collaborator) commented Oct 28, 2020

Hi @lipiji, sorry you're running into this issue. We have since fixed it, but the PR (#476) isn't merged yet (it should be merged by tomorrow). Feel free to try it out before it's merged if you'd like. Check out the notes at the top of the PR; there are a lot of changes coming to the install process.

lipiji closed this as completed Oct 29, 2020
linehammer commented:

Python knows the purposes of certain names (e.g. built-in functions); other names are defined within the program (e.g. variables). If Python encounters a name it doesn't recognize, you'll get a NameError: name 'xx' is not defined. In most cases this happens when Python sees a variable name (global or local) and doesn't know what it refers to: you forgot to initialize the variable, you misspelled it, or you misspelled a reserved word such as True. Before a global variable can be read inside a function, it must first be initialized somewhere, either outside the function or inside it.
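
A few self-contained examples of those cases (generic Python, unrelated to DeepSpeed):

```python
# Common ways to trigger NameError in plain Python.

greeting = "hello"      # a global, initialized at module level

def ok():
    print(greeting)     # fine: the global name exists by the time this is called

def misspelled():
    print(greting)      # NameError: name 'greting' is not defined

def too_early():
    print(farewell)     # NameError if called before 'farewell' is assigned below

ok()          # prints "hello"
too_early()   # raises NameError: name 'farewell' is not defined
farewell = "goodbye"
```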

jeffra added a commit that referenced this issue on Apr 11, 2023

* Merge chatgpt v2 to v3 - finalized (#484)

* [squash] staging chatgpt v1 (#463)

Co-authored-by: Reza Yazdani
Co-authored-by: yaozhewei
Co-authored-by: Tunji Ruwase

* [partial] formatting fixes

* quantizer fixes

* fix for bert tests

* formatting fixes

* re-enable _param_slice_mappings in z2

* Enable the QKV requires_grad when in training mode (#466)

Co-authored-by: Jeff Rasley

* fixes for attention enable_training flag

* commit to trigger CI

* fix for distil-bert param

* fixes for training context errors

* remove reza's qkv-optimization (#469)

Co-authored-by: Jeff Rasley

* Chatgpt - Fuse lora params at HybridEngine (#472)

Co-authored-by: Jeff Rasley

* add option to enable non-pin mode (#473)

* Chatgpt - fuse lora non pinned case (#474)

* Fix fuse/unfuse lora for Z3 and non-pinned parameter

* unfuse_lora_weight for non-pinned case

* fix the multiple issue for lora parameters

* formatting

* fuse lora only when available

---------

Co-authored-by: Jeff Rasley

* Chatgpt/release inference cache (#475)

* Fix fuse/unfuse lora for Z3 and non-pinned parameter

* unfuse_lora_weight for non-pinned case

* release/retake the inference cache after/before generate

* remove duplicated _fuse_lora function

* fix formatting

* fix hybrid-engine config issue

* update formatting

* Chatgpt - fuse qkv v2 (#478)

Co-authored-by: Jeff Rasley

* ChatGPT: Refactor Hybrid Engine Config (#477)

Co-authored-by: Lok Chand Koppaka

* Inference Workspace Tweaks (#481)

* Safety checks around inference workspace allocation, extra flushing

* Formatting fixes

* Merge fix

* Chatgpt/inference tp (#480)

* Update the merged-QKV weights only if there is a difference with the model parameter

* remove the hard-coded size

* always reset qkv params to updated ones after running step

* Add the inference-tp group and tensor sharding to run inference in model-parallel mode

* optimize the gather/mp-sharding part

* Add hybrid_engine changes

* fix config issue

* Formatting fixes. Reset_qkv duplicate removal.

* fix bloom container.

* fix format.

---------

Co-authored-by: Ammar Ahmad Awan
Co-authored-by: Lok Chand Koppaka

* fix formatting

* more clean-up

---------

Co-authored-by: Jeff Rasley
Co-authored-by: yaozhewei
Co-authored-by: Tunji Ruwase
Co-authored-by: Masahiro Tanaka
Co-authored-by: Michael Wyatt
Co-authored-by: Lok Chand Koppaka
Co-authored-by: Connor Holmes
Co-authored-by: Ammar Ahmad Awan

* fix a bug on lora-fusion (#487)

* Cholmes/v3 workspace bugfixes (#488)

* Miscellaneous workspace fixes, new config param

* Fix typo

---------

Co-authored-by: Reza Yazdani
Co-authored-by: Jeff Rasley
Co-authored-by: yaozhewei
Co-authored-by: Tunji Ruwase
Co-authored-by: Masahiro Tanaka
Co-authored-by: Michael Wyatt
Co-authored-by: Lok Chand Koppaka
Co-authored-by: Connor Holmes