When using `--fp8_base` together with `--save_state` in both `train_network` scripts (SD and SDXL), saving the training state crashes. The traceback below shows safetensors failing on a dict lookup for the dtype size: `torch.float8_e4m3fn` is missing from its internal `_SIZE` table. Updating safetensors to the latest version (0.4.2 as of writing) fixed the issue; a minimal repro is sketched after the traceback.
```
saving last state.
Traceback (most recent call last):
  File "/home/hope/src/sd/sd-scripts/train_network.py", line 1062, in <module>
    trainer.train(args)
  File "/home/hope/src/sd/sd-scripts/train_network.py", line 964, in train
    train_util.save_state_on_train_end(args, accelerator)
  File "/home/hope/src/sd/sd-scripts/library/train_util.py", line 4573, in save_state_on_train_end
    accelerator.save_state(state_dir)
  File "/home/hope/src/sd/sd-scripts/venv/lib/python3.11/site-packages/accelerate/accelerator.py", line 2708, in save_state
    save_location = save_accelerator_state(
                    ^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/hope/src/sd/sd-scripts/venv/lib/python3.11/site-packages/accelerate/checkpointing.py", line 99, in save_accelerator_state
    save(state, output_model_file, save_on_each_node=save_on_each_node, safe_serialization=safe_serialization)
  File "/home/hope/src/sd/sd-scripts/venv/lib/python3.11/site-packages/accelerate/utils/other.py", line 181, in save
    save_func(obj, f)
  File "/home/hope/src/sd/sd-scripts/venv/lib/python3.11/site-packages/safetensors/torch.py", line 232, in save_file
    serialize_file(_flatten(tensors), filename, metadata=metadata)
                   ^^^^^^^^^^^^^^^^^
  File "/home/hope/src/sd/sd-scripts/venv/lib/python3.11/site-packages/safetensors/torch.py", line 402, in _flatten
    return {
           ^
  File "/home/hope/src/sd/sd-scripts/venv/lib/python3.11/site-packages/safetensors/torch.py", line 406, in <dictcomp>
    "data": _tobytes(v, k),
            ^^^^^^^^^^^^^^
  File "/home/hope/src/sd/sd-scripts/venv/lib/python3.11/site-packages/safetensors/torch.py", line 362, in _tobytes
    bytes_per_item = _SIZE[tensor.dtype]
                     ~~~~~^^^^^^^^^^^^^^
KeyError: torch.float8_e4m3fn
```
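For reference, here is a minimal sketch that reproduces the failure outside the trainer, assuming a PyTorch build with float8 support (2.1+); the tensor and filename are illustrative, not taken from the scripts:

```python
import torch
from safetensors.torch import save_file

# A float8 tensor, like the weights produced when training with --fp8_base.
fp8_weight = torch.zeros(4, dtype=torch.float8_e4m3fn)  # illustrative tensor

# On the older safetensors in the venv this raises
# KeyError: torch.float8_e4m3fn, because its internal _SIZE dtype
# table (see the traceback above) has no entry for that dtype.
# On safetensors 0.4.2 the file saves cleanly.
save_file({"weight": fp8_weight}, "repro.safetensors")  # illustrative path
```

Upgrading with `pip install -U safetensors` (0.4.2 at the time of writing) was enough; no change to the training scripts was needed.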