While LORA training: permission denied for '/home/appuser/.cache/huggingface' #859

andrey-lepekhin · 2023-05-27T16:46:38Z

While trying to train a LoRA, I get a permission error:
PermissionError: [Errno 13] Permission denied: '/home/appuser/.cache/huggingface'

Full trace

accelerate launch --num_cpu_threads_per_process=2 "train_network.py" --enable_bucket --pretrained_model_name_or_path="runwayml/stable-diffusion-v1-5" --train_data_dir="/dataset/results_innop/img" --reg_data_dir="/dataset/results_innop/reg" --resolution=512,512 --output_dir="/dataset/results_innop/model" --logging_dir="/dataset/results_innop/log" --network_alpha="1" --save_model_as=safetensors --network_module=networks.lora --text_encoder_lr=5e-05 --unet_lr=0.0001 --network_dim=128 --output_name="innop_v1" --lr_scheduler_num_cycles="15" --learning_rate="0.0001" --lr_scheduler="cosine" --lr_warmup_steps="2268" --train_batch_size="1" --max_train_steps="22680" --save_every_n_epochs="3" --mixed_precision="fp16" --save_precision="fp16" --cache_latents --optimizer_type="AdamW8bit" --max_data_loader_n_workers="0" --bucket_reso_steps=64 --xformers --bucket_no_upscale
2023-05-27 17:34:51.258871: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 AVX512F AVX512_VNNI FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2023-05-27 17:34:52.500315: I tensorflow/core/util/util.cc:169] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.
2023-05-27 17:34:53.085678: E tensorflow/stream_executor/cuda/cuda_blas.cc:2981] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
There was a problem when trying to write in your cache folder (/home/appuser/.cache/huggingface/hub). You should set the environment variable TRANSFORMERS_CACHE to a writable directory.
The following values were not passed to `accelerate launch` and had defaults used instead:
	`--num_processes` was set to a value of `1`
	`--num_machines` was set to a value of `1`
	`--mixed_precision` was set to a value of `'no'`
	`--dynamo_backend` was set to a value of `'no'`
To avoid this warning pass in values for each of the problematic parameters or run `accelerate config`.
2023-05-27 17:35:02.229659: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 AVX512F AVX512_VNNI FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2023-05-27 17:35:02.403447: I tensorflow/core/util/util.cc:169] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.
2023-05-27 17:35:02.466345: E tensorflow/stream_executor/cuda/cuda_blas.cc:2981] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
There was a problem when trying to write in your cache folder (/home/appuser/.cache/huggingface/hub). You should set the environment variable TRANSFORMERS_CACHE to a writable directory.
prepare tokenizer


Traceback (most recent call last):
  File "/app/train_network.py", line 783, in <module>
    train(args)
  File "/app/train_network.py", line 78, in train
    tokenizer = train_util.load_tokenizer(args)
  File "/app/library/train_util.py", line 2902, in load_tokenizer
    tokenizer = CLIPTokenizer.from_pretrained(original_path)
  File "/usr/local/lib/python3.10/dist-packages/transformers/tokenization_utils_base.py", line 1763, in from_pretrained
    resolved_vocab_files[file_id] = cached_file(
  File "/usr/local/lib/python3.10/dist-packages/transformers/utils/hub.py", line 409, in cached_file
    resolved_file = hf_hub_download(
  File "/usr/local/lib/python3.10/dist-packages/huggingface_hub/utils/_validators.py", line 120, in _inner_fn
    return fn(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/huggingface_hub/file_download.py", line 1131, in hf_hub_download
    os.makedirs(storage_folder, exist_ok=True)
  File "/usr/lib/python3.10/os.py", line 215, in makedirs
    makedirs(head, exist_ok=exist_ok)
  File "/usr/lib/python3.10/os.py", line 215, in makedirs
    makedirs(head, exist_ok=exist_ok)
  File "/usr/lib/python3.10/os.py", line 225, in makedirs
    mkdir(name, mode)
PermissionError: [Errno 13] Permission denied: '/home/appuser/.cache/huggingface'

andrey-lepekhin · 2023-05-27T16:47:58Z

Manually creating .cache/huggingface on the host and adding it to docker-compose.yaml as a writable directory solves the problem.
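An alternative to changing the Docker setup is to follow the hint in the log and point the Hugging Face cache at a writable directory. Below is a minimal sketch of that idea, not something taken from the kohya_ss code: the helper names `pick_writable_cache` and `configure_hf_cache` are hypothetical, and it assumes the installed `huggingface_hub` and `transformers` versions read the `HF_HOME` and `TRANSFORMERS_CACHE` environment variables when they are imported.

```python
import os


def pick_writable_cache(candidates):
    """Return the first directory in `candidates` that can be created and written to.

    Hypothetical helper for illustration only.
    """
    for path in candidates:
        try:
            os.makedirs(path, exist_ok=True)
        except OSError:
            # Covers PermissionError: [Errno 13], the failure shown in the trace.
            continue
        if os.access(path, os.W_OK | os.X_OK):
            return path
    raise RuntimeError("no writable cache directory found")


def configure_hf_cache(candidates):
    """Export the chosen cache directory through environment variables.

    Assumption: this runs before transformers / huggingface_hub are imported,
    since those libraries are expected to read the variables at import time.
    """
    cache_dir = pick_writable_cache(candidates)
    os.environ["HF_HOME"] = cache_dir
    # TRANSFORMERS_CACHE is the variable named in the warning in the log above.
    os.environ["TRANSFORMERS_CACHE"] = os.path.join(cache_dir, "hub")
    return cache_dir
```

One possible use would be to call `configure_hf_cache([os.path.expanduser("~/.cache/huggingface"), "/tmp/huggingface"])` at the very top of the training entry point, where `/tmp/huggingface` is only an example fallback. A cache under `/tmp` does not survive container rebuilds, so the mounted directory described above is probably still the better fix for repeated runs.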

andrey-lepekhin added a commit to andrey-lepekhin/kohya_ss that referenced this issue May 27, 2023

Update docker-compose.yaml

fac8759

Add a writable cache directory for huggingface. Fixes bmaltais#859

bmaltais pushed a commit that referenced this issue Oct 10, 2023

may work dropout in LyCORIS #859

025368f

bmaltais closed this as completed Jan 29, 2024
