I encountered an error while running the experiment with the following configuration:
experience.experiment_name=HAPPIER_SOP
experience.log_dir=experiments/HAPPIER
experience.seed=0
experience.max_iter=100
experience.warmup_step=5
experience.accuracy_calculator.compute_for_hierarchy_levels=[0,1]
optimizer=sop
model=resnet_ln
transform=sop
dataset=sop
loss=HAPPIER_SOP
The error occurred during the evaluation phase at the 20th epoch. I ran the experiment three times without making any changes to the code, and the error persisted.
Environment Details:
Platform: Snellius server
Torch version: 1.8.1 with CUDA 11.1
Error Message:
IndexError: index 0 is out of bounds for dimension 0 with size 0
Explanation:
The IndexError is raised in DataLoader worker process 0 while fetching a batch: in base_dataset.py, self.relevances[idx, :] is indexed with idx = 0, but self.relevances is empty along dimension 0 for the evaluation dataset.
Please advise on how to resolve this issue or if any additional information is needed.
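Since the exception is raised inside a DataLoader worker (full traceback below), one way to surface it directly is to iterate the evaluation loader in the main process with num_workers=0. This is only a debugging sketch; test_dataset here stands in for the HAPPIER test split built by the config and is not the repo's exact API:

from torch.utils.data import DataLoader

def debug_fetch(test_dataset, batch_size=128):
    # num_workers=0 runs __getitem__ in the main process, so the IndexError
    # is raised with its original traceback instead of being re-raised from a worker.
    loader = DataLoader(test_dataset, batch_size=batch_size, num_workers=0, shuffle=False)
    for batch in loader:
        pass  # iterating is enough to hit the failing __getitem__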
Here is the error:
[2024-05-14 13:59:07,784][HAPPIER][INFO] - Training : @epoch #20 for model HAPPIER_SOP
[2024-05-14 13:59:07,785][HAPPIER][INFO] - Shuffling data
[2024-05-14 13:59:10,006][HAPPIER][INFO] - Shuffling data
[2024-05-14 14:03:03,794][HAPPIER][INFO] - Evaluating for epoch 20
[2024-05-14 14:03:03,797][HAPPIER][INFO] - Getting embeddings for the test set
[2024-05-14 14:03:03,798][HAPPIER][INFO] - Computing embeddings
Error executing job with overrides: ['experience.experiment_name=HAPPIER_SOP', 'experience.log_dir=experiments/HAPPIER', 'experience.seed=0', 'experience.max_iter=100', 'experience.warmup_step=5', 'experience.accuracy_calculator.compute_for_hierarchy_levels=[0,1]', 'optimizer=sop', 'model=resnet_ln', 'transform=sop', 'dataset=sop', 'loss=HAPPIER_SOP']
Traceback (most recent call last):
File "/home/scur2061/hyperhap/HAPPIER/happier/run.py", line 158, in run
return eng.train(
File "/home/scur2061/hyperhap/HAPPIER/happier/engine/train.py", line 89, in train
metrics = evaluate(
File "/home/scur2061/hyperhap/HAPPIER/happier/lib/get_set_random_state.py", line 43, in wrapper
output = func(*args, **kwargs)
File "/home/scur2061/hyperhap/HAPPIER/happier/engine/accuracy_calculator.py", line 218, in evaluate
return acc.evaluate(
File "/home/scur2061/hyperhap/HAPPIER/happier/engine/accuracy_calculator.py", line 138, in evaluate
features, labels, relevances = self.get_embeddings(net, dts)
File "/home/scur2061/hyperhap/HAPPIER/happier/engine/accuracy_calculator.py", line 60, in get_embeddings
return compute_embeddings(
File "/home/scur2061/hyperhap/HAPPIER/happier/engine/compute_embeddings.py", line 19, in compute_embeddings
for i, batch in enumerate(tqdm(loader, disable=os.getenv("TQDM_DISABLE"))):
File "/home/scur2061/.conda/envs/happier_env_111/lib/python3.9/site-packages/tqdm/std.py", line 1183, in iter
for obj in iterable:
File "/home/scur2061/.conda/envs/happier_env_111/lib/python3.9/site-packages/torch/utils/data/dataloader.py", line 517, in next
data = self._next_data()
File "/home/scur2061/.conda/envs/happier_env_111/lib/python3.9/site-packages/torch/utils/data/dataloader.py", line 1199, in _next_data
return self._process_data(data)
File "/home/scur2061/.conda/envs/happier_env_111/lib/python3.9/site-packages/torch/utils/data/dataloader.py", line 1225, in _process_data
data.reraise()
File "/home/scur2061/.conda/envs/happier_env_111/lib/python3.9/site-packages/torch/_utils.py", line 429, in reraise
raise self.exc_type(msg)
IndexError: Caught IndexError in DataLoader worker process 0.
Original Traceback (most recent call last):
File "/home/scur2061/.conda/envs/happier_env_111/lib/python3.9/site-packages/torch/utils/data/_utils/worker.py", line 202, in _worker_loop
data = fetcher.fetch(index)
File "/home/scur2061/.conda/envs/happier_env_111/lib/python3.9/site-packages/torch/utils/data/_utils/fetch.py", line 44, in fetch
data = [self.dataset[idx] for idx in possibly_batched_index]
File "/home/scur2061/.conda/envs/happier_env_111/lib/python3.9/site-packages/torch/utils/data/_utils/fetch.py", line 44, in
data = [self.dataset[idx] for idx in possibly_batched_index]
File "/home/scur2061/hyperhap/HAPPIER/happier/datasets/base_dataset.py", line 118, in getitem
relevances = self.relevances[idx, :]
IndexError: index 0 is out of bounds for dimension 0 with size 0
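The last frame indexes self.relevances[idx, :] on an array whose first dimension has size 0, so the relevance matrix of the evaluation dataset appears to be empty. A quick sanity check I can add before evaluation (sketch only; the relevances attribute name comes from the traceback, everything else is illustrative):

def check_relevances(test_dataset):
    # Compare the number of samples with the number of rows in the relevance matrix.
    n = len(test_dataset)
    rel = getattr(test_dataset, "relevances", None)
    if rel is None:
        print("dataset has no `relevances` attribute")
        return
    print(f"len(dataset) = {n}, relevances.shape = {tuple(rel.shape)}")
    assert rel.shape[0] == n, "relevances is empty or shorter than the dataset"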
In the job script, I load the following modules:
module load 2021
module load cuDNN/8.2.1.32-CUDA-11.3.1
module load 2022
module load Anaconda3/2022.05
and this is how I created the environment:
conda create --name happier
conda activate happier
conda install python=3.9
conda install -c conda-forge cudatoolkit=11.1
pip install torch==1.8.1+cu111 torchvision==0.9.1+cu111 torchaudio==0.8.1 -f https://download.pytorch.org/whl/torch_stable.html
and then installed the remaining dependencies from requirements.txt.
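To rule out a mismatch between the loaded CUDA modules (cuDNN for CUDA 11.3.1) and the cu111 wheels, I verify the install with a standard PyTorch check (not specific to HAPPIER):

import torch, torchvision

# Confirm the wheel versions and that a GPU is visible from this environment.
print(torch.__version__, torchvision.__version__)  # expected: 1.8.1+cu111 / 0.9.1+cu111
print(torch.version.cuda)                           # expected: 11.1
print(torch.cuda.is_available(), torch.cuda.device_count())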