IndexError: index 0 is out of bounds for dimension 0 with size 0 for SOP Dataset #9

Open
Alihan-Inc opened this issue May 14, 2024 · 0 comments

I encountered an error while running the experiment with the following configuration:

experience.experiment_name=HAPPIER_SOP
experience.log_dir=experiments/HAPPIER
experience.seed=0
experience.max_iter=100
experience.warmup_step=5
experience.accuracy_calculator.compute_for_hierarchy_levels=[0,1]
optimizer=sop
model=resnet_ln
transform=sop
dataset=sop
loss=HAPPIER_SOP

The error occurred during the evaluation phase at the 20th epoch. I ran the experiment three times without making any changes to the code, and the error persisted.

Environment Details:

Platform: Snellius server
Torch version: 1.8.1 with CUDA 11.1

Error Message:
IndexError: index 0 is out of bounds for dimension 0 with size 0

Explanation:
The IndexError is raised inside DataLoader worker process 0 while the embeddings for the test set are being computed. The failing statement is relevances = self.relevances[idx, :] in happier/datasets/base_dataset.py (line 118), which means self.relevances has size 0 along dimension 0 (it is empty) when the test dataset is indexed.

Please advise on how to resolve this issue or if any additional information is needed.

Here is the error:
[2024-05-14 13:59:07,784][HAPPIER][INFO] - Training : @epoch #20 for model HAPPIER_SOP
[2024-05-14 13:59:07,785][HAPPIER][INFO] - Shuffling data
[2024-05-14 13:59:10,006][HAPPIER][INFO] - Shuffling data
[2024-05-14 14:03:03,794][HAPPIER][INFO] - Evaluating for epoch 20
[2024-05-14 14:03:03,797][HAPPIER][INFO] - Getting embeddings for the test set
[2024-05-14 14:03:03,798][HAPPIER][INFO] - Computing embeddings
Error executing job with overrides: ['experience.experiment_name=HAPPIER_SOP', 'experience.log_dir=experiments/HAPPIER', 'experience.seed=0', 'experience.max_iter=100', 'experience.warmup_step=5', 'experience.accuracy_calculator.compute_for_hierarchy_levels=[0,1]', 'optimizer=sop', 'model=resnet_ln', 'transform=sop', 'dataset=sop', 'loss=HAPPIER_SOP']
Traceback (most recent call last):
File "/home/scur2061/hyperhap/HAPPIER/happier/run.py", line 158, in run
return eng.train(
File "/home/scur2061/hyperhap/HAPPIER/happier/engine/train.py", line 89, in train
metrics = evaluate(
File "/home/scur2061/hyperhap/HAPPIER/happier/lib/get_set_random_state.py", line 43, in wrapper
output = func(*args, **kwargs)
File "/home/scur2061/hyperhap/HAPPIER/happier/engine/accuracy_calculator.py", line 218, in evaluate
return acc.evaluate(
File "/home/scur2061/hyperhap/HAPPIER/happier/engine/accuracy_calculator.py", line 138, in evaluate
features, labels, relevances = self.get_embeddings(net, dts)
File "/home/scur2061/hyperhap/HAPPIER/happier/engine/accuracy_calculator.py", line 60, in get_embeddings
return compute_embeddings(
File "/home/scur2061/hyperhap/HAPPIER/happier/engine/compute_embeddings.py", line 19, in compute_embeddings
for i, batch in enumerate(tqdm(loader, disable=os.getenv("TQDM_DISABLE"))):
File "/home/scur2061/.conda/envs/happier_env_111/lib/python3.9/site-packages/tqdm/std.py", line 1183, in __iter__
for obj in iterable:
File "/home/scur2061/.conda/envs/happier_env_111/lib/python3.9/site-packages/torch/utils/data/dataloader.py", line 517, in __next__
data = self._next_data()
File "/home/scur2061/.conda/envs/happier_env_111/lib/python3.9/site-packages/torch/utils/data/dataloader.py", line 1199, in _next_data
return self._process_data(data)
File "/home/scur2061/.conda/envs/happier_env_111/lib/python3.9/site-packages/torch/utils/data/dataloader.py", line 1225, in _process_data
data.reraise()
File "/home/scur2061/.conda/envs/happier_env_111/lib/python3.9/site-packages/torch/_utils.py", line 429, in reraise
raise self.exc_type(msg)
IndexError: Caught IndexError in DataLoader worker process 0.
Original Traceback (most recent call last):
File "/home/scur2061/.conda/envs/happier_env_111/lib/python3.9/site-packages/torch/utils/data/_utils/worker.py", line 202, in _worker_loop
data = fetcher.fetch(index)
File "/home/scur2061/.conda/envs/happier_env_111/lib/python3.9/site-packages/torch/utils/data/_utils/fetch.py", line 44, in fetch
data = [self.dataset[idx] for idx in possibly_batched_index]
File "/home/scur2061/.conda/envs/happier_env_111/lib/python3.9/site-packages/torch/utils/data/_utils/fetch.py", line 44, in <listcomp>
data = [self.dataset[idx] for idx in possibly_batched_index]
File "/home/scur2061/hyperhap/HAPPIER/happier/datasets/base_dataset.py", line 118, in __getitem__
relevances = self.relevances[idx, :]
IndexError: index 0 is out of bounds for dimension 0 with size 0

In the job script, I use the following module loads:

module load 2021
module load cuDNN/8.2.1.32-CUDA-11.3.1
module load 2022
module load Anaconda3/2022.05

This is how I created the environment:
conda create --name happier
conda activate happier
conda install python=3.9
conda install -c conda-forge cudatoolkit=11.1
pip install torch==1.8.1+cu111 torchvision==0.9.1+cu111 torchaudio==0.8.1 -f https://download.pytorch.org/whl/torch_stable.html
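To record which versions are actually picked up inside the job, a short script like the one below could be run in the activated environment. This is a sketch for diagnostics only; describe_env is a hypothetical helper name, and the script simply reports nothing for torch if it cannot be imported.

```python
import platform


def describe_env():
    """Collect the Python, torch, CUDA and cuDNN versions visible to this process."""
    info = {"python": platform.python_version()}
    try:
        import torch
    except ImportError:
        # torch is not installed in the active environment.
        info["torch"] = None
        return info

    info["torch"] = torch.__version__                # e.g. the 1.8.1+cu111 build
    info["torch_cuda"] = torch.version.cuda          # CUDA version torch was built with
    info["cuda_available"] = torch.cuda.is_available()
    info["cudnn"] = torch.backends.cudnn.version()   # None if cuDNN is unavailable
    return info


if __name__ == "__main__":
    for key, value in describe_env().items():
        print(f"{key}: {value}")
```

The printed values can be pasted into the issue next to the module loads so that the environment used for the failing run is unambiguous.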

the requirements.txt
