I'm getting the following error when running stage 1:
[rank1]: Traceback (most recent call last):
[rank1]: File "/root/Forward-Forward/llava-train_videochat/llava/train/train_mem.py", line 7, in
[rank1]: train()
[rank1]: File "/root/Forward-Forward/llava-train_videochat/llava/train/train.py", line 2081, in train
[rank1]: trainer.train()
[rank1]: File "/opt/conda/envs/FForward/lib/python3.9/site-packages/transformers/trainer.py", line 1859, in train
[rank1]: return inner_training_loop(
[rank1]: File "/opt/conda/envs/FForward/lib/python3.9/site-packages/transformers/trainer.py", line 2165, in _inner_training_loop
[rank1]: for step, inputs in enumerate(epoch_iterator):
[rank1]: File "/opt/conda/envs/FForward/lib/python3.9/site-packages/accelerate/data_loader.py", line 454, in iter
[rank1]: current_batch = next(dataloader_iter)
[rank1]: File "/opt/conda/envs/FForward/lib/python3.9/site-packages/torch/utils/data/dataloader.py", line 630, in next
[rank1]: data = self._next_data()
[rank1]: File "/opt/conda/envs/FForward/lib/python3.9/site-packages/torch/utils/data/dataloader.py", line 1344, in _next_data
[rank1]: return self._process_data(data)
[rank1]: File "/opt/conda/envs/FForward/lib/python3.9/site-packages/torch/utils/data/dataloader.py", line 1370, in _process_data
[rank1]: data.reraise()
[rank1]: File "/opt/conda/envs/FForward/lib/python3.9/site-packages/torch/_utils.py", line 706, in reraise
[rank1]: raise exception
[rank1]: ValueError: Caught ValueError in DataLoader worker process 0.
[rank1]: Original Traceback (most recent call last):
[rank1]: File "/opt/conda/envs/FForward/lib/python3.9/site-packages/transformers/feature_extraction_utils.py", line 182, in convert_to_tensors
[rank1]: tensor = as_tensor(value)
[rank1]: File "/opt/conda/envs/FForward/lib/python3.9/site-packages/transformers/feature_extraction_utils.py", line 141, in as_tensor
[rank1]: return torch.tensor(value)
[rank1]: RuntimeError: Could not infer dtype of numpy.float32
[rank1]: During handling of the above exception, another exception occurred:
[rank1]: Traceback (most recent call last):
[rank1]: File "/opt/conda/envs/FForward/lib/python3.9/site-packages/torch/utils/data/_utils/worker.py", line 309, in _worker_loop
[rank1]: data = fetcher.fetch(index) # type: ignore[possibly-undefined]
[rank1]: File "/opt/conda/envs/FForward/lib/python3.9/site-packages/torch/utils/data/_utils/fetch.py", line 52, in fetch
[rank1]: data = [self.dataset[idx] for idx in possibly_batched_index]
[rank1]: File "/opt/conda/envs/FForward/lib/python3.9/site-packages/torch/utils/data/_utils/fetch.py", line 52, in
[rank1]: data = [self.dataset[idx] for idx in possibly_batched_index]
[rank1]: File "/root/Forward-Forward/llava-train_videochat/llava/train/train.py", line 1446, in getitem
[rank1]: raise e
[rank1]: File "/root/Forward-Forward/llava-train_videochat/llava/train/train.py", line 1443, in getitem
[rank1]: sample = self._get_item(i)
[rank1]: File "/root/Forward-Forward/llava-train_videochat/llava/train/train.py", line 1499, in _get_item
[rank1]: raise e
[rank1]: File "/root/Forward-Forward/llava-train_videochat/llava/train/train.py", line 1490, in _get_item
[rank1]: image = processor.preprocess(video, return_tensors="pt")["pixel_values"]
[rank1]: File "/root/Forward-Forward/llava-train_videochat/llava/model/multimodal_encoder/umt_encoder.py", line 66, in preprocess
[rank1]: return BatchFeature(data=data, tensor_type=return_tensors)
[rank1]: File "/opt/conda/envs/FForward/lib/python3.9/site-packages/transformers/feature_extraction_utils.py", line 78, in init
[rank1]: self.convert_to_tensors(tensor_type=tensor_type)
[rank1]: File "/opt/conda/envs/FForward/lib/python3.9/site-packages/transformers/feature_extraction_utils.py", line 188, in convert_to_tensors
[rank1]: raise ValueError(
[rank1]: ValueError: Unable to create tensor, you should probably activate padding with 'padding=True' to have batched tensors with the same length.
The error originates in umt_encoder.py, line 64: images = reduce(lambda x, f: [*map(f, x)], transforms, images)
It seems to happen when the input is a video (I think), i.e. when len(images) is greater than 1.
It is probably easy to debug, but I assume you didn't hit this problem on your machine, right?
Also, I ran into endless dependency conflicts when installing packages from requirements.txt, so I don't think my environment matches requirements.txt exactly.
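To narrow this down, it may help to check whether the frames really differ in shape, as the padding hint in the final ValueError suggests, or whether they are uniform and the conversion fails for another reason. Below is a minimal diagnostic sketch; `summarize_frames` is a hypothetical helper (not part of the repository), and it assumes the frames are NumPy arrays or objects that `np.asarray` can convert.

```python
import numpy as np


def summarize_frames(frames):
    """Collect the distinct types, shapes and dtypes found in a list of frames.

    Hypothetical helper for debugging only: uniform frames should yield
    exactly one entry in each of the returned sets.
    """
    return {
        "count": len(frames),
        "types": {type(f).__name__ for f in frames},
        "shapes": {tuple(np.shape(f)) for f in frames},
        "dtypes": {str(np.asarray(f).dtype) for f in frames},
    }


if __name__ == "__main__":
    # Stand-in data: 4 frames shaped (channels, height, width).
    demo = [np.zeros((3, 224, 224), dtype=np.float32) for _ in range(4)]
    print(summarize_frames(demo))
```

Calling it on `images` just before the `BatchFeature(...)` line in `preprocess` would show whether the frames are ragged. If every set has a single entry, the padding message is probably misleading and the underlying `RuntimeError: Could not infer dtype of numpy.float32` is the one to investigate.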
You could try installing the important packages (numpy, torch, ...) with pip individually rather than running pip install -r requirements.txt. Can you run inference with our model from Hugging Face? If so, I think training will only require changing a few packages.
I think the most likely cause is numpy or transformers, because the error comes from processor.preprocess.
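A quick way to compare environments is to print the installed versions of the packages involved in the traceback. The sketch below uses only the standard library; the package list is an assumption based on the modules that appear in the traceback, not an official list from the repository.

```python
from importlib import metadata


def report_versions(packages=("numpy", "torch", "transformers", "accelerate")):
    """Map each package name to its installed version, or None if it is missing."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in report_versions().items():
        print(f"{name}: {version or 'not installed'}")
```

Posting this output alongside the versions pinned in requirements.txt should make any mismatch easy to spot. Errors of the form "Could not infer dtype of numpy.float32" are often reported when the installed NumPy and PyTorch builds are incompatible, so those two are worth checking first.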