Retrieve moments (start and end timestamps) from videos given a sentence query. The paper was accepted at the 2024 IEEE/CVF Winter Conference on Applications of Computer Vision Workshops (WACVW); see the following link. We appreciate the contribution of the following code. Checkpoints for both datasets can be downloaded from the following drive.
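For clarity, the task can be sketched as a function that takes a video and a natural-language query and returns a (start, end) timestamp pair. The snippet below is purely a hypothetical illustration of that interface (the video id, query, and returned span are made up); it is not part of this repository's API.

```python
from typing import Tuple

def retrieve_moment(video_id: str, query: str) -> Tuple[float, float]:
    """Return (start_sec, end_sec) of the moment matching the query.

    Dummy stand-in: a real model would encode the video and the query
    and localize the span; here we return a fixed span for illustration.
    """
    return (12.4, 31.9)

start, end = retrieve_moment("v_QOlSCBRmfWY", "A person pours water into a glass.")
```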
Use the following commands for training:
# For ActivityNet Captions
python moment_localization/train.py --cfg experiments/activitynet/MSAT-32.yaml --verbose
# For TACoS
python moment_localization/train.py --cfg experiments/tacos/MSAT-128.yaml --verbose
Use the following commands for testing and replication of results:
# For ActivityNet Captions
python moment_localization/test.py --cfg experiments/activitynet/MSAT-32.yaml --verbose --split test
# For TACoS
python moment_localization/test.py --cfg experiments/tacos/MSAT-128.yaml --verbose --split test
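Moment retrieval results are conventionally reported as Recall@n at temporal-IoU thresholds (e.g. R@1, IoU=0.5). The sketch below shows the standard temporal-IoU definition of such a metric, under the assumption that evaluation here follows that convention; it is not code taken from the test script.

```python
def temporal_iou(pred, gt):
    """Temporal IoU between two (start, end) segments in seconds."""
    inter = max(0.0, min(pred[1], gt[1]) - max(pred[0], gt[0]))
    union = max(pred[1], gt[1]) - min(pred[0], gt[0])
    return inter / union if union > 0 else 0.0

def recall_at_1(preds, gts, threshold=0.5):
    """Fraction of queries whose top-1 prediction reaches the IoU threshold."""
    hits = sum(temporal_iou(p, g) >= threshold for p, g in zip(preds, gts))
    return hits / len(preds)

# Example: one hit (IoU = 8/10 = 0.8) and one miss (IoU = 0).
print(recall_at_1([(0, 10), (30, 40)], [(2, 10), (45, 60)]))  # 0.5
```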
Use the following commands for inference:
# For ActivityNet Captions
python moment_localization/inference_activitynet.py --cfg experiments/activitynet/MSAT-32.yaml --verbose
# For TACoS
python moment_localization/inference_tacos.py --cfg experiments/tacos/MSAT-128.yaml --verbose
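The 32 and 128 in the config names likely refer to the number of temporal segments each video is divided into (32 for ActivityNet Captions, 128 for TACoS). Assuming predictions are made over such a segment grid, a predicted segment span can be mapped back to timestamps roughly as follows; this is a sketch of the general idea, not code from the inference scripts.

```python
def segments_to_seconds(start_idx, end_idx, num_segments, duration):
    """Map a predicted [start_idx, end_idx] segment span (inclusive)
    back to (start_sec, end_sec) for a video of `duration` seconds."""
    seg_len = duration / num_segments
    return start_idx * seg_len, (end_idx + 1) * seg_len

# E.g. segments 8..15 of a 32-segment, 120-second video:
print(segments_to_seconds(8, 15, 32, 120.0))  # (30.0, 60.0)
```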
The test results for ActivityNet Captions are better than those reported in the original paper. We have likewise updated the checkpoint for the TACoS dataset.
If any part of our paper and code is helpful to your work, please generously cite with:
@inproceedings{panta2024cross,
title={Cross-modal Contrastive Learning with Asymmetric Co-attention Network for Video Moment Retrieval},
author={Panta, Love and Shrestha, Prashant and Sapkota, Brabeem and Bhattarai, Amrita and Manandhar, Suresh and Sah, Anand Kumar},
booktitle={Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision},
pages={607--614},
year={2024}
}
Please feel free to contact me if you need any help.
Email: [email protected]