Wentao Zhao*, Jiaming Chen*, Ziyu Meng, Donghui Mao, Ran Song†, Wei Zhang
Shandong University
This is the official repo for our paper *VLMPC: Vision-Language Model Predictive Control for Robotic Manipulation*, accepted at RSS 2024.
We provide the implementation of VLMPC in the Language-Table environment.
```bash
conda create -n vlmpc python=3.10
conda activate vlmpc
pip install -r requirements.txt
```
Note: add your OpenAI API key to `vlmpc.py`.
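For reference, here is a minimal sketch of how the key is typically wired in. The exact placeholder that `vlmpc.py` defines may differ, and reading the key from an environment variable is a suggestion rather than the repo's actual convention:

```python
# Hypothetical sketch -- check vlmpc.py for the actual placeholder it defines.
import os
import openai

# Prefer an environment variable over hard-coding the key in source control.
openai.api_key = os.environ.get("OPENAI_API_KEY", "sk-your-key-here")
```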
We provide trained checkpoints of the video prediction model and the detector; download them for a quick start, then run:
```bash
python main.py --checkpoint_file path/to/video_prediction_model/checkpoint --task push_corner --zoom 0.03 --num_samples 50 --plan_freq 3 --det_path path/to/detector/checkpoint
```
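For reference, the sketch below reconstructs the argument parser that the command above implies. The flag names come from the command itself, but the defaults and help strings describe our assumptions about their roles; `main.py` is the authoritative definition.

```python
# Hypothetical reconstruction of the CLI in main.py; names match the example
# command, but the help strings and defaults are assumptions.
import argparse

def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(
        description="Run VLMPC in the Language-Table environment")
    parser.add_argument("--checkpoint_file", required=True,
                        help="Path to the video prediction model checkpoint")
    parser.add_argument("--task", default="push_corner",
                        help="Name of the manipulation task")
    parser.add_argument("--zoom", type=float, default=0.03,
                        help="Zoom factor (value taken from the example command)")
    parser.add_argument("--num_samples", type=int, default=50,
                        help="Candidate action sequences sampled per planning step")
    parser.add_argument("--plan_freq", type=int, default=3,
                        help="Replan every N environment steps")
    parser.add_argument("--det_path", required=True,
                        help="Path to the detector checkpoint")
    return parser

if __name__ == "__main__":
    print(build_parser().parse_args())
```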
```bibtex
@inproceedings{zhao2024vlmpc,
  title={VLMPC: Vision-Language Model Predictive Control for Robotic Manipulation},
  author={Zhao, Wentao and Chen, Jiaming and Meng, Ziyu and Mao, Donghui and Song, Ran and Zhang, Wei},
  booktitle={Robotics: Science and Systems},
  year={2024}
}
```
- The simulation environment is based on Language-Table.
- The DMVFN-Act video prediction model is built on DMVFN.
- PySOT is used for lightweight visual tracking.