showlab/GUI-Narrator

GUI Action Narrator: Where and When Did That Action Take Place?

Qinchen Wu, Difei Gao, Kevin Qinghong Lin, Zhuoyu Wu, Xiangwu Guo, Peiran Li, Weichen Zhang, Hengxu Wang, Mike Zheng Shou

🤖: Introduction

We introduce Act2Cap, a GUI action dataset, together with GUI Narrator, an effective framework for GUI video captioning. GUI Narrator uses cursor detection to enhance both the interpretation of high-resolution screenshots and the extraction of keyframes in GUI actions.

📋 ToDo List

  • Model for Cursor detector and Narrator
  • Code of conduct

Our model and test benchmark are available on Hugging Face.
