Learning a Grammar Inducer by Watching Millions of Instructional YouTube Videos
Accepted to EMNLP 2022 as an oral presentation
Songyang Zhang, Linfeng Song, Lifeng Jin, Haitao Mi, Kun Xu, Dong Yu and Jiebo Luo.
Video-aided grammar induction aims to leverage video information to find more accurate syntactic grammars for the accompanying text. While previous work focuses on building systems on well-aligned video-text pairs, we train our model only on noisy YouTube videos, without fine-tuning on benchmark data, and achieve stronger performance across three benchmarks.
- [Oct 2022] Invited talk at the UM-IoS Workshop at EMNLP 2022. 😄
- [Oct 2022] Our paper has been accepted to EMNLP 2022 (Oral). ✨
We provide a Docker image for easier reproduction. Please install the following:
- NVIDIA driver (418+),
- Docker (19.03+),
- nvidia-container-toolkit.
We only support Linux with NVIDIA GPUs. We have tested on Ubuntu 18.04 with V100 cards.
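Before launching, you can sanity-check these prerequisites. The commands below are a suggested check rather than part of the repo, and the CUDA image tag is only an example of a public tag:

```bash
nvidia-smi          # driver visible? (expect 418+)
docker --version    # expect 19.03+
# Confirm nvidia-container-toolkit can expose GPUs inside a container
# (any available nvidia/cuda tag works here):
docker run --rm --gpus all nvidia/cuda:11.6.2-base-ubuntu18.04 nvidia-smi
```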
CUDA_VISIBLE_DEVICES=0,1 source launch_container.sh $PATH_TO_STORAGE/data $PATH_TO_STORAGE/checkpoints $PATH_TO_STORAGE/log
The launch script respects the $CUDA_VISIBLE_DEVICES environment variable.
Note that the source code is mounted into the container under /src instead of being built into the image, so that user modifications are reflected without rebuilding the image.
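For reference, a launcher like this typically wraps a docker run of roughly the following shape. This is only a sketch inferred from the mounts described above: IMAGE is a placeholder for the actual image name, and every container path except /src is an assumption.

```bash
# Sketch only; $1/$2/$3 are the data, checkpoints, and log paths passed above.
docker run --rm -it \
  --gpus "\"device=${CUDA_VISIBLE_DEVICES}\"" \
  --mount "type=bind,src=$1,dst=/data" \
  --mount "type=bind,src=$2,dst=/checkpoints" \
  --mount "type=bind,src=$3,dst=/log" \
  --mount "type=bind,src=$(pwd),dst=/src" \
  IMAGE bash   # IMAGE is a placeholder, not the repo's actual image tag
```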
Please download the preprocessed data from here to data, and here to .cache.
[Optional] You can also preprocess data from raw captions. Details are described here.
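Putting the pieces together, the host-side layout implied by the launch command looks roughly like this (exact contents depend on the downloads above):

```
$PATH_TO_STORAGE/
├── data/          # preprocessed data (first download)
├── checkpoints/   # trained models (see the evaluation step below)
└── log/
```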
Run the following command for training:
sh scripts/train.sh
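For example, to train inside the container and keep a log (the tee redirection and the /log path are our suggestion based on the mounts above, not part of the script):

```bash
cd /src   # source is mounted here (see the Docker note above)
sh scripts/train.sh 2>&1 | tee /log/train.log
```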
Our trained models are provided here. Please download them to checkpoints.
Then, run the following command for evaluation:
sh scripts/test.sh
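Analogously, inside the container (the /checkpoints mount point is an assumption based on the launch arguments):

```bash
cd /src
ls /checkpoints   # confirm the downloaded models are visible
sh scripts/test.sh 2>&1 | tee /log/test.log
```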
We preprocess subtitles with the following scripts:
python tools/preprocess_captions.py
python tools/compute_gold_trees.py
python tools/generate_vocabularies.py
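To run the whole pipeline in one go and stop at the first failure, a small wrapper like this works (the three scripts and their order come from above; the set -e wrapper is our addition):

```bash
#!/bin/sh
set -e   # abort on the first failing step
python tools/preprocess_captions.py
python tools/compute_gold_trees.py
python tools/generate_vocabularies.py
```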
If you find this project useful, please consider citing our paper. 📣
@inproceedings{zhang2022training,
  title={Learning a Grammar Inducer by Watching Millions of Instructional YouTube Videos},
  author={Zhang, Songyang and Song, Linfeng and Jin, Lifeng and Mi, Haitao and Xu, Kun and Yu, Dong and Luo, Jiebo},
  booktitle={EMNLP},
  year={2022}
}
This repo is developed based on VPCFG, MMC-PCFG, and Punctuator2.