Pexel Videos

358,551 video URLs (average length 19.5 s) and associated metadata from pexels.com.

Data was extracted from the Pexels video sitemaps (listed in pexels.com/robots.txt) on 01/08/2022.

Data is stored in PexelVideos.parquet.gzip as a gzip-compressed Parquet file.

Instructions:

Clone this repo
Download the parquet gzip file which stores records for all videos to the base directory: wget https://huggingface.co/datasets/Corran/pexelvideos/resolve/main/PexelVideos.parquet.gzip
Install ffmpeg: conda install ffmpeg
Install Python dependencies: pip install -r requirements.txt
Download 10,000 videos (total # is editable in the script): python pexels.py
Split videos into 4 s chunks (total # is editable in the script): python resize_videos.py. Note: chunking reduces data loading time during training; the script also resizes and crops each video to 256x256 while maintaining aspect ratio.
Set the pexels_chunks folder as the training data source for your training.
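
The chunking step above can be sketched roughly as follows. This is not the code in resize_videos.py, only an illustration of one way to plan 4 s chunks and build an ffmpeg command that scales and center-crops to 256x256 without distorting the aspect ratio. The function names, the pexels_videos input folder, and the output file naming are hypothetical; dropping the trailing partial chunk is also an assumption.

```python
from pathlib import Path


def chunk_starts(duration_s, chunk_s=4.0):
    """Return start times (seconds) of every full-length chunk.

    Assumption: a trailing chunk shorter than chunk_s is dropped.
    """
    n_chunks = int(duration_s // chunk_s)
    return [i * chunk_s for i in range(n_chunks)]


def ffmpeg_chunk_cmd(src, out_dir, index, start_s, chunk_s=4.0, size=256):
    """Build (but do not run) an ffmpeg command for one chunk.

    The filter scales the frame so its shorter side is `size` while
    keeping the aspect ratio, then center-crops to size x size.
    """
    vf = (
        f"scale={size}:{size}:force_original_aspect_ratio=increase,"
        f"crop={size}:{size}"
    )
    # Hypothetical naming scheme: <source stem>_<chunk index>.mp4
    out_path = Path(out_dir) / f"{Path(src).stem}_{index:03d}.mp4"
    return [
        "ffmpeg", "-y",
        "-ss", str(start_s),   # seek to the chunk start
        "-i", str(src),
        "-t", str(chunk_s),    # keep chunk_s seconds
        "-vf", vf,
        str(out_path),
    ]


# A 19.5 s video (the dataset average) yields four full 4 s chunks.
starts = chunk_starts(19.5)
commands = [
    ffmpeg_chunk_cmd("pexels_videos/example.mp4", "pexels_chunks", i, s)
    for i, s in enumerate(starts)
]
```

Each command list can be passed to subprocess.run(cmd, check=True) once ffmpeg is installed and the pexels_chunks folder exists.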
