Python library for handling audio datasets.
Updated Jul 6, 2023 - Python
Trainable categorization tool
Multi-Language Dataset Cleaner/Creator for Mozilla's DeepSpeech Framework
Cleaning Discord data for NLP
Command-line filter for GitHub repositories that contain "samples" rather than a real project, framework, or library
[ACL 2024 (Findings)] ICC: Quantifying Image Caption Concreteness for Multimodal Dataset Curation
A simple library that wraps common data processing tasks into an easy-to-use preprocessing engine. The library currently supports transforming CSV files loaded into a Pandas DataFrame.
A set of tools to generate and label datasets from academic papers
Compare pictures, keep 2