AI-based tool for removing hard-coded subtitles and text-like watermarks from images and videos. Output files keep the original resolution, and everything runs locally with no third-party API required.
Updated Feb 19, 2025 · Python
Efficient & Generic Video Super-Resolution
A real-time silent speech recognition tool.
A re-implementation of the VSR-DUF model from the paper "Deep Video Super-Resolution Network Using Dynamic Upsampling Filters Without Explicit Motion Compensation". Includes both training and test code, based on TensorFlow.
Official repository containing code and other material from the paper "Efficient Video Super-Resolution through Recurrent Latent Space Propagation" (https://arxiv.org/abs/1909.08080).
[Interspeech 2024] SyncVSR: Data-Efficient Visual Speech Recognition with End-to-End Crossmodal Audio Token Synchronization
This is an official implementation of Video Super-Resolution via a Spatio-Temporal Alignment Network.
HiRN: Hierarchical Recurrent Neural Network for Video Super-Resolution (VSR) using Two-Stage Feature Evolution - Official Repository (Applied Soft Computing)
Group-based Bi-Directional Recurrent Wavelet Neural Network for Efficient Video Super-Resolution (VSR) - Official Repository (Pattern Recognition Letters)
ICIP2024 challenge for 360 super resolution
Speaker-Independent Speech Recognition using Visual Features
Effects of Alcohol on Articulated Visual Speech
This repository contains the development of SynthAVSR, the first Audiovisual Speech Recognition (AVSR) system tailored for the Spanish and Catalan languages. Based on the AV-HuBERT (Audio-Visual Hidden Unit BERT) model, SynthAVSR leverages synthetic audiovisual data to bridge the gap in speech recognition technology for these languages.