Speech-to-text with self-supervised learning, based on the wav2vec 2.0 framework and using Hugging Face's Transformers library
☆29Jun 1, 2021Updated 4 years ago
Alternatives and similar repositories for wav2vec2-huggingface-demo
Users that are interested in wav2vec2-huggingface-demo are comparing it to the libraries listed below
- This utility allows one to cut multiple clips from a single or multiple audio files.☆19Jun 23, 2021Updated 4 years ago
- MDMMT: Multidomain Multimodal Transformer for Video Retrieval☆26Jun 28, 2021Updated 4 years ago
- Finetuning & extending DiffusionDet to video & pedestrian multi-object-tracking☆13Apr 12, 2023Updated 2 years ago
- New word discovery / new word mining / degree of freedom / cohesion / python3☆10May 28, 2019Updated 6 years ago
- Style Transfer by Rigid Alignment in Neural Net Feature Space☆11Jan 23, 2021Updated 5 years ago
- Dual-channel video stitching☆13Nov 13, 2022Updated 3 years ago
- Visualization of Douban movie reviews☆10May 19, 2016Updated 9 years ago
- Works mainly on text documents. Implements a mini search engine using different algorithms, then summarizes documents using LexRank.☆11Jan 19, 2018Updated 8 years ago
- ☆12Aug 30, 2022Updated 3 years ago
- Detects scene change or cuts in a video file☆11Oct 23, 2017Updated 8 years ago
- A chatbot web UI that supports various multi-modal large language models☆11May 8, 2023Updated 2 years ago
- A reinforcement learning package implemented in Torch☆11Jan 24, 2016Updated 10 years ago
- I have created a dataset of image-text pairs by using the cosine similarity of the CLIP embeddings of the image & its caption derived f…☆16Apr 22, 2021Updated 4 years ago
- This repository includes the code to reproduce our paper [Explainable deepfake and spoofing detection: an attack analysis using SHapley A…☆12Jan 24, 2024Updated 2 years ago
- Video Summarization Transformer: Implementation in PyTorch of the Transformer model for video summarisation☆10Oct 27, 2020Updated 5 years ago
- Deep learning for named entity recognition on CoNLL-2003☆10Dec 23, 2016Updated 9 years ago
- T5Voice is a lightweight PyTorch implementation of T5-based text-to-speech synthesis, supporting both streaming and non-streaming speech …☆28Nov 7, 2025Updated 4 months ago
- Takes a list of vertices and faces, giving you back an array of individual triangles.☆11Nov 18, 2015Updated 10 years ago
- ☆10Jul 20, 2020Updated 5 years ago
- A trie-based text editor with autocomplete suggestions, built with Python and PyQt☆10Sep 7, 2016Updated 9 years ago
- ☆10Mar 28, 2023Updated 2 years ago
- Port of Chromaprint C/C++ library to Ruby to extract fingerprints from audio sources.☆12Nov 7, 2013Updated 12 years ago
- A Feishu bot that pushes the latest arXiv articles daily.☆10Nov 28, 2021Updated 4 years ago
- [IJCV 2025] The official implementation of "AnyPattern: Towards In-context Image Copy Detection"☆10Oct 24, 2025Updated 4 months ago
- PyTorch implementation of various token mixers (attention mechanisms, MLP, etc.) for understanding computer vision papers and other tas…☆16Oct 7, 2024Updated last year
- Action recognition based on action graph, which describes the spatio-temporal relationship between dense trajectory clusters. The program…☆11Jan 7, 2015Updated 11 years ago
- Midjourney X Instant Collage -- Collage Template + Grid + Quality Style☆12May 25, 2025Updated 9 months ago
- Source code of QuickExport☆11May 20, 2017Updated 8 years ago
- LAVIS - A One-stop Library for Language-Vision Intelligence☆10Apr 18, 2023Updated 2 years ago
- Code accompanying AES Semantic Audio Conference paper titled "A Dataset and Method for Guitar Solo Detection in Rock Music"☆12Jan 18, 2018Updated 8 years ago
- Minimal module for computing audio spectrograms☆15Feb 28, 2019Updated 7 years ago
- Project for HIDING SPEAKER’S SEX IN SPEECH USING ZERO-EVIDENCE SPEAKER REPRESENTATION IN AN ANALYSIS/SYNTHESIS PIPELINE☆15Nov 30, 2022Updated 3 years ago
- Character Grounding and Re-Identification in Story of Videos and Text Descriptions☆10Jan 17, 2021Updated 5 years ago
- A circular shared-memory queue locked with semaphores☆11Sep 9, 2019Updated 6 years ago
- A Behance web crawler project☆10May 16, 2019Updated 6 years ago
- Code for paper: "RemovalNet: DNN model fingerprinting removal attack", IEEE TDSC 2023.☆10Nov 27, 2023Updated 2 years ago
- (We are still working on code refactoring and amending the necessary training and inferencing cli) An electric guitar transcription model…☆13Jan 11, 2023Updated 3 years ago
- ☆11Mar 19, 2024Updated last year
- Code accompanying the 2022 DLS paper "Misleading Deep-Fake Detection with GAN Fingerprints"☆10May 26, 2022Updated 3 years ago