elyxlz / givt-pytorch
A partial implementation of Generative Infinite Vocabulary Transformer (GIVT) from Google DeepMind, in PyTorch.
☆21Mar 28, 2024Updated last year
Alternatives and similar repositories for givt-pytorch
Users that are interested in givt-pytorch are comparing it to the libraries listed below
- [CVPR 2024] SportsHHI: A Dataset for Human-Human Interaction Detection in Sports Videos☆17May 21, 2024Updated last year
- Speaker-aware CTC (SACTC) for multi-talker overlapped speech recognition.☆21May 26, 2025Updated 8 months ago
- An Efficient BPE Algorithm Faster than Hugging Face Tokenizer's Implementation☆13Sep 9, 2024Updated last year
- Official PyTorch implementation for "Unifying Bayesian Flow Networks and Diffusion Models through Stochastic Differential Equations"☆41Apr 23, 2024Updated last year
- A virtual musical instrument built using Google MediaPipe.☆12Oct 10, 2022Updated 3 years ago
- VideoEval: Comprehensive Benchmark Suite for Low-Cost Evaluation of Video Foundation Model☆14Jul 31, 2025Updated 6 months ago
- ☆10Jan 25, 2026Updated 3 weeks ago
- The implementation for "Large Language Model Can Transcribe Speech in Multi-Talker Scenarios with Versatile Instructions"☆50Apr 7, 2025Updated 10 months ago
- (R&D) Text to speech using phonemes as inputs and audio codec codes as outputs. Loosely based on MegaByte, VALL-E and Encodec.☆48Sep 4, 2023Updated 2 years ago
- Code to reproduce experiments in Markovian Flow Matching: Accelerating MCMC with Continuous Normalizing Flows☆13May 23, 2024Updated last year
- ☆13Sep 25, 2024Updated last year
- Official implementation of our paper "Exploring the Potential of Diffusion Large Language Models in Code Generation".☆20Oct 29, 2025Updated 3 months ago
- Format to store media files and annotations☆12Feb 5, 2026Updated last week
- An Android Application for GLCC☆11Sep 30, 2022Updated 3 years ago
- An open source community implementation of the model MELLE from the paper: "Autoregressive Speech Synthesis without Vector Quantization"☆14Feb 9, 2026Updated last week
- [TNNLS 2022] Official PyTorch implementation of "Tackling the Challenges in Scene Graph Generation with Local-to-Global Interactions"☆11Apr 19, 2022Updated 3 years ago
- AutoHotKey script that turns your (probably) useless CapsLock into a Magic Fn key, available for pretty much every keyboard.☆10Jun 30, 2022Updated 3 years ago
- Implementation of the paper "Variable Bitrate Residual Vector Quantization for Audio Coding"☆11Apr 10, 2025Updated 10 months ago
- Official implementation of INTERSPEECH 2022 Radio2Speech: High Quality Speech Recovery from Radio Frequency Signals☆16Sep 19, 2025Updated 4 months ago
- Region Proposal generation on images using clustering in Pointcloud - Currently only for Pedestrians☆11Jul 13, 2020Updated 5 years ago
- LaTeX template for CUHK PhD Thesis☆11Jun 29, 2025Updated 7 months ago
- 2D Gaussian Splatting implemented in Triton☆11Mar 19, 2024Updated last year
- ☆13Oct 25, 2024Updated last year
- Diffusion-based Speech Enhancement: Demonstration of Performance and Generalization☆11Dec 21, 2024Updated last year
- Pybind11 bindings for Kaldi☆15Feb 1, 2026Updated 2 weeks ago
- ☆13Jul 10, 2021Updated 4 years ago
- A tool that collects the laughter segments from large amounts of audio data☆12May 23, 2024Updated last year
- ☆12Feb 23, 2024Updated last year
- Benchopt benchmark for ResNet fitting on a classification task☆12Sep 19, 2023Updated 2 years ago
- Render wav and convert it with [Diff-SVC](https://github.com/prophesier/diff-svc) model☆10Aug 24, 2025Updated 5 months ago
- The OBMO module embedded in PatchNet☆10Feb 21, 2024Updated last year
- ☆13Oct 11, 2024Updated last year
- Cross-Speaker Encoding Network for Multi-talker Speech Recognition☆11Mar 14, 2025Updated 11 months ago
- An efficient CUDA implementation of 2D depthwise convolution for large kernels, usable from the PyTorch deep learning framework.☆11Sep 28, 2023Updated 2 years ago
- Third-party toolkit for Rope3D dataset☆13Jun 13, 2022Updated 3 years ago
- This is the PyTorch implementation of our paper "LF-ViT: Reducing Spatial Redundancy in Vision Transformer for Efficient Image Recognition".☆11Sep 23, 2024Updated last year
- narabas: Japanese phoneme forced alignment tool☆13Mar 15, 2023Updated 2 years ago
- The official implementation of dLLM-Var☆31Nov 6, 2025Updated 3 months ago
- ☆14Aug 6, 2022Updated 3 years ago