A partial implementation of Generative Infinite Vocabulary Transformer (GIVT) from Google DeepMind, in PyTorch.
☆21Mar 28, 2024Updated last year
Alternatives and similar repositories for givt-pytorch
Users who are interested in givt-pytorch are comparing it to the libraries listed below.
- [CVPR 2024] SportsHHI: A Dataset for Human-Human Interaction Detection in Sports Videos☆17May 21, 2024Updated last year
- Speaker-aware CTC (SACTC) for multi-talker overlapped speech recognition.☆21May 26, 2025Updated 9 months ago
- VideoEval: Comprehensive Benchmark Suite for Low-Cost Evaluation of Video Foundation Model☆15Jul 31, 2025Updated 7 months ago
- A virtual musical instrument built using Google MediaPipe.☆12Oct 10, 2022Updated 3 years ago
- Region Proposal generation on images using clustering in Pointcloud - Currently only for Pedestrians☆11Jul 13, 2020Updated 5 years ago
- ☆10Jan 25, 2026Updated last month
- An Efficient BPE Algorithm Faster than Hugging Face Tokenizer's Implementation☆13Sep 9, 2024Updated last year
- Official PyTorch implementation for "Unifying Bayesian Flow Networks and Diffusion Models through Stochastic Differential Equations"☆42Apr 23, 2024Updated last year
- The implementation for "Large Language Model Can Transcribe Speech in Multi-Talker Scenarios with Versatile Instructions"☆50Apr 7, 2025Updated 11 months ago
- (R&D) Text to speech using phonemes as inputs and audio codec codes as outputs. Loosely based on MegaByte, VALL-E and Encodec.☆48Sep 4, 2023Updated 2 years ago
- Official implementation of the INTERSPEECH 2022 paper "Radio2Speech: High Quality Speech Recovery from Radio Frequency Signals"☆16Sep 19, 2025Updated 5 months ago
- 2D Gaussian Splatting implemented in Triton☆11Mar 19, 2024Updated last year
- Latex template for CUHK PhD Thesis☆11Jun 29, 2025Updated 8 months ago
- Official implementation of our paper "Exploring the Potential of Diffusion Large Language Models in Code Generation".☆20Oct 29, 2025Updated 4 months ago
- AutoHotkey script that uses your (probably) useless CapsLock as a Magic Fn key, available for pretty much every keyboard.☆10Jun 30, 2022Updated 3 years ago
- Implementation of the paper "Variable Bitrate Residual Vector Quantization for Audio Coding"☆11Apr 10, 2025Updated 10 months ago
- ☆13Sep 25, 2024Updated last year
- ☆13Oct 25, 2024Updated last year
- Format to store media files and annotations☆12Feb 23, 2026Updated 2 weeks ago
- [TNNLS 2022] Official PyTorch implementation of "Tackling the Challenges in Scene Graph Generation with Local-to-Global Interactions"☆11Apr 19, 2022Updated 3 years ago
- An open source community implementation of the model MELLE from the paper: "Autoregressive Speech Synthesis without Vector Quantization"☆14Mar 2, 2026Updated last week
- The OBMO module embedded in PatchNet☆10Feb 21, 2024Updated 2 years ago
- The official implementation of dLLM-Var☆31Nov 6, 2025Updated 4 months ago
- This repository contains the code for our ECCV 2022 paper "Non-isotropic Probabilistic Take on Proxy-based Deep Metric Learning".☆12Dec 6, 2022Updated 3 years ago
- ☆13Oct 11, 2024Updated last year
- ☆14Aug 6, 2022Updated 3 years ago
- ☆12Feb 23, 2024Updated 2 years ago
- Render a WAV file and convert it with a [Diff-SVC](https://github.com/prophesier/diff-svc) model☆10Aug 24, 2025Updated 6 months ago
- Benchopt benchmark for ResNet fitting on a classification task☆12Sep 19, 2023Updated 2 years ago
- An efficient CUDA implementation of 2D depthwise convolution for large kernels, usable in the PyTorch deep learning framework.☆11Sep 28, 2023Updated 2 years ago
- ☆13Jul 10, 2021Updated 4 years ago
- narabas: Japanese phoneme forced alignment tool☆13Mar 15, 2023Updated 2 years ago
- ☆22Jul 30, 2025Updated 7 months ago
- Wrapper scripts to make documentation easier☆13Feb 11, 2026Updated 3 weeks ago
- This is the PyTorch implementation of our paper "LF-ViT: Reducing Spatial Redundancy in Vision Transformer for Efficient Image Recognition".☆11Sep 23, 2024Updated last year
- Lidar line downsampling for the KITTI dataset, reducing the number of lidar lines from 64 to 32, 16, 8, etc.☆13Jun 3, 2020Updated 5 years ago
- Cross-Speaker Encoding Network for Multi-talker Speech Recognition☆11Mar 14, 2025Updated 11 months ago
- Multi-lingual AudioCaps☆12Nov 20, 2023Updated 2 years ago
- Pybind11 bindings for Kaldi☆15Feb 1, 2026Updated last month