elyxlz/givt-pytorch



A partial implementation of the Generative Infinite-Vocabulary Transformer (GIVT) from Google DeepMind, in PyTorch.

☆21

Alternatives and similar repositories for givt-pytorch

Users who are interested in givt-pytorch are comparing it to the libraries listed below.


kjw11 / Speaker-Aware-CTC
Speaker-aware CTC (SACTC) for multi-talker overlapped speech recognition.
☆22 · May 26, 2025 · Updated last year
cloneofsimo / minSAE
☆30 · Dec 2, 2024 · Updated last year
lavendery / UUG
☆21 · Sep 14, 2025 · Updated 10 months ago
basiclab / FreeCond
FreeCond: A Free Lunch for Input Conditions in Text-Guided Inpainting. FreeCond introduces a more generalized form💪 of the original inpa…
☆15 · May 22, 2025 · Updated last year
LittleMount / DescatterNet-for-unseen-real-world-objects
In this study, we propose a deep-learning-based method to image through dynamic scattering media in a non-invasive manner under incoheren…
☆15 · Dec 1, 2024 · Updated last year
ga642381 / SpeechGen
"SpeechGen: Unlocking the Generative Power of Speech Language Models with Prompts"
☆77 · Jun 9, 2023 · Updated 3 years ago
ExplainableML / Probabilistic_Deep_Metric_Learning
This repository contains the code for our ECCV 2022 paper on our "Non-isotropic Probabilistic Take on Proxy-based Deep Metric Learning".
☆12 · Dec 6, 2022 · Updated 3 years ago
kjw11 / CSEnet-ASR
Cross-Speaker Encoding Network for Multi-talker Speech Recognition
☆12 · Mar 14, 2025 · Updated last year
cnaigithub / SpeechDewarping
Official implementation of "Unsupervised Pre-training for Data-Efficient Text-to-Speech on Low Resource Languages", ICASSP 2023
☆27 · Apr 27, 2023 · Updated 3 years ago
WangHelin1997 / Automatic_Speech_Annotator
Automatic speech annotator processing speech with voice activity detection, overlapping speech detection, speaker diarization and automat…
☆33 · Jun 14, 2024 · Updated 2 years ago
abhinavtripathi95 / feature-tools
This repository contains tools for visualization of keypoint matches over two images (ORB, SIFT, LIFT, SuperPoint, D2-Net).
☆13 · Jul 23, 2019 · Updated 7 years ago
jianglongye / featurenerf
FeatureNeRF: Learning Generalizable NeRFs by Distilling Foundation Models, ICCV 2023
☆13 · Jul 13, 2024 · Updated 2 years ago
MiscellaneousStuff / PhoneLM
(R&D) Text to speech using phonemes as inputs and audio codec codes as outputs. Loosely based on MegaByte, VALL-E and Encodec.
☆48 · Sep 4, 2023 · Updated 2 years ago
Theia-4869 / MoSA
Official code of MoSA (Mixture of Sparse Adapters).
☆13 · Dec 14, 2023 · Updated 2 years ago
Freed-Wu / statistic-learning-homework
A backup of my homework. Statistical learning.
☆11 · Jan 13, 2022 · Updated 4 years ago
jiasenlu / vit-vqgan-jax
JAX implementation of ViT-VQGAN
☆10 · Jan 25, 2024 · Updated 2 years ago
benchopt / benchmark_resnet_classif
Benchopt benchmark for ResNet fitting on a classification task
☆12 · Sep 19, 2023 · Updated 2 years ago
trestad / mitigating-reversal-curse
Code for paper 'Are We Falling in a Middle-Intelligence Trap? An Analysis and Mitigation of the Reversal Curse'
☆14 · Aug 2, 2024 · Updated last year
JinchaoLove / CUHK-PhD-Thesis-Template
LaTeX template for CUHK PhD Thesis
☆14 · Jun 29, 2025 · Updated last year
chengtianle1997 / Lidar_Line_Downsample
Lidar line downsampling for the KITTI dataset: reduces the number of lidar lines from 64 to 32, 16, 8, etc.
☆13 · Jun 3, 2020 · Updated 6 years ago
light1726 / Speech-Tokenization-Papers
This repository follows papers and reports on discrete speech representation learning and speech tokenization methods for speech language…
☆15 · Dec 1, 2023 · Updated 2 years ago
Li-Tong-621 / Class_homework
Shared coursework from the artificial intelligence major, 2019 cohort, at Beijing Institute of Technology
☆17 · Apr 7, 2023 · Updated 3 years ago
ML-GSAI / RADD
Official PyTorch implementation for "Your Absorbing Discrete Diffusion Secretly Models the Conditional Distributions of Clean Data" (ICLR…
☆84 · May 30, 2025 · Updated last year
nikosips / Universal-Image-Embeddings
A large-scale benchmark for the evaluation of embeddings across a number of fine-grained and instance-level visual domains.
☆17 · Jun 14, 2024 · Updated 2 years ago
liutaocode / AwesomeDiarizationDataset
Both audio-only and audio-visual speaker diarization datasets are listed here.
☆16 · Feb 22, 2023 · Updated 3 years ago
TruongKhang / image-matching-toolbox
This is a toolbox repository to help evaluate various methods that perform image matching from a pair of images.
☆12 · Jul 5, 2023 · Updated 3 years ago
Likarian / python-pointcloud-clustering
Python implementation of the paper 'Fast Range Image-Based Segmentation of Sparse 3D Laser Scans for Online Operation'
☆13 · Jan 4, 2021 · Updated 5 years ago
ML-GSAI / BFN-Solver
Official PyTorch implementation for "Unifying Bayesian Flow Networks and Diffusion Models through Stochastic Differential Equations"
☆41 · Apr 23, 2024 · Updated 2 years ago
saicoco / webdataset
A dataset for large-scale data loading in PyTorch
☆13 · May 30, 2022 · Updated 4 years ago
igul222 / plaid
☆115 · May 29, 2023 · Updated 3 years ago
cuhealthybrains / MT-LLM
The implementation for "Large Language Model Can Transcribe Speech in Multi-Talker Scenarios with Versatile Instructions"
☆51 · Apr 7, 2025 · Updated last year
cpdu / vallt
☆36 · Mar 14, 2025 · Updated last year
sarulab-speech / DuplexChat
☆47 · Jul 5, 2026 · Updated 3 weeks ago
WangHelin1997 / LibriLightMix-WHAMR
Python scripts to create noisy and reverberant 2-speaker mixture audio with Libri-Light and WHAM
☆17 · Nov 7, 2024 · Updated last year
JiauZhang / prompt-to-prompt
Implementation of Prompt-to-Prompt Image Editing with Cross Attention Control
☆16 · Apr 5, 2023 · Updated 3 years ago
hamadichihaoui / mash
☆18 · Oct 30, 2024 · Updated last year
qiuqiangkong / sampleRNN_acoustic_scene_generation
☆14 · Apr 18, 2019 · Updated 7 years ago
blackyyen9596 / Speech_Recognition-PyTorch
This is the open-source code for Speech_Recognition-PyTorch, which can be used to train your own model.
☆16 · Feb 2, 2021 · Updated 5 years ago
zaccharieramzi / submission-scripts
All the submission scripts used for my work on Jean Zay and the TGCC
☆15 · Jul 17, 2023 · Updated 3 years ago