erogol / BlaGPT
Experimental playground for benchmarking language model (LM) architectures, layers, and tricks on smaller datasets. Designed for flexible experimentation and exploration.
☆77 · Updated this week
Alternatives and similar repositories for BlaGPT
Users who are interested in BlaGPT are comparing it to the repositories listed below.
- Repository for "TESS-2: A Large-Scale, Generalist Diffusion Language Model" ☆48 · Updated 6 months ago
- [EMNLP Main '25] LiteASR: Efficient Automatic Speech Recognition with Low-Rank Approximation ☆122 · Updated 3 months ago
- Implementation of Google's USM speech model in Pytorch ☆30 · Updated 2 weeks ago
- Tiny re-implementation of MDM in style of LLaDA and nano-gpt speedrun ☆56 · Updated 5 months ago
- Implementation of a Light Recurrent Unit in Pytorch ☆48 · Updated 10 months ago
- Block Transformer: Global-to-Local Language Modeling for Fast Inference (NeurIPS 2024) ☆161 · Updated 4 months ago
- SlamKit is an open source tool kit for efficient training of SpeechLMs. It was used for "Slamming: Training a Speech Language Model on On… ☆215 · Updated 3 months ago
- A byte-level decoder architecture that matches the performance of tokenized Transformers. ☆65 · Updated last year
- Implementation of the proposed Adam-atan2 from Google Deepmind in Pytorch ☆120 · Updated 9 months ago
- Randomized Positional Encodings Boost Length Generalization of Transformers ☆82 · Updated last year
- Griffin MQA + Hawk Linear RNN Hybrid ☆88 · Updated last year
- Tiled Flash Linear Attention library for fast and efficient mLSTM Kernels. ☆68 · Updated 2 weeks ago
- Attempt to make multiple residual streams from Bytedance's Hyper-Connections paper accessible to the public ☆89 · Updated 2 months ago
- Audio tokenization, in the fastest way possible! ☆52 · Updated last year
- My attempts at applying Soundstream design on learned tokenization of text and then applying hierarchical attention to text generation ☆88 · Updated 10 months ago
- A fusion of a linear layer and a cross entropy loss, written for pytorch in triton. ☆70 · Updated last year
- Pytorch implementation of the PEER block from the paper, Mixture of A Million Experts, by Xu Owen He at Deepmind ☆127 · Updated last year
- AIR-Bench: Benchmarking Large Audio-Language Models via Generative Comprehension ☆115 · Updated 8 months ago
- The official code for the SALMon🍣 benchmark (ICASSP 2025 - Oral) ☆47 · Updated 2 weeks ago
- Official repository for the paper "SwitchHead: Accelerating Transformers with Mixture-of-Experts Attention" ☆98 · Updated 11 months ago
- Implementation of the model "AudioFlamingo" from the paper: "Audio Flamingo: A Novel Audio Language Model with Few-Shot Learning and Dial… ☆40 · Updated 7 months ago
- AudioBERT 📢 : Audio Knowledge Augmented Language Model (ICASSP 2025) ☆41 · Updated 7 months ago
- Official repository of the IEEE SLT 2024 paper "Self-Supervised Syllable Discovery Based on Speaker-Disentangled HuBERT" ☆42 · Updated last week
- Implementation of the proposed MaskBit from Bytedance AI ☆82 · Updated 9 months ago
- Legible, Scalable, Reproducible Foundation Models with Named Tensors and Jax ☆14 · Updated last year