cloneofsimo / minVJEPA
☆22 Updated last month
Alternatives and similar repositories for minVJEPA
Users who are interested in minVJEPA are comparing it to the repositories listed below
- ☆59 Updated 3 months ago
- research impl of Native Sparse Attention (2502.11089) ☆54 Updated 4 months ago
- https://x.com/BlinkDL_AI/status/1884768989743882276 ☆28 Updated 2 months ago
- [ICML 2025] Roll the dice & look before you leap: Going beyond the creative limits of next-token prediction ☆31 Updated last month
- ☆33 Updated 6 months ago
- Focused on fast experimentation and simplicity ☆76 Updated 6 months ago
- ☆23 Updated last year
- ☆32 Updated 8 months ago
- Tiny re-implementation of MDM in style of LLaDA and nano-gpt speedrun ☆54 Updated 4 months ago
- Lightweight package that tracks and summarizes code changes using LLMs (Large Language Models) ☆34 Updated 4 months ago
- implementation of https://arxiv.org/pdf/2312.09299 ☆21 Updated last year
- ☆17 Updated 7 months ago
- RS-IMLE ☆41 Updated 7 months ago
- Explorations into adversarial losses on top of autoregressive loss for language modeling ☆37 Updated 4 months ago
- ☆24 Updated 2 months ago
- Code for the paper "Function-Space Learning Rates" ☆20 Updated last month
- Code accompanying the paper "Generalized Interpolating Discrete Diffusion" ☆91 Updated last month
- RWKV is an RNN with transformer-level LLM performance. It can be directly trained like a GPT (parallelizable). So it's combining the best… ☆49 Updated 4 months ago
- ☆34 Updated 10 months ago
- A repository for research on medium sized language models. ☆77 Updated last year
- Official implementation of ECCV24 paper: POA ☆24 Updated 11 months ago
- A single repo with all scripts and utils to train / fine-tune the Mamba model with or without FIM ☆55 Updated last year
- The official implementation of Regularized Policy Gradient (RPG) (https://arxiv.org/abs/2505.17508) ☆35 Updated last week
- ☆63 Updated 9 months ago
- Lottery Ticket Adaptation ☆39 Updated 7 months ago
- RWKV-7: Surpassing GPT ☆92 Updated 7 months ago
- Synthetic Alphabet Dataset ☆19 Updated 3 months ago
- ☆24 Updated last year
- Resa: Transparent Reasoning Models via SAEs ☆39 Updated last month
- Collection of autoregressive model implementation ☆85 Updated 2 months ago