s-sahoo / Eso-LMs
Esoteric Language Models
☆81 Updated last week
Alternatives and similar repositories for Eso-LMs
Users who are interested in Eso-LMs are comparing it to the libraries listed below.
- Tiny re-implementation of MDM in style of LLaDA and nano-gpt speedrun ☆53 Updated 3 months ago
- ☆59 Updated 3 months ago
- Code accompanying the paper "Generalized Interpolating Discrete Diffusion" ☆86 Updated 3 weeks ago
- Repository for the Q-Filters method (https://arxiv.org/pdf/2503.02812) ☆33 Updated 3 months ago
- ☆80 Updated 10 months ago
- A repository for research on medium sized language models. ☆77 Updated last year
- ☆20 Updated last week
- Official repo of paper LM2 ☆41 Updated 4 months ago
- Implementation of Mind Evolution, Evolving Deeper LLM Thinking, from DeepMind ☆54 Updated last month
- Using FlexAttention to compute attention with different masking patterns ☆44 Updated 9 months ago
- One Initialization to Rule them All: Fine-tuning via Explained Variance Adaptation ☆40 Updated 8 months ago
- The official repository for SkyLadder: Better and Faster Pretraining via Context Window Scheduling ☆33 Updated 3 months ago
- Efficiently discovering algorithms via LLMs with evolutionary search and reinforcement learning. ☆103 Updated 2 months ago
- This repo is based on https://github.com/jiaweizzhao/GaLore ☆29 Updated 9 months ago
- GoldFinch and other hybrid transformer components ☆45 Updated 11 months ago
- ☆18 Updated 3 months ago
- Code Implementation, Evaluations, Documentation, Links and Resources for Min P paper ☆39 Updated 3 months ago
- ☆65 Updated 3 months ago
- An open-source reproduction of NVIDIA's nGPT (Normalized Transformer with Representation Learning on the Hypersphere) ☆101 Updated 3 months ago
- [ICML 24 NGSM workshop] Associative Recurrent Memory Transformer implementation and scripts for training and evaluating ☆41 Updated this week
- [ICML 2025] Predictive Data Selection: The Data That Predicts Is the Data That Teaches ☆50 Updated 3 months ago
- The official implementation of Regularized Policy Gradient (RPG) (https://arxiv.org/abs/2505.17508) ☆35 Updated this week
- Reinforcing General Reasoning without Verifiers ☆63 Updated last week
- ☆37 Updated 2 months ago
- ☆70 Updated 10 months ago
- Official JAX Implementation of MD4 Masked Diffusion Models ☆108 Updated 4 months ago
- Repo for "Z1: Efficient Test-time Scaling with Code" ☆63 Updated 2 months ago
- EvaByte: Efficient Byte-level Language Models at Scale ☆103 Updated 2 months ago
- NanoGPT (124M) quality in 2.67B tokens ☆28 Updated this week
- Official implementation of the paper "Soft Thinking: Unlocking the Reasoning Potential of LLMs in Continuous Concept Space" ☆173 Updated last month