lucidrains / mind-evolution
Implementation of Mind Evolution, Evolving Deeper LLM Thinking, from Deepmind
☆47 Updated last month
Alternatives and similar repositories for mind-evolution:
Users that are interested in mind-evolution are comparing it to the libraries listed below
- ☆74 Updated 7 months ago
- An open source replication of the strawberry method that leverages Monte Carlo Search with PPO and/or DPO ☆28 Updated 2 weeks ago
- A repository for research on medium sized language models. ☆76 Updated 10 months ago
- Explorations into adversarial losses on top of autoregressive loss for language modeling ☆35 Updated last month
- Self contained pytorch implementation of a sinkhorn based router, for mixture of experts or otherwise ☆33 Updated 6 months ago
- Code Implementation, Evaluations, Documentation, Links and Resources for Min P paper ☆28 Updated last week
- NanoGPT (124M) quality in 2.67B tokens ☆28 Updated last month
- Code accompanying the paper "Generalized Interpolating Discrete Diffusion" ☆67 Updated last week
- σ-GPT: A New Approach to Autoregressive Models ☆62 Updated 7 months ago
- ☆16 Updated 3 weeks ago
- Official repository for the paper "NeuZip: Memory-Efficient Training and Inference with Dynamic Compression of Neural Networks". This rep… ☆56 Updated 4 months ago
- Train, tune, and infer Bamba model ☆86 Updated 2 months ago
- Official repository for the paper "Approximating Two-Layer Feedforward Networks for Efficient Transformers" ☆36 Updated last year
- Working implementation of DeepSeek MLA ☆38 Updated 2 months ago
- The official repository for HyperZ⋅Z⋅W Operator Connects Slow-Fast Networks for Full Context Interaction. ☆36 Updated last month
- Implementation of Gradient Agreement Filtering, from Chaubard et al. of Stanford, but for single machine microbatches, in Pytorch ☆23 Updated 2 months ago
- ☆32 Updated 3 weeks ago
- ☆82 Updated 3 weeks ago
- RWKV-7: Surpassing GPT ☆82 Updated 4 months ago
- ☆73 Updated 6 months ago
- an open source reproduction of NVIDIA's nGPT (Normalized Transformer with Representation Learning on the Hypersphere) ☆91 Updated 3 weeks ago
- Pytorch implementation of the PEER block from the paper, Mixture of A Million Experts, by Xu Owen He at Deepmind ☆121 Updated 7 months ago
- Plug in & Play Pytorch Implementation of the paper: "Evolutionary Optimization of Model Merging Recipes" by Sakana AI ☆30 Updated 4 months ago
- https://x.com/BlinkDL_AI/status/1884768989743882276 ☆27 Updated last month
- Focused on fast experimentation and simplicity ☆70 Updated 3 months ago
- Implementation of Infini-Transformer in Pytorch ☆110 Updated 2 months ago
- My fork of Allen AI's OLMo for educational purposes. ☆30 Updated 3 months ago
- Official PyTorch Implementation for Task Vectors are Cross-Modal ☆22 Updated 3 months ago
- Official PyTorch implementation of TokenSet. ☆88 Updated last week
- Explorations into improving ViTArc with Slot Attention ☆39 Updated 5 months ago