lucidrains / mind-evolution
Implementation of Mind Evolution, Evolving Deeper LLM Thinking, from DeepMind
☆49 Updated 3 months ago
Alternatives and similar repositories for mind-evolution:
Users who are interested in mind-evolution compare it to the libraries listed below.
- Efficiently discovering algorithms via LLMs with evolutionary search and reinforcement learning. ☆68 Updated 2 weeks ago
- A repository for research on medium sized language models. ☆76 Updated 11 months ago
- ☆78 Updated 8 months ago
- Official repo of paper LM2 ☆39 Updated 2 months ago
- RWKV is an RNN with transformer-level LLM performance. It can be directly trained like a GPT (parallelizable). So it's combining the best… ☆44 Updated last month
- ☆63 Updated last month
- PyTorch implementation of the PEER block from the paper, Mixture of A Million Experts, by Xu Owen He at DeepMind ☆123 Updated 8 months ago
- One Initialization to Rule them All: Fine-tuning via Explained Variance Adaptation ☆39 Updated 6 months ago
- Official Implementation for the paper "d1: Scaling Reasoning in Diffusion Large Language Models via Reinforcement Learning" ☆100 Updated 3 weeks ago
- EvaByte: Efficient Byte-level Language Models at Scale ☆91 Updated 2 weeks ago
- A single repo with all scripts and utils to train / fine-tune the Mamba model with or without FIM ☆54 Updated last year
- ☆16 Updated 2 months ago
- ☆84 Updated this week
- Implementation of Infini-Transformer in PyTorch ☆110 Updated 4 months ago
- https://x.com/BlinkDL_AI/status/1884768989743882276 ☆27 Updated this week
- OpenCoconut implements a latent reasoning paradigm where we generate thoughts before decoding. ☆172 Updated 3 months ago
- Explorations into improving ViTArc with Slot Attention ☆40 Updated 6 months ago
- ICLR 2025 - official implementation for "I-Con: A Unifying Framework for Representation Learning" ☆68 Updated last week
- Self-contained PyTorch implementation of a Sinkhorn-based router, for mixture of experts or otherwise ☆34 Updated 8 months ago
- GoldFinch and other hybrid transformer components ☆45 Updated 9 months ago
- Anchored Preference Optimization and Contrastive Revisions: Addressing Underspecification in Alignment ☆55 Updated 8 months ago
- Lego for GRPO ☆27 Updated last month
- ☆34 Updated 4 months ago
- Implementation of Gradient Agreement Filtering, from Chaubard et al. of Stanford, but for single machine microbatches, in PyTorch ☆24 Updated 3 months ago
- Block Transformer: Global-to-Local Language Modeling for Fast Inference (NeurIPS 2024) ☆153 Updated 3 weeks ago
- NanoGPT (124M) quality in 2.67B tokens ☆28 Updated this week
- ☆54 Updated last month
- PyTorch implementation of models from the Zamba2 series. ☆180 Updated 3 months ago
- A list of language models with permissive licenses such as MIT or Apache 2.0 ☆24 Updated 2 months ago
- Explorations into adversarial losses on top of autoregressive loss for language modeling ☆35 Updated 2 months ago