Think-a-Tron / evolve
An open-source AlphaEvolve
☆66 · Updated last month
Alternatives and similar repositories for evolve
Users interested in evolve are comparing it to the libraries listed below.
- Implementation of Mind Evolution ("Evolving Deeper LLM Thinking") from DeepMind ☆55 · Updated last month
- An open-source reproduction of NVIDIA's nGPT (Normalized Transformer with Representation Learning on the Hypersphere) ☆101 · Updated 4 months ago
- Official implementation of the paper "ZClip: Adaptive Spike Mitigation for LLM Pre-Training" ☆128 · Updated 2 weeks ago
- Explorations into the proposal from the paper "Grokfast: Accelerated Grokking by Amplifying Slow Gradients" ☆101 · Updated 6 months ago
- A working implementation of DeepSeek MLA (multi-head latent attention) ☆42 · Updated 6 months ago
- RWKV-7: Surpassing GPT ☆92 · Updated 7 months ago
- A single repo with all scripts and utils to train / fine-tune the Mamba model with or without FIM ☆55 · Updated last year
- Efficiently discovering algorithms via LLMs with evolutionary search and reinforcement learning ☆103 · Updated 2 months ago
- https://x.com/BlinkDL_AI/status/1884768989743882276 ☆28 · Updated 2 months ago
- ☆55 · Updated 7 months ago
- Collection of autoregressive model implementations ☆85 · Updated 2 months ago
- PyTorch implementation of models from the Zamba2 series ☆183 · Updated 5 months ago
- Esoteric Language Models ☆87 · Updated 3 weeks ago
- σ-GPT: A New Approach to Autoregressive Models ☆65 · Updated 11 months ago
- A generic MCP client for using any MCP tool in a chat ☆44 · Updated 2 months ago
- Train, tune, and run inference with the Bamba model ☆130 · Updated last month
- Getting crystal-like representations with harmonic loss ☆191 · Updated 3 months ago
- ☆59 · Updated 3 months ago
- DeMo: Decoupled Momentum Optimization ☆189 · Updated 7 months ago
- ☆134 · Updated 10 months ago
- Focused on fast experimentation and simplicity ☆76 · Updated 6 months ago
- Official repository for the paper "NeuZip: Memory-Efficient Training and Inference with Dynamic Compression of Neural Networks". This rep… ☆59 · Updated 8 months ago
- Explorations into whether a transformer with RL can direct a genetic algorithm to converge faster ☆70 · Updated last month
- PyTorch implementation of the PEER block from the paper "Mixture of A Million Experts" by Xu Owen He at DeepMind ☆127 · Updated 10 months ago
- $100K or 100 Days: Trade-offs when Pre-Training with Academic Resources ☆140 · Updated last month
- ☆88 · Updated last month
- ☆81 · Updated last year
- NanoGPT (124M) quality in 2.67B tokens ☆28 · Updated 2 weeks ago
- Code accompanying the paper "Generalized Interpolating Discrete Diffusion" ☆91 · Updated last month
- The Automated LLM Speedrunning Benchmark measures how well LLM agents can reproduce previous innovations and discover new ones in languag… ☆87 · Updated 2 weeks ago