lucidrains / ultra-mem
Implementation of UltraMem, an improved Product Key Memory design, from ByteDance AI labs
☆27 Updated last month
Alternatives and similar repositories for ultra-mem
Users who are interested in ultra-mem are comparing it to the libraries listed below.
- Efficiently discovering algorithms via LLMs with evolutionary search and reinforcement learning. ☆118 Updated 3 weeks ago
- Implementation of the new SOTA for model based RL, from the paper "Improving Transformer World Models for Data-Efficient RL", in Pytorch ☆145 Updated 7 months ago
- FlashRNN - Fast RNN Kernels with I/O Awareness ☆169 Updated last month
- Implementation of Soft Actor Critic and some of its improvements in Pytorch ☆60 Updated 9 months ago
- Pytorch implementation of Evolutionary Policy Optimization, from Wang et al. of the Robotics Institute at Carnegie Mellon University ☆102 Updated 2 months ago
- Implementation of ReWiND, "Language-Guided Rewards Teach Robot Policies without New Demonstrations", from USC / Amazon Robotics ☆35 Updated 3 months ago
- 📄 Small Batch Size Training for Language Models ☆68 Updated 2 months ago
- The Automated LLM Speedrunning Benchmark measures how well LLM agents can reproduce previous innovations and discover new ones in languag… ☆112 Updated 2 months ago
- Explorations into the recently proposed Taylor Series Linear Attention ☆100 Updated last year
- Implementation of a transformer for reinforcement learning using `x-transformers` ☆69 Updated 2 months ago
- JAX bindings for Flash Attention v2 ☆99 Updated last month
- Jax like function transformation engine but micro, microjax ☆33 Updated last year
- H-Net Dynamic Hierarchical Architecture ☆80 Updated 2 months ago
- Exploration into the Firefly algorithm in Pytorch ☆41 Updated 9 months ago
- Easily run PyTorch on multiple GPUs & machines ☆54 Updated 3 weeks ago
- Normalized Transformer (nGPT) ☆194 Updated last year
- Fast reinforcement learning 💨 ☆28 Updated 4 months ago
- Flash Attention Triton kernel with support for second-order derivatives ☆116 Updated last month
- Transformer with Mu-Parameterization, implemented in Jax/Flax. Supports FSDP on TPU pods. ☆32 Updated 6 months ago
- Open-source implementation of AlphaEvolve ☆23 Updated 6 months ago
- Exploration into the proposed "Self Reasoning Tokens" by Felipe Bonetto ☆57 Updated last year
- Exploration into the Scaling Value Iteration Networks paper, from Schmidhuber's group ☆37 Updated last year
- A user-friendly tool chain that enables the seamless execution of ONNX models using JAX as the backend. ☆125 Updated 2 months ago
- Implementation of Mind Evolution, Evolving Deeper LLM Thinking, from Deepmind ☆57 Updated 6 months ago
- Supporting code for the blog post on modular manifolds. ☆104 Updated 2 months ago
- Pytorch implementation of the PEER block from the paper, Mixture of A Million Experts, by Xu Owen He at Deepmind ☆132 Updated last month