[Oral; NeurIPS OPT2024] μLO: Compute-Efficient Meta-Generalization of Learned Optimizers
☆16 · Feb 12, 2026 · Updated last month
Alternatives and similar repositories for mu_learned_optimization
Users who are interested in mu_learned_optimization are comparing it to the libraries listed below.
- An efficient implementation of learned optimizers in PyTorch ☆44 · Dec 2, 2025 · Updated 3 months ago
- ☆11 · Oct 11, 2023 · Updated 2 years ago
- Code for the paper "Function-Space Learning Rates" ☆25 · Jun 3, 2025 · Updated 9 months ago
- ☆48 · Jan 18, 2024 · Updated 2 years ago
- A deep learning-powered visual navigation engine that enables autonomous navigation of a pocket-size quadrotor, running on PULP ☆13 · Oct 30, 2024 · Updated last year
- Code for the paper Normalizing Flows are Capable Models for RL ☆18 · Jun 3, 2025 · Updated 9 months ago
- Reinforcement learning in pure JAX. ☆13 · Dec 24, 2025 · Updated 2 months ago
- A prototype agent with the purpose of evaluating the performance of a Large Language Model within a python terminal. ☆13 · Aug 28, 2023 · Updated 2 years ago
- ☆24 · Sep 25, 2024 · Updated last year
- ISMIR 2021: Curriculum Learning for Imbalanced Classification in Large Vocabulary Automatic Chord Recognition ☆10 · Nov 8, 2021 · Updated 4 years ago
- A port of muP to JAX/Haiku ☆25 · Oct 23, 2022 · Updated 3 years ago
- ACCO: An optimization algorithm for sharded distributed LLM training. ☆13 · May 22, 2025 · Updated 10 months ago
- ☆13 · Apr 7, 2022 · Updated 3 years ago
- ☆30 · Feb 27, 2024 · Updated 2 years ago
- DeMo: Decoupled Momentum Optimization ☆198 · Dec 2, 2024 · Updated last year
- Tiny re-implementation of MDM in style of LLaDA and nano-gpt speedrun ☆58 · Mar 10, 2025 · Updated last year
- Unofficial JAX implementation of the SOAP optimizer (https://arxiv.org/abs/2409.11321) ☆25 · Jan 9, 2026 · Updated 2 months ago
- Scrapes novels published on the web and converts them into Aozora Bunko-format text ☆19 · Feb 9, 2017 · Updated 9 years ago
- Official code for the paper "Attention as a Hypernetwork" ☆55 · Feb 24, 2026 · Updated 3 weeks ago
- notes for software engineers getting up to speed on new AI developments. Serves as datastore for https://latent.space writing, and produc… ☆10 · Dec 25, 2024 · Updated last year
- Memory Replay with Data Compression (ICLR 2022) ☆16 · Sep 26, 2023 · Updated 2 years ago
- Generative model for 3D objects. ☆18 · Aug 12, 2023 · Updated 2 years ago
- ☆10 · Aug 18, 2016 · Updated 9 years ago
- Stick-breaking attention ☆63 · Jul 1, 2025 · Updated 8 months ago
- ☆13 · Oct 8, 2021 · Updated 4 years ago
- Temporal WaSR-T model for maritime obstacle detection via semantic segmentation ☆25 · Nov 29, 2023 · Updated 2 years ago
- HGRN2: Gated Linear RNNs with State Expansion ☆56 · Aug 20, 2024 · Updated last year
- Code for "Optimizing Quantum Variational Circuits with Deep Reinforcement Learning" ☆20 · May 10, 2024 · Updated last year
- Implementation for robust ViT and scaled attention ☆21 · Apr 4, 2025 · Updated 11 months ago
- ☆20 · Oct 21, 2022 · Updated 3 years ago
- recipe for training fully-featured self supervised image jepa models ☆12 · Jun 4, 2025 · Updated 9 months ago
- AdaSplash: Adaptive Sparse Flash Attention (aka Flash Entmax Attention) ☆35 · Sep 30, 2025 · Updated 5 months ago
- Implementation for Object Permanence Emerges in a Random Walk along Memory ☆22 · Dec 11, 2022 · Updated 3 years ago
- Generic build server ☆64 · May 25, 2014 · Updated 11 years ago
- [WACV'24] Object Re-Identification from Point Clouds ☆18 · Jan 16, 2026 · Updated 2 months ago
- For optimization algorithm research and development. ☆561 · Mar 3, 2026 · Updated 2 weeks ago
- 🧱 Modula software package ☆324 · Aug 18, 2025 · Updated 7 months ago
- Pipeline parallelism for the minimalist ☆40 · Aug 6, 2025 · Updated 7 months ago
- simple dmabuf eglimage example ☆10 · Sep 18, 2014 · Updated 11 years ago