leloykun / adaptive-muon
A single-line modification to any (dualizer-based) optimizer that lets it adapt to the scale of the gradients as they change during training.
☆19 · Jan 11, 2025 · Updated last year
Alternatives and similar repositories for adaptive-muon
Users who are interested in adaptive-muon are comparing it to the libraries listed below.
- Code for paper Almost-Orthogonal Layers for Efficient General-Purpose Lipschitz Networks ☆12 · Aug 9, 2022 · Updated 3 years ago
- ☆23 · Updated this week
- Github Repository for the HOI4 ULTRA Project. ☆11 · Feb 8, 2026 · Updated last week
- A collection of niche / personally useful PyTorch optimizers with modified code. ☆27 · Oct 25, 2025 · Updated 3 months ago
- [NeurIPS 2023] Sparse Modular Activation for Efficient Sequence Modeling ☆40 · Dec 2, 2023 · Updated 2 years ago
- ☆44 · Nov 1, 2025 · Updated 3 months ago
- Kinematic and dynamic models of continuum and articulated soft robots. ☆15 · Nov 22, 2025 · Updated 2 months ago
- ☆50 · Aug 21, 2025 · Updated 5 months ago
- ☆54 · Dec 17, 2025 · Updated last month
- MATLAB function to fill an area with hatching ~~or speckling~~ ☆11 · Mar 4, 2018 · Updated 7 years ago
- BERT Sentiment Classification on the IMDb Large Movie Review Dataset. ☆16 · Sep 8, 2022 · Updated 3 years ago
- Code for the paper "Faster Neural Network Training with Approximate Tensor Operations" ☆10 · Oct 23, 2021 · Updated 4 years ago
- Official repository for the paper Local Linear Attention: An Optimal Interpolation of Linear and Softmax Attention For Test-Time Regressi… ☆23 · Oct 1, 2025 · Updated 4 months ago
- [NeurIPS 2025] Official code for "Tropical Attention: Neural Algorithmic Reasoning for Combinatorial Algorithms" ☆23 · Oct 23, 2025 · Updated 3 months ago
- An artificial matrix generator in C ☆12 · Feb 16, 2023 · Updated 3 years ago
- A library for training crosscoders ☆15 · May 28, 2025 · Updated 8 months ago
- ☆14 · Apr 14, 2025 · Updated 10 months ago
- Pre-train BERT from scratch, with HuggingFace. Accompanies the blog post: sidsite.com/posts/bert-from-scratch ☆43 · May 20, 2025 · Updated 8 months ago
- Locality sensitive hash functions for Tensorflow 2.0. ☆12 · Feb 18, 2022 · Updated 3 years ago
- Proof of Concept to learn Amaranth as an entry effort for Supercon's RTL design competition ☆10 · Nov 11, 2022 · Updated 3 years ago
- ☆19 · Nov 20, 2025 · Updated 2 months ago
- sgx-based encrypted deduplication prototype ☆14 · May 14, 2021 · Updated 4 years ago
- ☆13 · Jan 10, 2026 · Updated last month
- 4-bit Shampoo for Memory-Efficient Network Training (NeurIPS 2024) ☆13 · Feb 13, 2025 · Updated last year
- (WIP) A relatively simple pipelined RISC-V core, written in Bluespec SystemVerilog ☆12 · Sep 9, 2021 · Updated 4 years ago
- Testing Ibex build using Yosys and open source toolchains. ☆11 · Oct 2, 2021 · Updated 4 years ago
- FPGA-based HyperLogLog Accelerator ☆12 · Jul 13, 2020 · Updated 5 years ago
- Residual vector quantization for KV cache compression in large language model ☆11 · Oct 22, 2024 · Updated last year
- A compressed SDL_Surface format using the LZ4 compression library. ☆14 · Sep 28, 2022 · Updated 3 years ago
- Clust_mgr is an important component of KunlunBase. It provides an HTTP API for KunlunBase users to do cluster management, provisioning and … ☆10 · Jun 13, 2023 · Updated 2 years ago
- A stream to RTL compiler based on MLIR and CIRCT ☆16 · Nov 15, 2022 · Updated 3 years ago
- Single shot neural network pruning before training the model, based on connection sensitivity ☆11 · Aug 7, 2019 · Updated 6 years ago
- Custom node to load Flux2 in INT8 for 2X Speed gains on 30 series cards. ☆28 · Feb 7, 2026 · Updated last week
- CoMeT is a new low-cost RowHammer mitigation that uses Count-Min Sketch-based aggressor row tracking, as described in our HPCA'24 paper h… ☆11 · Jan 23, 2026 · Updated 3 weeks ago
- APB UVC ported to Verilator ☆11 · Nov 19, 2023 · Updated 2 years ago
- ☆11 · Oct 27, 2023 · Updated 2 years ago
- ☆12 · Jul 9, 2021 · Updated 4 years ago
- Create cohorts from databases utilizing the OMOP CDM ☆11 · May 19, 2025 · Updated 8 months ago
- This simulator models multi core systems, intended primarily for studies on main memory management techniques. It models a trace-based ou… ☆12 · Jan 18, 2016 · Updated 10 years ago