This repository contains code for the MicroAdam paper.
☆21Dec 14, 2024Updated last year
Alternatives and similar repositories for MicroAdam
Users interested in MicroAdam are comparing it to the repositories listed below.
- Kernel Library Wheel for SGLang☆17Updated this week
- ☆14Nov 3, 2025Updated 3 months ago
- ☆13Jan 15, 2025Updated last year
- Github Repo for OATS: Outlier-Aware Pruning through Sparse and Low Rank Decomposition☆17Apr 16, 2025Updated 10 months ago
- Resources regarding evML (edge verified machine learning)☆22Jan 4, 2025Updated last year
- Code for "RSQ: Learning from Important Tokens Leads to Better Quantized LLMs"☆20Jun 11, 2025Updated 8 months ago
- new optimizer☆20Aug 4, 2024Updated last year
- [ICML2024 Spotlight] Fine-Tuning Pre-trained Large Language Models Sparsely☆24Jun 26, 2024Updated last year
- ☆27Aug 25, 2023Updated 2 years ago
- ☆32Nov 11, 2024Updated last year
- Code for NeurIPS 2024 Spotlight: "Scaling Laws and Compute-Optimal Training Beyond Fixed Training Durations"☆91Oct 30, 2024Updated last year
- Official implementation for DenseMixer: Improving MoE Post-Training with Precise Router Gradient☆66Aug 3, 2025Updated 6 months ago
- Repository for go shared libraries (for now).☆11Dec 1, 2025Updated 2 months ago
- ☆11Mar 23, 2022Updated 3 years ago
- Repository for Sparse Finetuning of LLMs via modified version of the MosaicML llmfoundry☆42Jan 15, 2024Updated 2 years ago
- TSDG: An efficient index graph for graph-based nearest neighbor search☆10Jul 14, 2022Updated 3 years ago
- rabitq rust implementation☆10Feb 4, 2026Updated 3 weeks ago
- Entropy-Driven GRPO with Guided Error Correction for Advantage Diversity☆22Aug 28, 2025Updated 6 months ago
- Efficient misspecification uncertainties for linear regression☆16Feb 19, 2026Updated last week
- [ICDCS 2023] Evaluation and Optimization of Gradient Compression for Distributed Deep Learning☆10Apr 28, 2023Updated 2 years ago
- Decoding Attention is specially optimized for MHA, MQA, GQA and MLA using CUDA core for the decoding stage of LLM inference.☆46Jun 11, 2025Updated 8 months ago
- ☆20Oct 4, 2024Updated last year
- ☆18Nov 26, 2025Updated 3 months ago
- Invasion from the Unknown, a Battle for Wesnoth add-on campaign.☆10Mar 24, 2025Updated 11 months ago
- ☆11Dec 22, 2024Updated last year
- A curated list of awesome Molecular Modeling And Drug Discovery 🔥☆11Jul 21, 2022Updated 3 years ago
- QLoRA: Efficient Finetuning of Quantized LLMs☆11Jul 22, 2023Updated 2 years ago
- A bot that provides YouTube video chapters on Twitter (a.k.a. X)☆12Feb 5, 2025Updated last year
- Unsupervised Word Discovery☆10Jul 26, 2019Updated 6 years ago
- Deep Autoencoding Predictive Components☆10Mar 4, 2021Updated 4 years ago
- Repository of GUI Action Narrator☆12Apr 8, 2025Updated 10 months ago
- Retrieval with Learned Similarities (http://arxiv.org/abs/2407.15462, WWW'25 Oral)☆52Apr 23, 2025Updated 10 months ago
- Express DLA implementation for FPGA, revised based on NVDLA.☆11Oct 17, 2019Updated 6 years ago
- ☆12Jan 23, 2026Updated last month
- A PyTorch implementation of Proxy Anchor Loss based on CVPR 2020 paper "Proxy Anchor Loss for Deep Metric Learning"☆11Jan 16, 2021Updated 5 years ago
- E-book on industrial-grade Chinese speech recognition systems☆13Oct 30, 2020Updated 5 years ago
- Seminar: intro to deep learning with tensorflow☆13Jun 27, 2017Updated 8 years ago
- FastAPI wrapper for LLM, a fork of (oobabooga / text-generation-webui)☆10Jun 1, 2023Updated 2 years ago
- ☆13Jun 18, 2024Updated last year