Grams: Gradient Descent with Adaptive Momentum Scaling (ICLR 2025 Workshop)
☆17Mar 6, 2025Updated last year
Alternatives and similar repositories for Grams
Users that are interested in Grams are comparing it to the libraries listed below.
- ☆15Mar 2, 2025Updated last year
- ☆13Apr 1, 2026Updated last week
- [AAAI-25 Oral] Adaptive Calibration☆15Jul 6, 2025Updated 9 months ago
- Implementation of the ICLR 2022 paper "Phase Collapse in Neural Networks."☆10Mar 21, 2022Updated 4 years ago
- [ACL 2025] Outlier-Safe Pre-Training for Robust 4-Bit Quantization of Large Language Models☆36Nov 4, 2025Updated 5 months ago
- ☆10Jun 19, 2023Updated 2 years ago
- Grokking on modular arithmetic in less than 150 epochs in MLX☆15Oct 24, 2024Updated last year
- Socks5 Proxy based on Websocket.☆15Jul 10, 2020Updated 5 years ago
- Advanced Dropout: A Model-free Methodology for Bayesian Dropout Optimization (IEEE TPAMI 2021)☆17Jun 4, 2021Updated 4 years ago
- LLM training in simple, raw C/CUDA☆15Dec 5, 2024Updated last year
- Examples to control the Opal C1 from within Python.☆17May 7, 2023Updated 2 years ago
- ☆262Dec 2, 2024Updated last year
- Repository for "CIRA Guide to Custom Loss Functions for Neural Networks in Environmental Sciences"☆17Jun 17, 2021Updated 4 years ago
- CIFAR-10 speedrun: Trains to 94% accuracy in 1.98 seconds on a single NVIDIA A100 GPU.☆73Oct 17, 2025Updated 5 months ago
- ☆34Dec 7, 2025Updated 4 months ago
- Code for the paper "Interpreting Key Mechanisms of Factual Recall in Transformer-Based Language Models."☆13Apr 10, 2024Updated 2 years ago
- Resources regarding evML (edge verified machine learning)☆22Jan 4, 2025Updated last year
- Implementation of NoProp-CT☆28May 2, 2025Updated 11 months ago
- Code for "Practical Low-Rank Communication Compression in Decentralized Deep Learning"☆17Aug 4, 2020Updated 5 years ago
- succinct and unrestricted reflection☆14Mar 3, 2023Updated 3 years ago
- Continued development of the dnSpy project☆12Apr 19, 2022Updated 3 years ago
- ☆15Sep 24, 2023Updated 2 years ago
- [NeurIPS 2024] Physics-Informed Regularization for Domain-Agnostic Dynamical System Modeling☆26Jul 10, 2025Updated 9 months ago
- Implementation of Gradient Information Optimization (GIO) for effective and scalable training data selection☆14Jun 22, 2023Updated 2 years ago
- Data for the paper "A Dataset for Learning University STEM Courses at Scale" by Zhang et al., 2022.☆15Nov 22, 2022Updated 3 years ago
- ☆23Jan 5, 2025Updated last year
- ☆18Jun 9, 2021Updated 4 years ago
- (Latest 2023 version) Uses Netty to connect to the WebSocket interface for the danmaku (bullet comment) stream of Bilibili live rooms☆14Jan 13, 2024Updated 2 years ago
- A repository of Python & PyTorch scripts which (currently) converts .safetensors models into scaled FP8 variants, utilizing gradient desc…☆27Aug 8, 2025Updated 8 months ago
- A PyTorch implementation of Adafactor (https://arxiv.org/pdf/1804.04235.pdf)☆26Aug 27, 2019Updated 6 years ago
- Course Project for CS224W at Stanford☆22Dec 10, 2021Updated 4 years ago
- PyTorch optimizer based on nonlinear conjugate gradient method☆30Apr 25, 2025Updated 11 months ago
- ☆21Apr 12, 2024Updated 2 years ago
- Repository for Deep Learning Theory papers☆15Jan 24, 2024Updated 2 years ago
- Clean RL implementation using MLX☆34Mar 8, 2024Updated 2 years ago
- Unofficial API for Bilibili live-room danmaku (bullet comments).☆13Sep 13, 2023Updated 2 years ago
- Code for "The Expressive Power of Low-Rank Adaptation".☆20Apr 19, 2024Updated last year
- Repository for CPU Kernel Generation for LLM Inference☆28Jul 13, 2023Updated 2 years ago
- Easily share your custom workflows for anyone to run☆22Oct 17, 2024Updated last year