☆35 Mar 12, 2025 Updated last year
Alternatives and similar repositories for SPAM-Optimizer
Users who are interested in SPAM-Optimizer are comparing it to the libraries listed below.
- ☆13 Apr 1, 2026 Updated last week
- [ICLR 2025] Official Pytorch Implementation of "Mix-LN: Unleashing the Power of Deeper Layers by Combining Pre-LN and Post-LN" by Pengxia… ☆29 Jul 24, 2025 Updated 8 months ago
- Repository for the introductory course on reinforcement learning. ☆15 Feb 2, 2025 Updated last year
- Learning to Skip the Middle Layers of Transformers ☆17 Aug 7, 2025 Updated 8 months ago
- ☆15 Sep 22, 2024 Updated last year
- Work in progress. ☆79 Nov 25, 2025 Updated 4 months ago
- ☆15 Mar 2, 2025 Updated last year
- [ICML 2024] Official Repository for the paper "Transformers Get Stable: An End-to-End Signal Propagation Theory for Language Models" ☆10 Jul 19, 2024 Updated last year
- Official PyTorch Implementation for Paper "No More Adam: Learning Rate Scaling at Initialization is All You Need" ☆56 Jan 27, 2025 Updated last year
- [ACL 2025] Outlier-Safe Pre-Training for Robust 4-Bit Quantization of Large Language Models ☆36 Nov 4, 2025 Updated 5 months ago
- Kinetics: Rethinking Test-Time Scaling Laws ☆87 Jul 11, 2025 Updated 9 months ago
- ☆27 Mar 29, 2025 Updated last year
- Official implementation of ECCV24 paper: POA ☆24 Aug 8, 2024 Updated last year
- Q-GaLore: Quantized GaLore with INT4 Projection and Layer-Adaptive Low-Rank Gradients. ☆205 Jul 17, 2024 Updated last year
- An official repository for GPTailor ☆17 Jun 29, 2025 Updated 9 months ago
- Entropy-Driven GRPO with Guided Error Correction for Advantage Diversity ☆22 Aug 28, 2025 Updated 7 months ago
- Control LLM ☆22 Apr 6, 2025 Updated last year
- Analysing ML conference data and plotting interesting statistics.