4-bit Shampoo for Memory-Efficient Network Training (NeurIPS 2024)
☆13 · Feb 13, 2025 · Updated last year
Alternatives and similar repositories for low-bit-Shampoo
Users that are interested in low-bit-Shampoo are comparing it to the libraries listed below.
- Zeroth-Order Fine-Tuning of LLMs in Random Subspaces (ICCV 2025) ☆19 · Nov 22, 2024 · Updated last year
- [WACV 2025] Exploiting VLM Localizability and Semantics for Open Vocabulary Action Detection ☆17 · Mar 23, 2025 · Updated last year
- Single-thread, end-to-end C++ implementation of the Bitnet (1.58-bit weight) model ☆14 · Nov 17, 2024 · Updated last year
- ☆17 · Dec 7, 2025 · Updated 4 months ago
- Official implementation of ICLR 2025 'LORO: Parameter and Memory Efficient Pretraining via Low-rank Riemannian Optimization' ☆16 · Apr 24, 2025 · Updated 11 months ago
- [WACV 2025] 🌍🚗 SpaGBOL: Spatial-Graph-Based Orientated Localisation 📡🗺️ ☆14 · Apr 9, 2025 · Updated last year
- [QT] Random prize-draw wheel (a rewrite of someone else's project) ☆10 · Feb 27, 2019 · Updated 7 years ago
- Arabic Grapheme-to-Phoneme (G2P) Conversion ☆13 · Mar 15, 2025 · Updated last year
- Neural Homomorphic Vocoder optimized for singing voice synthesis ☆28 · Mar 20, 2026 · Updated 3 weeks ago
- Speaker embedding for anime speech domain based on ECAPA_TDNN ☆18 · Jun 22, 2025 · Updated 9 months ago
- Implementation of Q-Learning using TD error to navigate a maze avoiding obstacles and a moving enemy ☆10 · Mar 4, 2018 · Updated 8 years ago
- This is a detailed code demo on how to conduct Full-Param Supervised Fine-tuning (SFT) and DPO (Direct Preference Optimization) ☆19 · Jan 9, 2025 · Updated last year
- Official Implementation of "GRIFFIN: Effective Token Alignment for Faster Speculative Decoding" [NeurIPS 2025] ☆18 · May 12, 2025 · Updated 11 months ago
- PyTorch implementation of Hessian Free optimisation ☆43 · Dec 19, 2019 · Updated 6 years ago
- [IROS 2024] 🦜🌍 BEV-CV: Birds-Eye-View Transform for Cross-View Geo-Localisation 📡🗺️ ☆15 · Mar 4, 2025 · Updated last year
- implementation of https://arxiv.org/pdf/2312.09299 ☆21 · Jul 3, 2024 · Updated last year
- Secure, Multi-Tenant MCP Server Framework for Modern AI ☆28 · Jun 9, 2025 · Updated 10 months ago
- Drax: Speech Recognition with Discrete Flow Matching ☆75 · Oct 15, 2025 · Updated 6 months ago
- ☆33 · Oct 23, 2025 · Updated 5 months ago
- A repo based on XiLin Li's PSGD repo that extends some of the experiments. ☆14 · Oct 7, 2024 · Updated last year
- ☆13 · Apr 25, 2024 · Updated last year
- Griffin: Mixing Gated Linear Recurrences with Local Attention for Efficient Language Models ☆13 · Mar 9, 2024 · Updated 2 years ago
- KANs and MLPs ☆12 · Jun 7, 2024 · Updated last year
- ☆14 · Jun 22, 2025 · Updated 9 months ago
- [ISPRS P&RS'25] Official repository of the paper Cross-View Geo-Localization with Panoramic Street-View and VHR Satellite Imagery in Dece… ☆21 · Nov 10, 2025 · Updated 5 months ago
- Train to 94% on CIFAR-10 in 4.4 seconds on a single A100 ☆12 · Dec 30, 2023 · Updated 2 years ago
- Train a SmolLM-style llm on fineweb-edu in JAX/Flax with an assortment of optimizers. ☆19 · Jul 24, 2025 · Updated 8 months ago
- An official implementation of Style-Talker for Spoken Dialogue Generation ☆23 · Jan 12, 2025 · Updated last year
- This is the official repository for the paper "Learning Sequence Descriptor based on Spatio-Temporal Attention for Visual Place Recogniti… ☆18 · Oct 9, 2023 · Updated 2 years ago
- Maximal Update Parametrization (μP) with Flax & Optax. ☆16 · Dec 27, 2023 · Updated 2 years ago
- A universal multi-cloud data MCP Server supporting over 40 types of data source connections, providing secure, unified data access in a s… ☆32 · Apr 10, 2026 · Updated last week
- ☆173 · Apr 7, 2026 · Updated last week
- A small Rust-based data loader ☆36 · Feb 20, 2026 · Updated last month
- An implementation of DecorrelatedBN in TensorFlow ☆13 · Jun 30, 2022 · Updated 3 years ago
- Whisper Speaker Identification (WSI), a cutting-edge model for multilingual speaker identification. ☆26 · Mar 17, 2025 · Updated last year
- ☆17 · Apr 3, 2026 · Updated 2 weeks ago
- The Okta MCP Server is a groundbreaking tool built by the team at Fctr that enables AI models to interact directly with your Okta environ… ☆39 · Feb 9, 2026 · Updated 2 months ago
- Bullseye Polytope Clean-Label Poisoning Attack ☆15 · Nov 5, 2020 · Updated 5 years ago
- Implementation of the multi-objective genetic optimization algorithm NSGA-II ☆12 · Jun 22, 2025 · Updated 9 months ago