☆82 · Nov 11, 2024 · Updated last year
Alternatives and similar repositories for LoQT
Users that are interested in LoQT are comparing it to the libraries listed below.
- Official implementation of ICLR 2025 'LORO: Parameter and Memory Efficient Pretraining via Low-rank Riemannian Optimization' ☆16 · Apr 24, 2025 · Updated 11 months ago
- This is the official repository for the paper "Flora: Low-Rank Adapters Are Secretly Gradient Compressors" in ICML 2024. ☆106 · Jul 1, 2024 · Updated last year
- Fine-tuning Quantized Neural Networks with Zeroth-order Optimization ☆17 · Sep 17, 2025 · Updated 6 months ago
- ☆17 · Dec 7, 2025 · Updated 3 months ago
- [NeurIPS 2024] VeLoRA: Memory Efficient Training using Rank-1 Sub-Token Projections ☆21 · Oct 15, 2024 · Updated last year
- [EMNLP 2024] Quantize LLM to extremely low-bit, and finetune the quantized LLMs ☆15 · Jul 18, 2024 · Updated last year
- ☆13 · Jan 15, 2025 · Updated last year
- ☆33 · Nov 11, 2024 · Updated last year
- A pure and fast NumPy implementation of Mamba with cache support. ☆18 · Jun 16, 2024 · Updated last year
- ☆56 · Jul 7, 2025 · Updated 8 months ago
- ☆18 · Feb 23, 2026 · Updated last month
- SwiftLet is a lightweight Python framework for running open-source Large Language Models (LLMs) locally using safetensors ☆28 · Aug 6, 2025 · Updated 7 months ago
- WeGeFT: Weight‑Generative Fine‑Tuning for Multi‑Faceted Efficient Adaptation of Large Models ☆22 · Jul 10, 2025 · Updated 8 months ago
- [NeurIPS 2025] Think Silently, Think Fast: Dynamic Latent Compression of LLM Reasoning Chains ☆84 · Jul 29, 2025 · Updated 7 months ago
- This repository contains code for the MicroAdam paper. ☆21 · Dec 14, 2024 · Updated last year
- [ICML2025] LoRA fine-tune directly on the quantized models. ☆39 · Nov 25, 2024 · Updated last year
- ☆20 · Oct 13, 2024 · Updated last year
- The heart of The Pulsar App, fast, secure and shared inference with modern UI ☆59 · Dec 1, 2024 · Updated last year
- Exploring the minimal architecture required for coherent English language generation. ☆12 · Mar 5, 2025 · Updated last year
- ☆15 · Nov 7, 2024 · Updated last year
- High Performance FP8 GEMM Kernels for SM89 and later GPUs. ☆20 · Jan 24, 2025 · Updated last year
- ☆39 · Aug 27, 2024 · Updated last year
- JacQues is a Dash-based interactive web application that facilitates real-time chat and document management. ☆22 · Jan 5, 2026 · Updated 2 months ago
- Code repo for the paper "SpinQuant: LLM quantization with learned rotations" ☆380 · Feb 14, 2025 · Updated last year
- [ICML 2025] From Low Rank Gradient Subspace Stabilization to Low-Rank Weights: Observations, Theories and Applications ☆52 · Oct 30, 2025 · Updated 4 months ago
- ☆22 · Dec 1, 2021 · Updated 4 years ago
- ☆14 · Dec 6, 2023 · Updated 2 years ago
- Rotation equivariance meets local feature matching ☆18 · Oct 20, 2022 · Updated 3 years ago
- 🌳 MCTS-inspired parallel beam search for conversation optimization. Explore multiple dialogue strategies simultaneously, stress-test a… ☆35 · Jan 18, 2026 · Updated 2 months ago
- ☆53 · Oct 29, 2024 · Updated last year
- This is a detailed code demo on how to conduct Full-Param Supervised Fine-tuning (SFT) and DPO (Direct Preference Optimization) ☆18 · Jan 9, 2025 · Updated last year
- Writing Tools, Apple's AI-inspired app, enchants Windows, enhancing your pen with AI LLMs. One hotkey press, system-wide, fixes grammar, … ☆27 · Jul 26, 2025 · Updated 8 months ago
- Run multiple resource-heavy Large Models (LM) on the same machine with limited amount of VRAM/other resources by exposing them on differe… ☆90 · Updated this week
- A Python reimplementation + extension of "Planning with Large Language Models for Code Generation" (https://arxiv.org/abs/2303.05510) ☆18 · Dec 1, 2023 · Updated 2 years ago
- Salient Objects in Clutter, arXiv, 2021 (ECCV 2018 extension). ☆11 · Jun 17, 2021 · Updated 4 years ago
- ☆30 · Aug 27, 2024 · Updated last year
- SLiM: One-shot Quantized Sparse Plus Low-rank Approximation of LLMs (ICML 2025) ☆35 · Nov 28, 2025 · Updated 3 months ago
- ☆13 · Feb 28, 2024 · Updated 2 years ago
- Official code for our paper, "LoRA-Pro: Are Low-Rank Adapters Properly Optimized?" ☆144 · Apr 8, 2025 · Updated 11 months ago