[ICLR2026] The first W4A4KV4 quantized + 50% sparse LLMs!
☆25 · Jan 26, 2026 · Updated 2 months ago
Alternatives and similar repositories for OBR
Users that are interested in OBR are comparing it to the libraries listed below.
- [ICCV2025] Generate one 2K image on a single 24GB 3090 GPU! ☆84 · Sep 8, 2025 · Updated 6 months ago
- An official implementation of Random Policy Valuation is Enough for LLM Reasoning with Verifiable Rewards ☆37 · Oct 3, 2025 · Updated 5 months ago
- [ICML2025] LoRA fine-tune directly on the quantized models. ☆39 · Nov 25, 2024 · Updated last year
- Qwen3-0.6B megakernel: 527 tok/s decode on RTX 3090 (3.8x faster than PyTorch) ☆83 · Feb 10, 2026 · Updated last month
- [CVPR'26 Findings] Source code for "RADSeg Unleashing Parameter and Compute Efficient Zero-Shot Open-Vocabulary Segmentation Using Agglom… ☆36 · Mar 7, 2026 · Updated 2 weeks ago
- GitHub repo for OATS: Outlier-Aware Pruning through Sparse and Low Rank Decomposition ☆18 · Apr 16, 2025 · Updated 11 months ago
- Official implementation for "Pruning Large Language Models with Semi-Structural Adaptive Sparse Training" (AAAI 2025) ☆18 · Jul 1, 2025 · Updated 8 months ago
- Minute-long video generation at 24FPS. ☆59 · Feb 2, 2026 · Updated last month
- [KDD'25] Combinatorial Optimization Perspective based Framework for Multi-behavior Recommendation ☆18 · Jan 7, 2025 · Updated last year
- Implementation of the paper 'Spec-VLA: Speculative Decoding for Vision-Language-Action Models with Relaxed Acceptance' (EMNLP 2025) ☆28 · Dec 16, 2025 · Updated 3 months ago
- Official code for "The Combinatorial Brain Surgeon: Pruning Weights That Cancel One Another in Neural Networks" [ICML2022] ☆17 · Sep 20, 2022 · Updated 3 years ago
- This repo contains the code for studying the interplay between quantization and sparsity methods ☆26 · Feb 26, 2025 · Updated last year
- NEU Surface defect classification with ResNet. ☆15 · Jul 4, 2023 · Updated 2 years ago
- Official PyTorch implementation of CD-MOE ☆12 · Mar 18, 2026 · Updated last week
- Official repository for paper: O1-Pruner: Length-Harmonizing Fine-Tuning for O1-Like Reasoning Pruning ☆98 · Feb 21, 2025 · Updated last year
- EfficientFlow: Efficient Equivariant Flow Policy Learning for Embodied AI ☆24 · Jan 17, 2026 · Updated 2 months ago
- image demoireing, moire synthesis ☆16 · Apr 25, 2024 · Updated last year
- Music Modeling Kit ☆22 · Jan 10, 2025 · Updated last year
- SLiM: One-shot Quantized Sparse Plus Low-rank Approximation of LLMs (ICML 2025) ☆35 · Nov 28, 2025 · Updated 3 months ago
- [ICML 2025] This is the official PyTorch implementation of "ZipAR: Accelerating Auto-regressive Image Generation through Spatial Locality… ☆53 · Mar 25, 2025 · Updated last year
- [ICCV-2023] EMQ: Evolving Training-free Proxies for Automated Mixed Precision Quantization ☆28 · Dec 6, 2023 · Updated 2 years ago
- super-resolution; post-training quantization; model compression ☆14 · Nov 10, 2023 · Updated 2 years ago
- [PR 2024] HTQ: Exploring the High-Dimensional Trade-Off of Mixed-Precision Quantization ☆12 · Jul 16, 2024 · Updated last year
- ☆15 · Mar 21, 2025 · Updated last year
- Persistent dense gemm for Hopper in `CuTeDSL` ☆15 · Aug 9, 2025 · Updated 7 months ago
- [ACMMM 2024] Consistent123: One Image to Highly Consistent 3D Asset Using Case-Aware Diffusion Priors ☆25 · Oct 22, 2024 · Updated last year
- Benchmark tests supporting the TiledCUDA library. ☆18 · Nov 19, 2024 · Updated last year
- My academic homepage ☆15 · Jan 15, 2022 · Updated 4 years ago
- ☆40 · Nov 22, 2025 · Updated 4 months ago
- [NeurIPS 2024] COVE: Unleashing the Diffusion Feature Correspondence for Consistent Video Editing ☆25 · Dec 8, 2024 · Updated last year
- Code for "RSQ: Learning from Important Tokens Leads to Better Quantized LLMs" ☆21 · Mar 17, 2026 · Updated last week
- Implementation of Effective Sparsification of Neural Networks with Global Sparsity Constraint ☆31 · Mar 24, 2022 · Updated 4 years ago
- A simple neural network written in C++17 with the Eigen library, supporting both forward and backward propagation. ☆10 · Jul 27, 2024 · Updated last year
- Automated sum-of-squares (SOS) Prover for Algebraic Inequalities | Python-based tool with GUI & API | Generates readable sum-of-squares p… ☆31 · Mar 13, 2026 · Updated 2 weeks ago
- ☆10 · Mar 2, 2024 · Updated 2 years ago
- A simple and efficient memory pool implemented in C++11. ☆10 · Jun 2, 2022 · Updated 3 years ago
- A lightweight logging library to trace C++ variables. ☆15 · Feb 10, 2026 · Updated last month
- PyTorch implementation of our ECCV 2022 paper "Fine-grained Data Distribution Alignment for Post-Training Quantization" ☆16 · Sep 13, 2022 · Updated 3 years ago
- [ACL 2024] Official PyTorch implementation of "IntactKV: Improving Large Language Model Quantization by Keeping Pivot Tokens Intact" ☆48 · May 24, 2024 · Updated last year