[ICLR2026] The first W4A4KV4 quantized + 50% sparse LLMs!
☆26Jan 26, 2026Updated 2 months ago
Alternatives and similar repositories for OBR
Users that are interested in OBR are comparing it to the libraries listed below.
- [ICCV2025] Generate one 2K image on a single 24GB 3090 GPU!☆84Sep 8, 2025Updated 7 months ago
- An official implementation of Random Policy Valuation is Enough for LLM Reasoning with Verifiable Rewards☆37Oct 3, 2025Updated 6 months ago
- Official implementation for LaCo (EMNLP 2024 Findings)☆21Oct 3, 2024Updated last year
- Qwen3-0.6B megakernel: 527 tok/s decode on RTX 3090 (3.8x faster than PyTorch)☆86Feb 10, 2026Updated 2 months ago
- GitHub repo for OATS: Outlier-Aware Pruning through Sparse and Low Rank Decomposition☆19Apr 16, 2025Updated 11 months ago
- Official implementation for "Pruning Large Language Models with Semi-Structural Adaptive Sparse Training" (AAAI 2025)☆19Jul 1, 2025Updated 9 months ago
- Minute-long video generation at 24FPS.☆61Mar 28, 2026Updated 2 weeks ago
- Implementation of the paper 'Spec-VLA: Speculative Decoding for Vision-Language-Action Models with Relaxed Acceptance' (EMNLP 2025)☆27Dec 16, 2025Updated 3 months ago
- Official Code of The Combinatorial Brain Surgeon: Pruning Weights That Cancel One Another in Neural Networks [ICML2022]☆16Sep 20, 2022Updated 3 years ago
- This repo contains the code for studying the interplay between quantization and sparsity methods☆26Feb 26, 2025Updated last year
- Official PyTorch implementation of CD-MOE☆12Mar 18, 2026Updated 3 weeks ago
- Official repository for paper: O1-Pruner: Length-Harmonizing Fine-Tuning for O1-Like Reasoning Pruning☆98Feb 21, 2025Updated last year
- SLiM: One-shot Quantized Sparse Plus Low-rank Approximation of LLMs (ICML 2025)☆35Nov 28, 2025Updated 4 months ago
- [ICML 2025] This is the official PyTorch implementation of "ZipAR: Accelerating Auto-regressive Image Generation through Spatial Locality…☆53Mar 25, 2025Updated last year
- [ICCV-2023] EMQ: Evolving Training-free Proxies for Automated Mixed Precision Quantization☆28Dec 6, 2023Updated 2 years ago
- super-resolution; post-training quantization; model compression☆14Nov 10, 2023Updated 2 years ago
- [PR 2024] HTQ: Exploring the High-Dimensional Trade-Off of Mixed-Precision Quantization☆12Jul 16, 2024Updated last year
- [ACMMM 2024] Consistent123: One Image to Highly Consistent 3D Asset Using Case-Aware Diffusion Priors☆25Oct 22, 2024Updated last year
- My academic homepage☆15Jan 15, 2022Updated 4 years ago
- Benchmark tests supporting the TiledCUDA library.☆18Nov 19, 2024Updated last year
- [NeurIPS 2024] ENAT: Rethinking Spatial-temporal Interactions in Token-based Image Synthesis☆25Nov 28, 2024Updated last year
- Code for "RSQ: Learning from Important Tokens Leads to Better Quantized LLMs"☆21Mar 25, 2026Updated 3 weeks ago
- Created a simple neural network using C++17 standard and the Eigen library that supports both forward and backward propagation.☆11Jul 27, 2024Updated last year
- Automated sum-of-squares (SOS) Prover for Algebraic Inequalities | Python-based tool with GUI & API | Generates readable sum-of-squares p…☆32Apr 4, 2026Updated last week
- PyTorch implementation of our ECCV 2022 paper: Fine-grained Data Distribution Alignment for Post-Training Quantization☆16Sep 13, 2022Updated 3 years ago
- A simple and efficient memory pool implemented in C++11.☆10Jun 2, 2022Updated 3 years ago
- We present Global Search Optics (GSO) to automatically design compact computational imaging systems.☆12Mar 19, 2025Updated last year
- [NeurIPS 2024 Oral🔥] DuQuant: Distributing Outliers via Dual Transformation Makes Stronger Quantized LLMs.☆178Oct 3, 2024Updated last year
- People detector based on SSD☆13Mar 7, 2021Updated 5 years ago
- Database kernel notes☆13Aug 18, 2022Updated 3 years ago
- Codes for skeleton extraction from point clouds. Created by Xin Li☆18Nov 22, 2021Updated 4 years ago
- Code to reproduce the experiments of the ICLR 2024 paper: "Sparse Model Soups: A Recipe for Improved Pruning via Model Averaging"☆12Oct 14, 2025Updated 6 months ago