ChengZhang-98 / llm-mixed-q
mixed-precision quantization for LLMs
☆13 Updated last year
Related projects
Alternatives and complementary repositories for llm-mixed-q
- ☆79 Updated 11 months ago
- ☆20 Updated this week
- Codebase for ICML'24 paper: Learning from Students: Applying t-Distributions to Explore Accurate and Efficient Formats for LLMs ☆24 Updated 4 months ago
- Tender: Accelerating Large Language Models via Tensor Decomposition and Runtime Requantization (ISCA'24) ☆12 Updated 4 months ago
- Simulator for BitFusion ☆90 Updated 4 years ago
- MICRO22 artifact evaluation for Sparseloop ☆38 Updated 2 years ago
- ☆33 Updated 4 months ago
- BSQ: Exploring Bit-Level Sparsity for Mixed-Precision Neural Network Quantization (ICLR 2021) ☆36 Updated 3 years ago
- [HPCA'21] SpAtten: Efficient Sparse Attention Architecture with Cascade Token and Head Pruning ☆75 Updated 2 months ago
- ☆19 Updated 3 years ago
- Neural Network Quantization With Fractional Bit-widths ☆12 Updated 3 years ago
- ☆18 Updated 2 years ago
- Official implementation of "Searching for Winograd-aware Quantized Networks" (MLSys'20) ☆27 Updated last year
- ☆24 Updated 7 months ago
- ☆38 Updated 7 months ago
- Repository for artifact evaluation of ASPLOS 2023 paper "SparseTIR: Composable Abstractions for Sparse Compilation in Deep Learning" ☆23 Updated last year
- Linux docker for the DNN accelerator exploration infrastructure composed of Accelergy and Timeloop ☆45 Updated this week
- A co-design architecture on sparse attention ☆44 Updated 3 years ago
- [ICASSP'20] DNN-Chip Predictor: An Analytical Performance Predictor for DNN Accelerators with Various Dataflows and Hardware Architecture… ☆22 Updated 2 years ago
- QuickEst repository: Quick Estimation of Quality of Results ☆26 Updated 6 years ago
- ☆31 Updated 3 years ago
- ☆81 Updated 4 months ago
- Multi-core HW accelerator mapping optimization framework for layer-fused ML workloads. ☆38 Updated this week
- Torch2Chip (MLSys, 2024) ☆50 Updated 2 months ago
- RTL implementation of Flex-DPE. ☆89 Updated 4 years ago
- ☆22 Updated last year
- A scheduler for spatial DNN accelerators that generate high-performance schedules in one shot using mixed integer programming (MIP) ☆74 Updated last year
- A Spatial Accelerator Generation Framework for Tensor Algebra. ☆52 Updated 2 years ago
- ☆20 Updated 2 years ago
- ☆30 Updated 4 years ago