xfhelen / MMBench
An end-to-end benchmark suite of multi-modal DNN applications for system-architecture co-design
☆22 · Updated last month
Alternatives and similar repositories for MMBench:
Users interested in MMBench are comparing it to the libraries listed below.
- ☆11 · Updated 7 months ago
- ☆19 · Updated last month
- The official PyTorch implementation of the NeurIPS 2022 (spotlight) paper, Outlier Suppression: Pushing the Limit of Low-bit Transformer Language Models ☆47 · Updated 2 years ago
- [NeurIPS 2023] ShiftAddViT: Mixture of Multiplication Primitives Towards Efficient Vision Transformer ☆32 · Updated last year
- This repo contains the source code for: Model Tells You What to Discard: Adaptive KV Cache Compression for LLMs ☆32 · Updated 5 months ago
- A curated list of early-exiting papers (LLM, CV, NLP, etc.) ☆38 · Updated 5 months ago
- Official Repo for SparseLLM: Global Pruning of LLMs (NeurIPS 2024) ☆52 · Updated last month
- Code repository of "Evaluating Quantized Large Language Models" ☆114 · Updated 4 months ago
- [AAAI 2024] Fluctuation-based Adaptive Structured Pruning for Large Language Models ☆42 · Updated last year
- LLM Inference with Microscaling Format ☆17 · Updated 2 months ago
- [DAC 2024] EDGE-LLM: Enabling Efficient Large Language Model Adaptation on Edge Devices via Layerwise Unified Compression and Adaptive Layer Tuning and Voting ☆40 · Updated 7 months ago
- PyTorch implementation of our paper accepted by ICML 2024 -- CaM: Cache Merging for Memory-efficient LLMs Inference ☆31 · Updated 7 months ago
- ☆48 · Updated 9 months ago
- [ICLR 2024] This is the official PyTorch implementation of "QLLM: Accurate and Efficient Low-Bitwidth Quantization for Large Language Models" ☆35 · Updated 10 months ago
- The code repository of "MBQ: Modality-Balanced Quantization for Large Vision-Language Models"☆28Updated last month
- Code for ICML 2021 submission☆35Updated 3 years ago
- ☆18Updated 2 months ago
- Curated collection of papers on MoE model inference ☆41 · Updated last week
- Code for the NeurIPS 2022 paper "Optimal Brain Compression: A Framework for Accurate Post-Training Quantization and Pruning" ☆109 · Updated last year
- The official implementation of the DAC 2024 paper GQA-LUT ☆11 · Updated last month
- ☆41 · Updated 2 years ago
- ☆49 · Updated last year
- Quantized Side Tuning: Fast and Memory-Efficient Tuning of Quantized Large Language Models ☆41 · Updated 2 months ago
- ☆90 · Updated last year
- Codebase for ICML'24 paper: Learning from Students: Applying t-Distributions to Explore Accurate and Efficient Formats for LLMs ☆24 · Updated 7 months ago
- [ICML 2023] This project is the official implementation of our accepted ICML 2023 paper BiBench: Benchmarking and Analyzing Network Binarization ☆54 · Updated 10 months ago
- SeerAttention: Learning Intrinsic Sparse Attention in Your LLMs ☆23 · Updated 2 months ago
- Official implementation of the EMNLP 2023 paper: Outlier Suppression+: Accurate quantization of large language models by equivalent and optimal shifting and scaling ☆45 · Updated last year
- Efficient Expert Pruning for Sparse Mixture-of-Experts Language Models: Enhancing Performance and Reducing Inference Costs ☆15 · Updated last month
- [ICLR 2025] TidalDecode: Fast and Accurate LLM Decoding with Position Persistent Sparse Attention ☆26 · Updated last month