SensiMix: Sensitivity-Aware 8-bit Index & 1-bit Value Mixed Precision Quantization for BERT Compression (PLOS One)
☆34Aug 22, 2025Updated 8 months ago
Alternatives and similar repositories for SensiMix
Users who are interested in SensiMix are comparing it to the libraries listed below.
- Model-Agnostic Augmentation for Accurate Graph Classification (WWW 2022)☆20Aug 22, 2025Updated 8 months ago
- [ICML 2024] Sparse Model Inversion: Efficient Inversion of Vision Transformers with Less Hallucination☆14Apr 29, 2025Updated last year
- 커버리스트 - AI service for generating book covers☆13Sep 11, 2022Updated 3 years ago
- ☆12Oct 9, 2023Updated 2 years ago
- ☆45Jun 25, 2025Updated 10 months ago
- Code for paper: Unraveling the Shift of Visual Information Flow in MLLMs: From Phased Interaction to Efficient Inference☆13Jun 7, 2025Updated 11 months ago
- Structured Pruning Adapters in PyTorch☆19Aug 30, 2023Updated 2 years ago
- A deep model that can accurately produce dense depth maps given an RGB image with known depth at a very sparse set of pixels.☆24Sep 26, 2020Updated 5 years ago
- An introductory graduate course on "Computer Vision for Embedded Systems"☆20Jun 12, 2022Updated 3 years ago
- Technique for explaining atomic factual relations in generated Korean documents☆17Dec 16, 2024Updated last year
- ☆29Nov 10, 2024Updated last year
- Produces a depth map for a single image using a pre-trained deep learning depth-net model.☆18May 7, 2019Updated 7 years ago
- A PyTorch implementation of real XNOR-popcount (1-bit op) GEMM: a Linear PyTorch extension supporting both CPU and CUDA☆25Jun 6, 2023Updated 2 years ago
- AutoRAG example about benchmarking Korean embeddings.☆44Oct 2, 2024Updated last year
- ☆45Oct 14, 2021Updated 4 years ago
- [ICLR 2023] RC-MAE☆52Dec 18, 2023Updated 2 years ago
- ☆42Oct 31, 2024Updated last year
- [ICCV 2025] SparseMM: Head Sparsity Emerges from Visual Concept Responses in MLLMs☆85Jan 17, 2026Updated 4 months ago
- [ECCV 2024] Efficient Inference of Vision Instruction-Following Models with Elastic Cache☆43Jul 26, 2024Updated last year
- PyTorch Transformer from scratch: BERT, ELECTRA pretraining, and Ko→En translation.☆54Jan 12, 2026Updated 4 months ago
- [CVPR 2025] DivPrune: Diversity-based Visual Token Pruning for Large Multimodal Models☆77Apr 16, 2026Updated last month
- ☆66Jan 23, 2026Updated 3 months ago
- An algorithm for weight-activation quantization (W4A4, W4A8) of LLMs, supporting both static and dynamic quantization☆171Nov 26, 2025Updated 5 months ago
- [DEPRECATED] A user script that adds convenient features to the Kwangwoon University KLAS site☆66Apr 15, 2023Updated 3 years ago
- Vision Language Models are Biased☆113Jan 26, 2026Updated 3 months ago
- [AAAI 2024] Fluctuation-based Adaptive Structured Pruning for Large Language Models☆75Jan 6, 2024Updated 2 years ago
- Prune transformer layers☆74May 30, 2024Updated last year
- [ACL2025 Findings] Migician: Revealing the Magic of Free-Form Multi-Image Grounding in Multimodal Large Language Models☆89May 20, 2025Updated last year
- A paper list about Token Merge, Reduce, Resample, Drop for MLLMs.☆89Oct 26, 2025Updated 6 months ago
- In progress.☆68Mar 26, 2024Updated 2 years ago
- ☆107Jun 10, 2025Updated 11 months ago
- Compressed LLMs for Efficient Text Generation [ICLR'24 Workshop]☆90Sep 13, 2024Updated last year
- Official code for paper: [CLS] Attention is All You Need for Training-Free Visual Token Pruning: Make VLM Inference Faster.☆113Jun 29, 2025Updated 10 months ago
- [ICML'24] The official implementation of “Rethinking Optimization and Architecture for Tiny Language Models”☆127Jan 14, 2025Updated last year
- [NeurIPS'25 Oral] Query-agnostic KV cache eviction: 3–4× reduction in memory and 2× decrease in latency (Qwen3/2.5, Gemma3, LLaMA3)☆219Feb 11, 2026Updated 3 months ago
- SUMBT: Slot-Utterance Matching for Universal and Scalable Belief Tracking (ACL 2019)☆90May 3, 2024Updated 2 years ago
- QUICK: Quantization-aware Interleaving and Conflict-free Kernel for efficient LLM inference☆123Mar 6, 2024Updated 2 years ago
- [CVPR 2025 Highlight] Official implementation of "Holmes-VAU: Towards Long-term Video Anomaly Understanding at Any Granularity"☆138Mar 25, 2025Updated last year
- (CVPR 2025) PyramidDrop: Accelerating Your Large Vision-Language Models via Pyramid Visual Redundancy Reduction☆145Mar 6, 2025Updated last year