Zefan-Cai / KVCache-Factory
Unified KV Cache Compression Methods for LLMs
☆728 Updated this week
Related projects
Alternatives and complementary repositories for KVCache-Factory
- An acceleration library that supports arbitrary bit-width combinatorial quantization operations ☆226 Updated last month
- [COLM 2024] TriForce: Lossless Acceleration of Long Sequence Generation with Hierarchical Speculative Decoding ☆230 Updated 2 months ago
- The Official Repo of ML-Bench: Evaluating Large Language Models and Agents for Machine Learning Tasks on Repository-Level Code (https://a… ☆355 Updated this week
- The framework to prune LLMs to any size and any config. ☆99 Updated 8 months ago
- The repository for the paper titled "Leopard: A Vision Language Model For Text-Rich Multi-Image Tasks" ☆184 Updated 3 weeks ago
- ShadowKV: KV Cache in Shadows for High-Throughput Long-Context LLM Inference ☆124 Updated 3 weeks ago
- MIXQ: Taming Dynamic Outliers in Mixed-Precision Quantization by Online Prediction ☆81 Updated 3 weeks ago
- [NeurIPS'24] Training-Free Adaptive Diffusion with Bounded Difference Approximation Strategy ☆48 Updated 2 weeks ago
- Support mixed-precision inference with vLLM ☆95 Updated 2 weeks ago
- Reverse Chain-of-Thought Problem Generation for Geometric Reasoning in Large Multimodal Models ☆129 Updated 2 weeks ago
- Multilingual Corpus of Web Fiction ☆216 Updated 4 months ago
- Benchmarking LLMs via Uncertainty Quantification ☆221 Updated 9 months ago
- The official implementation of Self-Play Preference Optimization (SPPO) ☆498 Updated 3 months ago
- The code for "Uni-MoE: Scaling Unified Multimodal Models with Mixture of Experts" ☆770 Updated 2 months ago
- Code for paper "GenTranslate: Large Language Models are Generative Multilingual Speech and Machine Translators" ☆225 Updated 3 months ago
- Official Code for "SVD-LLM: Truncation-aware Singular Value Decomposition for Large Language Model Compression" ☆95 Updated last month
- Evaluation of Text-to-Video Generation Models: A Dynamics Perspective [NeurIPS 2024]. ☆337 Updated 3 weeks ago
- One-stop data intelligence agent, providing insights from all mainstream data formats in a single dialogue box, including documents, data… ☆504 Updated 2 weeks ago
- ☆140 Updated 3 years ago
- [NeurIPS 2024] Matryoshka Query Transformer for Large Vision-Language Models ☆97 Updated 4 months ago
- Explore concepts like Self-Correct, Self-Refine, Self-Improve, Self-Contradict, Self-Play, and Self-Knowledge, alongside o1-like reasonin… ☆161 Updated this week
- [NeurIPS 2022] Official Code for REVIVE: Regional Visual Representation Matters in Knowledge-Based Visual Question Answering ☆134 Updated 2 months ago
- ☆17 Updated 2 years ago
- DocGenome: An Open Large-scale Scientific Document Benchmark for Training and Testing Multi-modal Large Models ☆134 Updated 2 months ago
- Mixed precision inference by TensorRT-LLM ☆93 Updated 3 weeks ago
- Awesome LLMs on Device: A Comprehensive Survey ☆931 Updated last month
- (AAAI 2024) BLIVA: A Simple Multimodal LLM for Better Handling of Text-rich Visual Questions ☆269 Updated 7 months ago
- Unofficial Implementation of ReplaceAnything: https://aigcdesigngroup.github.io/replace-anything/ ☆528 Updated 5 months ago
- Vision-Centric Evaluation for Retrieval-Augmented Multimodal Models ☆29 Updated 2 weeks ago
- We leverage 14 datasets as OOD test data and conduct evaluations on 8 NLU tasks over 21 popularly used models. Our findings confirm that … ☆117 Updated last year