Code repo for "CritiPrefill: A Segment-wise Criticality-based Approach for Prefilling Acceleration in LLMs".
☆16 · Sep 15, 2024 · Updated last year
Alternatives and similar repositories for CritiPrefill
Users who are interested in CritiPrefill are comparing it to the repositories listed below.
- ☆20 · Sep 28, 2024 · Updated last year
- A CUDA runtime environment based on the CUDA Driver API ☆15 · Jul 30, 2025 · Updated 7 months ago
- AceParse: A Comprehensive Dataset with Diverse Structured Texts for Academic Literature Parsing ☆44 · Sep 17, 2024 · Updated last year
- [ICLR 2025] TidalDecode: A Fast and Accurate LLM Decoding with Position Persistent Sparse Attention ☆52 · Aug 6, 2025 · Updated 6 months ago
- PyTorch code for "ADEM-VL: Adaptive and Embedded Fusion for Efficient Vision-Language Tuning" ☆21 · Oct 28, 2024 · Updated last year
- CVPR 2025 Workshop on CVEU. ☆42 · Jun 12, 2025 · Updated 8 months ago
- [ICLR 2025🔥] D2O: Dynamic Discriminative Operations for Efficient Long-Context Inference of Large Language Models ☆27 · Jul 7, 2025 · Updated 7 months ago
- ☆21 · Apr 17, 2025 · Updated 10 months ago
- The Official Implementation of Ada-KV [NeurIPS 2025] ☆128 · Nov 26, 2025 · Updated 3 months ago
- ☆37 · Oct 16, 2025 · Updated 4 months ago
- ☆43 · Mar 15, 2025 · Updated 11 months ago
- VFM-Det: Towards High-Performance Vehicle Detection via Large Foundation Models ☆37 · Apr 9, 2025 · Updated 10 months ago
- ☆53 · Feb 24, 2026 · Updated last week
- AAPL: Adding Attributes to Prompt Learning for Vision-Language Models (CVPRw 2024) ☆34 · May 8, 2024 · Updated last year
- Official Code for paper "Towards Efficient and Effective Unlearning of Large Language Models for Recommendation" (Frontiers of Computer S… ☆38 · Jul 19, 2024 · Updated last year
- ☆88 · Sep 10, 2025 · Updated 5 months ago
- FocusLLM: Scaling LLM’s Context by Parallel Decoding ☆44 · Dec 8, 2024 · Updated last year
- Protocol buffers and other common resources. ☆13 · Jan 20, 2026 · Updated last month
- Keyword extraction using Scake, KeyBERT, Fine-tuning Transformer BERT-like models and ChatGPT. ☆12 · May 22, 2023 · Updated 2 years ago
- Official implementation of the paper "Pretraining Language Models to Ponder in Continuous Space" ☆25 · Jul 21, 2025 · Updated 7 months ago
- ☆10 · Oct 13, 2024 · Updated last year
- Federated Transformer (NeurIPS 24): a framework to enhance the performance of multi-party Vertical Federated Learning involving fuzzy ide… ☆41 · Dec 14, 2024 · Updated last year
- [ICML 2025] Efficiently Serving Large Multimodal Models Using EPD Disaggregation ☆22 · May 29, 2025 · Updated 9 months ago
- Model explanation provides the ability to interpret the effect of the predictors on the composition of an individual score. ☆13 · Jan 21, 2021 · Updated 5 years ago
- A repo to keep all resources about interpretability in NLP organised and up to date ☆12 · Nov 22, 2020 · Updated 5 years ago
- Code and data releases for the paper -- DelTA: An Online Document-Level Translation Agent Based on Multi-Level Memory ☆59 · Feb 10, 2025 · Updated last year
- SemBleu: A Robust Metric for AMR Parsing Evaluation ☆12 · Feb 22, 2021 · Updated 5 years ago
- The repo of the Doc2SoarGraph framework ☆10 · Sep 17, 2024 · Updated last year
- ☆10 · Jun 12, 2019 · Updated 6 years ago
- The official implementation of the paper "Self-Updatable Large Language Models by Integrating Context into Model Parameters" ☆15 · May 18, 2025 · Updated 9 months ago
- ☆18 · Jun 23, 2025 · Updated 8 months ago
- This is a project based on opencv-python which estimates height of an object based upon its picture. It uses the height reference of a … ☆10 · Dec 11, 2020 · Updated 5 years ago
- Code for Rethinking Prompt Optimizers: From Prompt Merits to Optimization ☆12 · Jan 12, 2026 · Updated last month
- Scikit-learn vectorizer implementing "A simple but tough-to-beat baseline for sentence embeddings." by Arora, Sanjeev, Yingyu Liang, and … ☆12 · Apr 1, 2018 · Updated 7 years ago
- 🚀 LLM inference optimization simulator, modeling compute-bound prefill and memory-bound decode phases. ☆13 · Jul 12, 2025 · Updated 7 months ago
- Stack & Orchestrate MCP Tools: The Scikit-Learn-Pipeline Way, For LLMs ☆16 · Sep 20, 2025 · Updated 5 months ago
- AloePlayer: a cross-platform local media player. ☆17 · Jan 24, 2026 · Updated last month
- A method for evaluating the high-level coherence of machine-generated texts. Identifies high-level coherence issues in transformer-based … ☆11 · Mar 18, 2023 · Updated 2 years ago
- ☆19 · Jul 21, 2025 · Updated 7 months ago