[ACL 2025 Main] Repository for the paper: 500xCompressor: Generalized Prompt Compression for Large Language Models
☆56 · Jun 11, 2025 · Updated 8 months ago
Alternatives and similar repositories for 500xCompressor
Users who are interested in 500xCompressor are comparing it to the libraries listed below
- [EMNLP 2024] CompAct: Compressing Retrieved Documents Actively for Question Answering ☆38 · Sep 20, 2024 · Updated last year
- FocusLLM: Scaling LLM’s Context by Parallel Decoding ☆44 · Dec 8, 2024 · Updated last year
- A Mechanistic‑Interpretability study that finds the structural dynamics of Large Language Models under fine‑tuning. ☆16 · May 30, 2025 · Updated 9 months ago
- Official implementation of the paper "Pretraining Language Models to Ponder in Continuous Space" ☆25 · Jul 21, 2025 · Updated 7 months ago
- ☆12 · Nov 15, 2022 · Updated 3 years ago
- [NAACL 2025 Main Selected Oral] Repository for the paper: Prompt Compression for Large Language Models: A Survey ☆36 · May 18, 2025 · Updated 9 months ago
- The repo for In-context Autoencoder ☆164 · May 11, 2024 · Updated last year
- Codebase for Hyperdecoders https://arxiv.org/abs/2203.08304 ☆14 · Oct 11, 2022 · Updated 3 years ago
- SELF-GUIDE: Better Task-Specific Instruction Following via Self-Synthetic Finetuning. COLM 2024 Accepted Paper ☆32 · May 29, 2024 · Updated last year
- [NeurIPS 2023] Repetition In Repetition Out: Towards Understanding Neural Text Degeneration from the Data Perspective ☆41 · Oct 17, 2023 · Updated 2 years ago
- [Neurips2024] Source code for xRAG: Extreme Context Compression for Retrieval-augmented Generation with One Token ☆172 · Jul 4, 2024 · Updated last year
- Code repo for "CritiPrefill: A Segment-wise Criticality-based Approach for Prefilling Acceleration in LLMs". ☆16 · Sep 15, 2024 · Updated last year
- ☆18 · Dec 2, 2024 · Updated last year
- AGiXT is a dynamic AI Automation Platform that seamlessly orchestrates instruction management and complex task execution across diverse A… ☆23 · Jan 26, 2026 · Updated last month
- Research work aimed at addressing the problem of modeling infinite-length context ☆46 · Dec 18, 2025 · Updated 2 months ago
- The evaluation framework for training-free sparse attention in LLMs ☆121 · Jan 27, 2026 · Updated last month
- Official implementation of Hierarchical Context Merging: Better Long Context Understanding for Pre-trained LLMs (ICLR 2024). ☆44 · Aug 6, 2024 · Updated last year
- The code for "AttentionPredictor: Temporal Pattern Matters for Efficient LLM Inference", Qingyue Yang, Jie Wang, Xing Li, Zhihai Wang, Ch… ☆28 · Jul 15, 2025 · Updated 7 months ago
- Official repository of paper "Parameters vs. Context: Fine-Grained Control of Knowledge Reliance in Language Models" ☆23 · May 27, 2025 · Updated 9 months ago
- ☆54 · Oct 29, 2024 · Updated last year
- ☆21 · Apr 17, 2025 · Updated 10 months ago
- Compression for Foundation Models ☆35 · Jul 21, 2025 · Updated 7 months ago
- Pytorch Implementation of the Model from "MIRASOL3B: A MULTIMODAL AUTOREGRESSIVE MODEL FOR TIME-ALIGNED AND CONTEXTUAL MODALITIES" ☆26 · Jan 27, 2025 · Updated last year
- An official implementation of "Catastrophic Failure of LLM Unlearning via Quantization" (ICLR 2025) ☆37 · Feb 22, 2025 · Updated last year
- Repository for CPU Kernel Generation for LLM Inference ☆28 · Jul 13, 2023 · Updated 2 years ago
- The original Shared Recurrent Memory Transformer implementation ☆33 · Jul 11, 2025 · Updated 7 months ago
- [ACL 2024] Long-Context Language Modeling with Parallel Encodings ☆169 · Jun 13, 2024 · Updated last year
- Code for paper 'Data-Efficient FineTuning' ☆28 · May 24, 2023 · Updated 2 years ago
- This is an authors' implementation of the NIPS 2022 dataset and Benchmark Track Paper "A Comprehensive Study on Large Scale Graph Trainin… ☆64 · Mar 8, 2023 · Updated 2 years ago
- ☆32 · May 12, 2023 · Updated 2 years ago
- A reimplementation of KOSMOS-1 from "Language Is Not All You Need: Aligning Perception with Language Models" ☆27 · Mar 3, 2023 · Updated 3 years ago
- [EMNLP'24] LongHeads: Multi-Head Attention is Secretly a Long Context Processor ☆31 · Apr 8, 2024 · Updated last year
- ☆38 · Nov 13, 2025 · Updated 3 months ago
- ☆36 · Feb 12, 2025 · Updated last year
- ☆64 · Jul 14, 2025 · Updated 7 months ago
- [ICLR2025] Breaking Throughput-Latency Trade-off for Long Sequences with Speculative Decoding ☆143 · Dec 4, 2024 · Updated last year
- Multi-Stage Vision Token Dropping: Towards Efficient Multimodal Large Language Model ☆37 · Jan 8, 2025 · Updated last year
- Repository for the Q-Filters method (https://arxiv.org/pdf/2503.02812) ☆35 · Mar 7, 2025 · Updated last year
- A simple GPT-based evaluation tool for multi-aspect, interpretable assessment of LLMs. ☆90 · Jan 29, 2024 · Updated 2 years ago