Learning to Compress Prompts with Gist Tokens - https://arxiv.org/abs/2304.08467
☆311, updated Feb 14, 2025
Alternatives and similar repositories for gisting
Users interested in gisting are comparing it to the libraries listed below.
- [EMNLP 2023] Adapting Language Models to Compress Long Contexts ☆333, updated Sep 9, 2024
- PyTorch implementation for "Compressed Context Memory For Online Language Model Interaction" (ICLR '24) ☆63, updated Apr 18, 2024
- ☆306, updated Jul 10, 2025
- ☆18, updated Dec 2, 2024
- The repo for In-context Autoencoder ☆166, updated May 11, 2024
- [EMNLP 2023] Context Compression for Auto-regressive Transformers with Sentinel Tokens ☆25, updated Nov 6, 2023
- Code for "Retaining Key Information under High Compression Rates: Query-Guided Compressor for LLMs" (ACL 2024) ☆18, updated Jun 12, 2024
- ☆10, updated Feb 12, 2024
- TART: A plug-and-play Transformer module for task-agnostic reasoning ☆202, updated Jun 22, 2023
- [NeurIPS '23] H2O: Heavy-Hitter Oracle for Efficient Generative Inference of Large Language Models ☆507, updated Aug 1, 2024
- Learning adapter weights from task descriptions ☆19, updated Nov 12, 2023
- Leveraging passage embeddings for efficient listwise reranking with large language models ☆50, updated Dec 7, 2024
- The official repo for "LLoCO: Learning Long Contexts Offline" ☆118, updated Jun 15, 2024
- Understand and test language model architectures on synthetic tasks ☆262, updated Mar 16, 2026
- Official implementation of the paper "Pretraining Language Models to Ponder in Continuous Space" ☆25, updated Jul 21, 2025
- ☆14, updated Oct 3, 2024
- Companion repository to "Prompt Compression and Contrastive Conditioning for Controllability and Toxicity Reduction in Language Models" ☆14, updated May 31, 2023
- Repository for "Propagating Knowledge Updates to LMs Through Distillation" (NeurIPS 2023) ☆26, updated Aug 25, 2024
- YaRN: Efficient Context Window Extension of Large Language Models ☆1,685, updated Apr 17, 2024
- Code for the arXiv 2023 paper "Improving Language Model Negotiation with Self-Play and In-Context Learning from AI Feedback" ☆209, updated May 24, 2023
- Salesforce open-source LLMs with 8k sequence length ☆726, updated Jan 31, 2025
- [ACL 2025 Main] Repository for the paper "500xCompressor: Generalized Prompt Compression for Large Language Models" ☆56, updated Mar 9, 2026
- ☆21, updated Jan 16, 2025
- Official implementation for [ICLR '26] "DefensiveKV: Taming the Fragility of KV Cache Eviction in LLM Inference" ☆30, updated Mar 15, 2026
- Code repository for the c-BTM paper ☆108, updated Sep 26, 2023
- [ICML 2023] "Data Efficient Neural Scaling Law via Model Reusing" by Peihao Wang, Rameswar Panda, Zhangyang Wang ☆14, updated Jan 4, 2024
- Aioli: A unified optimization framework for language model data mixing ☆32, updated Jan 17, 2025
- LongBench v2 and LongBench (ACL '25 & '24) ☆1,122, updated Jan 15, 2025
- Running inference on the ZeroSCROLLS benchmark ☆20, updated Apr 18, 2024
- BABILong: a benchmark for LLM evaluation using the needle-in-a-haystack approach ☆241, updated Sep 2, 2025
- Prompt programming with FMs ☆444, updated Jul 22, 2024
- Tools for understanding how transformer predictions are built layer by layer ☆574, updated Aug 7, 2025
- [EMNLP 2022] Code for the paper "ZeroGen: Efficient Zero-shot Learning via Dataset Generation" ☆16, updated Feb 18, 2022
- Open-sourced GEC system submitted by THU KELab (sz) to CCL2023-CLTC Track 1: Multidimensional Chinese Learner Tex… ☆15, updated Nov 25, 2023
- ☆95, updated Dec 19, 2024
- Simple next-token-prediction for RLHF ☆229, updated Sep 30, 2023
- Benchmarking large language models' complex reasoning ability with chain-of-thought prompting ☆2,769, updated Aug 4, 2024
- Official implementation of SEA: Sparse Linear Attention with Estimated Attention Mask (ICLR 2024) ☆11, updated Jun 20, 2025
- [NeurIPS 2024] Source code for xRAG: Extreme Context Compression for Retrieval-augmented Generation with One Token ☆173, updated Jul 4, 2024