microsoft / LLMLingua
[EMNLP'23, ACL'24] Compresses the prompt and KV-Cache to speed up LLM inference and improve the model's perception of key information, achieving up to 20x compression with minimal performance loss.
☆4,653 Updated this week
Related projects
Alternatives and complementary repositories for LLMLingua
- Supercharge Your LLM Application Evaluations 🚀 ☆7,261 Updated this week
- Tools for merging pretrained large language models. ☆4,816 Updated 2 weeks ago
- SGLang is a fast serving framework for large language models and vision language models. ☆6,127 Updated this week
- Structured Text Generation ☆9,487 Updated this week
- Go ahead and axolotl questions ☆7,930 Updated this week
- A framework for serving and evaluating LLM routers - save LLM costs without compromising quality! ☆3,256 Updated 3 months ago
- Python SDK, Proxy Server (LLM Gateway) to call 100+ LLM APIs in OpenAI format - [Bedrock, Azure, OpenAI, VertexAI, Cohere, Anthropic, Sag… ☆13,971 Updated this week
- Easily use and train state of the art late-interaction retrieval methods (ColBERT) in any RAG pipeline. Designed for modularity and ease-… ☆3,057 Updated 2 months ago
- A blazing fast inference solution for text embeddings models ☆2,846 Updated 2 weeks ago
- [COLM 2024] OpenAgents: An Open Platform for Language Agents in the Wild ☆3,996 Updated this week
- Large Language Model Text Generation Inference ☆9,122 Updated this week
- A library of data loaders for LLMs made by the community -- to be used with LlamaIndex and/or LangChain ☆3,455 Updated 8 months ago
- The LLM Evaluation Framework ☆3,696 Updated this week
- DSPy: The framework for programming—not prompting—language models ☆18,885 Updated this week
- Test your prompts, agents, and RAGs. Red teaming, pentesting, and vulnerability scanning for LLMs. Compare performance of GPT, Claude, Ge… ☆4,784 Updated this week
- Letta (formerly MemGPT) is a framework for creating LLM services with memory. ☆12,838 Updated this week
- A unified evaluation framework for large language models ☆2,465 Updated 3 weeks ago
- NeMo Guardrails is an open-source toolkit for easily adding programmable guardrails to LLM-based conversational systems. ☆4,190 Updated this week
- Adding guardrails to large language models. ☆4,127 Updated this week
- Official implementation for the paper: "Code Generation with AlphaCodium: From Prompt Engineering to Flow Engineering" ☆3,639 Updated 3 weeks ago
- An awesome & curated list of best LLMOps tools for developers ☆4,021 Updated this week
- structured outputs for llms ☆8,225 Updated this week
- Harness LLMs with Multi-Agent Programming ☆2,664 Updated this week
- Open source libraries and APIs to build custom preprocessing pipelines for labeling, training, or production machine learning pipelines. ☆9,182 Updated this week
- Build resilient language agents as graphs. ☆6,715 Updated this week
- A fast inference library for running LLMs locally on modern consumer-class GPUs ☆3,680 Updated this week
- Superfast AI decision making and intelligent processing of multi-modal data. ☆2,115 Updated this week
- [ICML 2024] LLMCompiler: An LLM Compiler for Parallel Function Calling ☆1,529 Updated 4 months ago
- 🪢 Open source LLM engineering platform: LLM Observability, metrics, evals, prompt management, playground, datasets. Integrates with Llam… ☆6,598 Updated this week
- Parse files for optimal RAG ☆3,173 Updated last week