microsoft / LLMLingua

[EMNLP'23, ACL'24] To speed up LLM inference and enhance LLMs' perception of key information, LLMLingua compresses the prompt and KV-Cache, achieving up to 20x compression with minimal performance loss.

☆5,499

Alternatives and similar repositories for LLMLingua

Users who are interested in LLMLingua are comparing it to the libraries listed below.

Eladlev / AutoPrompt
A framework for prompt tuning using Intent-based Prompt Calibration
☆2,805 Updated 6 months ago
aurelio-labs / semantic-router
Superfast AI decision making and intelligent processing of multi-modal data.
☆2,861 Updated 3 weeks ago
MeetKai / functionary
Chat language model that can use tools and interpret the results
☆1,585 Updated last month
AnswerDotAI / RAGatouille
Easily use and train state-of-the-art late-interaction retrieval methods (ColBERT) in any RAG pipeline. Designed for modularity and ease-…
☆3,723 Updated 5 months ago
NVIDIA-NeMo / Guardrails
NeMo Guardrails is an open-source toolkit for easily adding programmable guardrails to LLM-based conversational systems.
☆5,159 Updated last week
lm-sys / RouteLLM
A framework for serving and evaluating LLM routers - save LLM costs without compromising quality
☆4,347 Updated last year
guardrails-ai / guardrails
Adding guardrails to large language models.
☆5,842 Updated this week
microsoft / promptbench
A unified evaluation framework for large language models
☆2,736 Updated 2 weeks ago
noamgat / lm-format-enforcer
Enforce the output format (JSON Schema, Regex etc) of a language model
☆1,942 Updated 2 months ago
AkariAsai / self-rag
This includes the original implementation of SELF-RAG: Learning to Retrieve, Generate and Critique through self-reflection by Akari Asai,…
☆2,225 Updated last year
predibase / lorax
Multi-LoRA inference server that scales to 1000s of fine-tuned LLMs
☆3,518 Updated 5 months ago
SqueezeAILab / LLMCompiler
[ICML 2024] LLMCompiler: An LLM Compiler for Parallel Function Calling
☆1,771 Updated last year
gkamradt / LLMTest_NeedleInAHaystack
Doing simple retrieval from LLM models at various context lengths to measure accuracy
☆2,056 Updated last year
arcee-ai / mergekit
Tools for merging pretrained large language models.
☆6,394 Updated last month
explodinggradients / ragas
Supercharge Your LLM Application Evaluations 🚀
☆11,136 Updated last week
ShishirPatil / gorilla
Gorilla: Training and Evaluating LLMs for Function Calls (Tool Calls)
☆12,500 Updated this week
langroid / langroid
Harness LLMs with Multi-Agent Programming
☆3,730 Updated last week
Unstructured-IO / unstructured
Convert documents to structured data effortlessly. Unstructured is an open-source ETL solution for transforming complex documents into clean…
☆12,980 Updated last week
dottxt-ai / outlines
Structured Outputs
☆12,739 Updated last week
zilliztech / GPTCache
Semantic cache for LLMs. Fully integrated with LangChain and llama_index.
☆7,800 Updated 3 months ago
argilla-io / distilabel
Distilabel is a framework for synthetic data and AI feedback for engineers who need fast, reliable and scalable pipelines based on verifi…
☆2,903 Updated this week
truera / trulens
Evaluation and Tracking for LLM Experiments and AI Agents
☆2,862 Updated this week
run-llama / llama_deploy
Deploy your agentic workflows to production
☆2,059 Updated last month
axolotl-ai-cloud / axolotl
Go ahead and axolotl questions
☆10,673 Updated this week
mistralai / mistral-finetune
☆3,031 Updated last year
IntelLabs / fastRAG
Efficient Retrieval Augmentation and Generation Framework
☆1,735 Updated 9 months ago
567-labs / instructor
Structured outputs for LLMs
☆11,686 Updated this week
mit-han-lab / streaming-llm
[ICLR 2024] Efficient Streaming Language Models with Attention Sinks
☆7,094 Updated last year
gabrielchua / RAGxplorer
Open-source tool to visualise your RAG 🔮
☆1,171 Updated 9 months ago
xlang-ai / instructor-embedding
[ACL 2023] One Embedder, Any Task: Instruction-Finetuned Text Embeddings
☆2,015 Updated 9 months ago