microsoft / LLMLingua
Speeds up LLM inference and enhances LLMs' perception of key information by compressing the prompt and KV-Cache, achieving up to 20x compression with minimal performance loss.
☆4,435 Updated 3 weeks ago
Related projects:
- Go ahead and axolotl questions ☆7,554 Updated this week
- Evaluation framework for your Retrieval Augmented Generation (RAG) pipelines ☆6,560 Updated this week
- Tools for merging pretrained large language models. ☆4,501 Updated this week
- Build resilient language agents as graphs. ☆5,662 Updated this week
- A framework for serving and evaluating LLM routers - save LLM costs without compromising quality! ☆2,884 Updated last month
- SGLang is a fast serving framework for large language models and vision language models. ☆5,121 Updated this week
- Structured Text Generation ☆8,241 Updated this week
- Open source libraries and APIs to build custom preprocessing pipelines for labeling, training, or production machine learning pipelines. ☆8,446 Updated this week
- Build Conversational AI in minutes ⚡️ ☆6,762 Updated this week
- Python SDK, Proxy Server to call 100+ LLM APIs using the OpenAI format - [Bedrock, Azure, OpenAI, VertexAI, Cohere, Anthropic, Sagemaker,… ☆12,231 Updated this week
- structured outputs for llms ☆7,529 Updated this week
- [COLM 2024] OpenAgents: An Open Platform for Language Agents in the Wild ☆3,900 Updated 2 months ago
- Easily use and train state of the art late-interaction retrieval methods (ColBERT) in any RAG pipeline. Designed for modularity and ease-… ☆2,817 Updated 2 weeks ago
- Large Language Model Text Generation Inference ☆8,762 Updated this week
- 🪢 Open source LLM engineering platform: LLM Observability, metrics, evals, prompt management, playground, datasets. Integrates with Llam… ☆5,596 Updated this week
- Adding guardrails to large language models. ☆3,873 Updated this week
- NeMo Guardrails is an open-source toolkit for easily adding programmable guardrails to LLM-based conversational systems. ☆3,978 Updated this week
- ☆3,912 Updated 5 months ago
- A library of data loaders for LLMs made by the community -- to be used with LlamaIndex and/or LangChain ☆3,437 Updated 6 months ago
- [ICLR 2024] Efficient Streaming Language Models with Attention Sinks ☆6,537 Updated 2 months ago
- An Open-source Framework for Data-centric, Self-evolving Autonomous Language Agents ☆5,162 Updated last week
- Robust recipes to align language models with human and AI preferences ☆4,481 Updated 3 weeks ago
- Test your prompts, agents, and RAGs. Red teaming, pentesting, and vulnerability scanning for LLMs. Compare performance of GPT, Claude, Ge… ☆4,170 Updated this week
- Semantic cache for LLMs. Fully integrated with LangChain and llama_index. ☆7,085 Updated last week
- DSPy: The framework for programming—not prompting—foundation models ☆16,773 Updated this week
- Official implementation for the paper: "Code Generation with AlphaCodium: From Prompt Engineering to Flow Engineering" ☆3,393 Updated last month
- A blazing fast inference solution for text embeddings models ☆2,599 Updated this week
- A code-first agent framework for seamlessly planning and executing data analytics tasks. ☆5,194 Updated this week
- Create LLM agents with long-term memory and custom tools 📚🦙 ☆11,378 Updated this week
- 20+ high-performance LLMs with recipes to pretrain, finetune and deploy at scale. ☆9,780 Updated this week