Lightning-AI / LitServe
Lightning-fast serving engine for any AI model of any size. Flexible. Easy. Enterprise-scale.
☆2,843 · Updated this week
Alternatives and similar repositories for LitServe:
Users who are interested in LitServe are comparing it to the libraries listed below:
- RAG (Retrieval Augmented Generation) Framework for building modular, open source applications for production by TrueFoundry ☆3,607 · Updated last week
- Run PyTorch LLMs locally on servers, desktop and mobile ☆3,506 · Updated this week
- NVIDIA Ingest is an early access set of microservices for parsing hundreds of thousands of complex, messy unstructured PDFs and other ent… ☆2,533 · Updated this week
- No-code LLM Platform to launch APIs and ETL Pipelines to structure unstructured documents ☆4,431 · Updated this week
- Debug, evaluate, and monitor your LLM applications, RAG systems, and agentic workflows with comprehensive tracing, automated evaluations,… ☆5,081 · Updated this week
- A framework for serving and evaluating LLM routers - save LLM costs without compromising quality ☆3,635 · Updated 6 months ago
- Fast, Accurate, Lightweight Python library to make State of the Art Embedding ☆1,782 · Updated this week
- Easily use and train state of the art late-interaction retrieval methods (ColBERT) in any RAG pipeline. Designed for modularity and ease-… ☆3,266 · Updated last week
- A fast multimodal LLM for real-time voice ☆3,589 · Updated last week
- ☆2,852 · Updated 5 months ago
- PyTorch native post-training library ☆4,856 · Updated this week
- The easiest way to use Agentic RAG in any enterprise ☆4,095 · Updated 3 weeks ago
- Agent Framework / shim to use Pydantic with LLMs ☆6,477 · Updated this week
- 🤖 MLE-Agent: Your intelligent companion for seamless AI engineering and research. 🔍 Integrate with arxiv and paper with code to provide… ☆1,228 · Updated 2 weeks ago
- Fast State-of-the-Art Static Embeddings ☆1,060 · Updated this week
- Build and query dynamic, temporally-aware Knowledge Graphs ☆1,915 · Updated last week
- ETL, Analytics, Versioning for Unstructured Data ☆2,357 · Updated this week
- Infinity is a high-throughput, low-latency serving engine for text-embeddings, reranking models, clip, clap and colpali ☆1,816 · Updated last week
- Distilabel is a framework for synthetic data and AI feedback for engineers who need fast, reliable and scalable pipelines based on verifi… ☆2,448 · Updated this week
- AdalFlow: The library to build & auto-optimize LLM applications. ☆2,746 · Updated this week
- A system for agentic LLM-powered data processing and ETL ☆1,669 · Updated this week
- 🦛 CHONK your texts with Chonkie ✨ - The no-nonsense RAG chunking library ☆2,599 · Updated this week
- The official Python SDK for Model Context Protocol servers and clients ☆1,888 · Updated this week
- Knowledge Agents and Management in the Cloud ☆3,707 · Updated this week
- RAG that intelligently adapts to your use case, data, and queries ☆2,924 · Updated last week
- A toolkit to create optimal Production-ready Retrieval Augmented Generation (RAG) setup for your data ☆1,383 · Updated 2 weeks ago
- A lightweight, low-dependency, unified API to use all common reranking and cross-encoder models. ☆1,295 · Updated last week
- Retrieval Augmented Generation (RAG) chatbot powered by Weaviate ☆6,826 · Updated 2 weeks ago
- Deploy your agentic workflows to production ☆1,964 · Updated this week
- Build real-time multimodal AI applications 🤖🎙️📹 ☆5,110 · Updated this week