Lightning-AI / LitServe
Deploy high-performance AI models and inference pipelines on FastAPI with built-in batching, streaming and more.
☆3,099Updated this week
Alternatives and similar repositories for LitServe:
Users who are interested in LitServe are comparing it to the libraries listed below:
- Easily use and train state of the art late-interaction retrieval methods (ColBERT) in any RAG pipeline. Designed for modularity and ease-…☆3,436Updated 3 months ago
- Run PyTorch LLMs locally on servers, desktop and mobile☆3,580Updated this week
- AdalFlow: The library to build & auto-optimize LLM applications.☆3,015Updated last month
- Knowledge Agents and Management in the Cloud☆3,961Updated this week
- RAG (Retrieval Augmented Generation) Framework for building modular, open source applications for production by TrueFoundry☆4,045Updated 2 months ago
- Fast, Accurate, Lightweight Python library to make State of the Art Embedding☆2,032Updated 3 weeks ago
- A blazing fast inference solution for text embeddings models☆3,520Updated this week
- Blazingly fast LLM inference.☆5,568Updated this week
- A system for agentic LLM-powered data processing and ETL☆1,937Updated this week
- NVIDIA Ingest is an early access set of microservices for parsing hundreds of thousands of complex, messy unstructured PDFs and other ent…☆2,660Updated this week
- Multi-LoRA inference server that scales to 1000s of fine-tuned LLMs☆2,972Updated this week
- A lightweight, low-dependency, unified API to use all common reranking and cross-encoder models.☆1,408Updated this week
- Deploy your agentic workflows to production☆2,002Updated this week
- Argilla is a collaboration tool for AI engineers and domain experts to build high-quality datasets☆4,486Updated this week
- Distilabel is a framework for synthetic data and AI feedback for engineers who need fast, reliable and scalable pipelines based on verifi…☆2,681Updated this week
- ☆1,666Updated last week
- Fast State-of-the-Art Static Embeddings☆1,589Updated this week
- A toolkit to create optimal Production-ready Retrieval Augmented Generation (RAG) setup for your data☆1,408Updated 2 weeks ago
- TextGrad: Automatic "Differentiation" via Text -- using large language models to backpropagate textual gradients.☆2,505Updated last month
- A framework for serving and evaluating LLM routers - save LLM costs without compromising quality☆3,906Updated 9 months ago
- Superfast AI decision making and intelligent processing of multi-modal data.☆2,581Updated this week
- Cohere Toolkit is a collection of prebuilt components enabling users to quickly build and deploy RAG applications.☆3,042Updated last week
- ETL, Analytics, Versioning for Unstructured Data☆2,543Updated this week
- Everything about the SmolLM2 and SmolVLM family of models☆2,287Updated last month
- ☆2,933Updated 7 months ago
- Freeing data processing from scripting madness by providing a set of platform-agnostic customizable pipeline processing blocks.☆2,373Updated this week
- PyTorch native post-training library☆5,171Updated this week
- 20+ high-performance LLMs with recipes to pretrain, finetune and deploy at scale.☆12,076Updated this week
- Tools for merging pretrained large language models.☆5,628Updated last week
- Thunder gives you PyTorch models superpowers for training and inference. Unlock out-of-the-box optimizations for performance, memory and …☆1,340Updated this week