Lightning-AI / LitServe
Deploy high-performance AI models and inference pipelines on FastAPI with built-in batching, streaming and more.
☆3,091 Updated this week
Alternatives and similar repositories for LitServe:
Users interested in LitServe are comparing it to the libraries listed below.
- Run PyTorch LLMs locally on servers, desktop and mobile ☆3,576 Updated this week
- Easily use and train state of the art late-interaction retrieval methods (ColBERT) in any RAG pipeline. Designed for modularity and ease-… ☆3,426 Updated 2 months ago
- Multi-LoRA inference server that scales to 1000s of fine-tuned LLMs ☆2,965 Updated this week
- AdalFlow: The library to build & auto-optimize LLM applications. ☆2,971 Updated last month
- Cohere Toolkit is a collection of prebuilt components enabling users to quickly build and deploy RAG applications. ☆3,038 Updated this week
- PyTorch native post-training library ☆5,154 Updated this week
- RAG (Retrieval Augmented Generation) Framework for building modular, open source applications for production by TrueFoundry ☆4,036 Updated 2 months ago
- Fast, Accurate, Lightweight Python library to make State of the Art Embedding ☆2,013 Updated 3 weeks ago
- The python library for real-time communication ☆3,824 Updated last week
- ☆2,928 Updated 7 months ago
- A PyTorch native library for large-scale model training ☆3,665 Updated this week
- A blazing fast inference solution for text embeddings models ☆3,505 Updated last week
- Deploy your agentic workflows to production ☆2,002 Updated last week
- Distilabel is a framework for synthetic data and AI feedback for engineers who need fast, reliable and scalable pipelines based on verifi… ☆2,671 Updated last week
- NVIDIA Ingest is an early access set of microservices for parsing hundreds of thousands of complex, messy unstructured PDFs and other ent… ☆2,658 Updated this week
- Thunder gives you PyTorch models superpowers for training and inference. Unlock out-of-the-box optimizations for performance, memory and … ☆1,337 Updated this week
- The easiest way to use Agentic RAG in any enterprise ☆4,210 Updated 3 months ago
- PyTorch native quantization and sparsity for training and inference ☆2,015 Updated this week
- NanoGPT (124M) in 3 minutes ☆2,520 Updated last week
- Fast State-of-the-Art Static Embeddings ☆1,563 Updated this week
- Argilla is a collaboration tool for AI engineers and domain experts to build high-quality datasets ☆4,474 Updated last week
- File Parser optimised for LLM Ingestion with no loss 🧠 Parse PDFs, Docx, PPTx in a format that is ideal for LLMs. ☆6,377 Updated 2 months ago
- Freeing data processing from scripting madness by providing a set of platform-agnostic customizable pipeline processing blocks. ☆2,366 Updated this week
- ☆1,656 Updated this week
- Retrieval Augmented Generation (RAG) chatbot powered by Weaviate ☆7,076 Updated last month
- Composable building blocks to build Llama Apps ☆7,748 Updated this week
- No-code LLM Platform to launch APIs and ETL Pipelines to structure unstructured documents ☆5,176 Updated this week
- Knowledge Agents and Management in the Cloud ☆3,934 Updated this week
- Recipes for shrinking, optimizing, customizing cutting edge vision models. 💜 ☆1,417 Updated last month
- Infinity is a high-throughput, low-latency serving engine for text-embeddings, reranking models, clip, clap and colpali ☆2,131 Updated 2 weeks ago