Lightning-AI / LitServe
Lightning-fast serving engine for any AI model of any size. Flexible. Easy. Enterprise-scale.
★2,705 · Updated this week
Alternatives and similar repositories for LitServe:
Users who are interested in LitServe are comparing it to the libraries listed below:
- Run PyTorch LLMs locally on servers, desktop and mobile ★3,462 · Updated this week
- Parse files for optimal RAG ★3,526 · Updated last week
- 🦛 CHONK your texts with Chonkie ✨ - The no-nonsense RAG chunking library ★2,249 · Updated this week
- RAG (Retrieval Augmented Generation) Framework for building modular, open source applications for production by TrueFoundry ★3,518 · Updated 2 weeks ago
- NVIDIA Ingest is an early access set of microservices for parsing hundreds of thousands of complex, messy unstructured PDFs and other ent… ★2,022 · Updated this week
- Agent Framework / shim to use Pydantic with LLMs ★5,346 · Updated this week
- Fast, Accurate, Lightweight Python library to make State of the Art Embedding ★1,671 · Updated this week
- ★2,802 · Updated 4 months ago
- From RAG chatbots to code assistants to complex agentic pipelines and beyond, build LLM systems that run better, faster, and cheaper with… ★4,430 · Updated this week
- Easily use and train state of the art late-interaction retrieval methods (ColBERT) in any RAG pipeline. Designed for modularity and ease-… ★3,172 · Updated 4 months ago
- Recipes for shrinking, optimizing, customizing cutting edge vision models. ★1,093 · Updated 3 weeks ago
- AdalFlow: The library to build & auto-optimize LLM applications. ★2,474 · Updated this week
- The easiest way to use Agentic RAG in any enterprise ★3,972 · Updated 2 weeks ago
- The code used to train and run inference with the ColPali architecture. ★1,386 · Updated this week
- Infinity is a high-throughput, low-latency serving engine for text-embeddings, reranking models, clip, clap and colpali ★1,714 · Updated this week
- A framework for serving and evaluating LLM routers - save LLM costs without compromising quality ★3,442 · Updated 5 months ago
- Composable building blocks to build Llama Apps ★6,036 · Updated this week
- A system for agentic LLM-powered data processing and ETL ★1,514 · Updated this week
- 🤗 smolagents: a barebones library for agents. Agents write python code to call tools and orchestrate other agents. ★5,197 · Updated this week
- Flexible and powerful framework for managing multiple AI agents and handling complex conversations ★3,835 · Updated this week
- Cohere Toolkit is a collection of prebuilt components enabling users to quickly build and deploy RAG applications. ★2,908 · Updated this week
- A fast multimodal LLM for real-time voice ★2,760 · Updated this week
- PyTorch native post-training library ★4,703 · Updated this week
- A lightweight, low-dependency, unified API to use all common reranking and cross-encoder models. ★1,238 · Updated last month
- Everything about the SmolLM & SmolLM2 family of models ★1,554 · Updated last week
- A PyTorch native library for large model training ★3,091 · Updated this week
- Multi-LoRA inference server that scales to 1000s of fine-tuned LLMs ★2,311 · Updated this week
- File Parser optimised for LLM Ingestion with no loss 🧠 Parse PDFs, Docx, PPTx in a format that is ideal for LLMs. ★4,966 · Updated this week
- A blazing fast inference solution for text embeddings models ★3,043 · Updated last week
- 🤖 MLE-Agent: Your intelligent companion for seamless AI engineering and research. Integrate with arxiv and paper with code to provide… ★1,192 · Updated last week