Lightning-AI / LitServe
Build custom inference engines for models, agents, multi-modal systems, RAG, pipelines and more.
☆3,711Updated last week
Alternatives and similar repositories for LitServe
Users who are interested in LitServe are comparing it to the libraries listed below.
- AdalFlow: The library to build & auto-optimize LLM applications.☆3,873Updated last month
- Fast State-of-the-Art Static Embeddings☆1,900Updated last week
- Multi-LoRA inference server that scales to 1000s of fine-tuned LLMs☆3,533Updated 6 months ago
- RAG (Retrieval Augmented Generation) Framework for building modular, open source applications for production by TrueFoundry☆4,284Updated 2 months ago
- NeMo Retriever extraction is a scalable, performance-oriented document content and metadata extraction microservice. NeMo Retriever extra…☆2,760Updated this week
- Knowledge Agents and Management in the Cloud☆4,204Updated this week
- A system for agentic LLM-powered data processing and ETL☆3,065Updated this week
- A toolkit to create optimal Production-ready Retrieval Augmented Generation (RAG) setup for your data☆1,515Updated 6 months ago
- Easily use and train state of the art late-interaction retrieval methods (ColBERT) in any RAG pipeline. Designed for modularity and ease-…☆3,761Updated 6 months ago
- Fast, Accurate, Lightweight Python library to make State of the Art Embedding☆2,506Updated this week
- The python library for real-time communication☆4,403Updated 2 months ago
- Run PyTorch LLMs locally on servers, desktop and mobile☆3,617Updated 2 months ago
- The code used to train and run inference with the ColVision models, e.g. ColPali, ColQwen2, and ColSmol.☆2,321Updated this week
- A lightweight, low-dependency, unified API to use all common reranking and cross-encoder models.☆1,572Updated 5 months ago
- Recipes for shrinking, optimizing, customizing cutting edge vision models. 💜☆1,787Updated 3 weeks ago
- Infinity is a high-throughput, low-latency serving engine for text-embeddings, reranking models, clip, clap and colpali☆2,549Updated last week
- ETL, Analytics, Versioning for Unstructured Data☆2,697Updated last week
- Streamline the fine-tuning process for multimodal models: PaliGemma 2, Florence-2, and Qwen2.5-VL☆2,640Updated this week
- Distilabel is a framework for synthetic data and AI feedback for engineers who need fast, reliable and scalable pipelines based on verifi…☆2,932Updated this week
- ☆2,069Updated last week
- ☆3,038Updated last year
- Deploy your agentic workflows to production☆2,061Updated 2 months ago
- ExtractThinker is a Document Intelligence library for LLMs, offering ORM-style interaction for flexible and powerful document workflows.☆1,455Updated 2 months ago
- LLM abstractions that aren't obstructions☆1,295Updated last week
- 🤖 MLE-Agent: Your intelligent companion for seamless AI engineering and research. 🔍 Integrate with arxiv and paper with code to provide…☆1,403Updated 3 months ago
- Use late-interaction multi-modal models such as ColPali in just a few lines of code.☆828Updated 9 months ago
- A fast multimodal LLM for real-time voice☆4,258Updated 2 months ago
- dstack is an open-source control plane for running development, training, and inference jobs on GPUs—across hyperscalers, neoclouds, or o…☆1,955Updated last week
- Kickstart your LLMOps initiative with a flexible, robust, and productive Python package.☆885Updated 9 months ago
- PyTorch compiler that accelerates training and inference. Get built-in optimizations for performance, memory, parallelism, and easily wri…☆1,421Updated this week