bentoml / OpenLLM
Run any open-source LLM, such as DeepSeek or Llama, as an OpenAI-compatible API endpoint in the cloud.
☆11,965 Updated last week
Alternatives and similar repositories for OpenLLM
Users who are interested in OpenLLM are comparing it to the libraries listed below.
- Open-source search and retrieval database for AI applications. ☆24,734 Updated this week
- QLoRA: Efficient Finetuning of Quantized LLMs ☆10,778 Updated last year
- Large Language Model Text Generation Inference ☆10,684 Updated 2 weeks ago
- Semantic cache for LLMs. Fully integrated with LangChain and llama_index. ☆7,858 Updated 4 months ago
- OpenLLaMA, a permissively licensed open source reproduction of Meta AI’s LLaMA 7B trained on the RedPajama dataset ☆7,528 Updated 2 years ago
- A guidance language for controlling large language models. ☆20,971 Updated 2 weeks ago
- [ICLR 2024] Efficient Streaming Language Models with Attention Sinks ☆7,142 Updated last year
- Gorilla: Training and Evaluating LLMs for Function Calls (Tool Calls) ☆12,592 Updated last week
- Python bindings for llama.cpp ☆9,800 Updated 3 months ago
- An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena. ☆39,290 Updated 6 months ago
- Universal LLM Deployment Engine with ML Compilation ☆21,691 Updated last week
- LlamaIndex is the leading framework for building LLM-powered agents over your data. ☆45,661 Updated this week
- LLMs built upon Evol-Instruct: WizardLM, WizardCoder, WizardMath ☆9,469 Updated 6 months ago
- 🌸 Run LLMs at home, BitTorrent-style. Fine-tuning and inference up to 10x faster than offloading ☆9,843 Updated last year
- Run, manage, and scale AI workloads on any AI infrastructure. Use one system to access & manage all AI compute (Kubernetes, 20+ clouds, o… ☆9,046 Updated this week
- Welcome to the Llama Cookbook! This is your go-to guide for Building with Llama: Getting started with Inference, Fine-Tuning, RAG. We als… ☆18,067 Updated last month
- Go ahead and axolotl questions ☆10,911 Updated this week
- [EMNLP'23, ACL'24] To speed up LLMs' inference and enhance LLMs' perception of key information, compress the prompt and KV-Cache, which ach… ☆5,660 Updated last month
- Structured Outputs ☆13,025 Updated this week
- High-speed Large Language Model Serving for Local Deployment ☆8,434 Updated 4 months ago
- Open source codebase powering the HuggingChat app ☆10,321 Updated this week
- Python SDK, Proxy Server (AI Gateway) to call 100+ LLM APIs in OpenAI (or native) format, with cost tracking, guardrails, loadbalancing a… ☆32,131 Updated this week
- A language for constraint-guided and efficient LLM programming. ☆4,091 Updated 6 months ago
- Implementation of the LLaMA language model based on nanoGPT. Supports flash attention, Int8 and GPTQ 4bit quantization, LoRA and LLaMA-Ad… ☆6,087 Updated 5 months ago
- Developer-friendly OSS embedded retrieval library for multimodal AI. Search More; Manage Less. ☆8,161 Updated this week
- High-performance In-browser LLM Inference Engine ☆16,885 Updated 2 weeks ago
- A high-throughput and memory-efficient inference and serving engine for LLMs ☆64,758 Updated this week
- 20+ high-performance LLMs with recipes to pretrain, finetune and deploy at scale. ☆13,000 Updated this week
- Letta is the platform for building stateful agents: open AI with advanced memory that can learn and self-improve over time. ☆19,380 Updated last week
- Instruct-tune LLaMA on consumer hardware ☆18,983 Updated last year