Autoscale LLM (vLLM, SGLang, LMDeploy) inferences on Kubernetes (and others)
☆283 · Nov 3, 2023 · Updated 2 years ago
Alternatives and similar repositories for openmodelz
Users that are interested in openmodelz are comparing it to the libraries listed below.
- Reproducible development environment for humans and agents ☆2,209 · May 21, 2026 · Updated 3 weeks ago
- OpenAI compatible API for LLMs and embeddings (LLaMA, Vicuna, ChatGLM and many others) ☆277 · Oct 11, 2023 · Updated 2 years ago
- A high-performance ML model serving framework, offers dynamic batching and CPU/GPU pipelines to fully exploit your compute machine ☆900 · Jun 1, 2026 · Updated last week
- This repository contains statistics about the AI Infrastructure products. ☆17 · Feb 27, 2025 · Updated last year
- With Dejavu, you can have a perfect memory by capturing and organizing your visual recordings efficiently. ☆133 · Sep 1, 2023 · Updated 2 years ago
- This is a landscape of the infrastructure that powers the generative AI ecosystem ☆157 · Oct 16, 2024 · Updated last year
- OpenDAL fsspec integration ☆37 · Jan 20, 2026 · Updated 4 months ago
- My bachelor's thesis at SJTU about https://github.com/caicloud/cyclone ☆12 · Jan 4, 2018 · Updated 8 years ago
- ☆19 · Apr 11, 2024 · Updated 2 years ago
- Kexplain is an interactive kubectl explain ☆12 · Oct 23, 2023 · Updated 2 years ago
- Docker for Your ML/DL Models Based on OCI Artifacts ☆473 · Jan 26, 2024 · Updated 2 years ago
- An awesome & curated list of best LLMOps tools for developers ☆5,829 · May 21, 2026 · Updated 3 weeks ago
- EvalGPT is a code interpreter framework that utilizes large language models to automate the process of code-writing and execution, deliv… ☆250 · Sep 17, 2023 · Updated 2 years ago
- Personal Blog in github.io ☆10 · Feb 25, 2026 · Updated 3 months ago
- Turn PostgreSQL into your search engine in a Pythonic way. ☆52 · Aug 29, 2025 · Updated 9 months ago
- Custom Scheduler to deploy ML models to TRTIS for GPU Sharing ☆12 · Apr 1, 2020 · Updated 6 years ago
- ☆144 · Dec 6, 2023 · Updated 2 years ago
- IBM Quantum Challenge Fall 2023 ☆10 · May 23, 2023 · Updated 3 years ago
- An Envoy inspired, ultimate LLM-first gateway for LLM serving and downstream application developers and enterprises ☆27 · Apr 24, 2025 · Updated last year
- Run, manage, and scale AI workloads on any AI infrastructure. Use one system to access & manage all AI compute (Kubernetes, Slurm, 20+ cl… ☆10,070 · Updated this week
- OpenAI compatible API for open source LLMs ☆17 · Oct 30, 2023 · Updated 2 years ago
- RayLLM - LLMs on Ray (Archived). Read README for more info. ☆1,267 · Mar 13, 2025 · Updated last year
- EpochFS is a versioned cloud file system with git-like branching, transaction support. ☆17 · Apr 23, 2026 · Updated last month
- Create informative READMEs effortlessly using AI-driven templates with the README Creator powered by Language Model (LLM). Simplify docum… ☆14 · Aug 11, 2023 · Updated 2 years ago
- Scalable, Low-latency and Hybrid-enabled Vector Search in Postgres. Revolutionize Vector Search, not Database. ☆2,172 · Feb 26, 2025 · Updated last year
- Benchmark results from code generation with LLMs ☆17 · Sep 1, 2023 · Updated 2 years ago
- Run your deep learning workloads on Kubernetes more easily and efficiently. ☆532 · Mar 4, 2024 · Updated 2 years ago
- A toolkit to run Ray applications on Kubernetes ☆2,534 · Jun 4, 2026 · Updated last week
- The inference code of RVC-Boss/GPT-SoVITS that can be developer-friendly. ☆16 · Sep 29, 2024 · Updated last year
- a fast cross platform AI inference engine using Rust and WebGPU ☆468 · Jan 4, 2025 · Updated last year
- An experimental tool to modify YAMLs without losing (most of) comment lines. ☆16 · Sep 25, 2022 · Updated 3 years ago
- AI-based search done right ☆20 · Dec 25, 2025 · Updated 5 months ago
- Generic prefix tree for golang ☆13 · Apr 25, 2025 · Updated last year
- A powerful prompt template engine built upon Jinja ☆12 · Oct 22, 2025 · Updated 7 months ago
- Model Deployment at Scale on Kubernetes ☆844 · May 30, 2026 · Updated 2 weeks ago
- Semantic cache for LLMs. Fully integrated with LangChain and llama_index. ☆8,056 · Jul 11, 2025 · Updated 11 months ago
- A tricky compiler generator ☆13 · Nov 11, 2015 · Updated 10 years ago
- PostgreSQL tokenizer extension for full-text search ☆45 · Sep 29, 2025 · Updated 8 months ago
- A diverse, simple, and secure all-in-one LLMOps platform ☆113 · Sep 21, 2024 · Updated last year