A simple service that integrates vLLM with Ray Serve for fast and scalable LLM serving.
☆77 · Apr 6, 2024 · Updated last year
Alternatives and similar repositories for ray_vllm_inference
Users that are interested in ray_vllm_inference are comparing it to the libraries listed below.
- RLVR Testing and Training ☆23 · Aug 28, 2025 · Updated 6 months ago
- RayLLM - LLMs on Ray (Archived). Read README for more info. ☆1,266 · Mar 13, 2025 · Updated last year
- MLFlow Deployment Plugin for Ray Serve ☆47 · Apr 12, 2022 · Updated 3 years ago
- A suite of hands-on training materials that show how to scale CV, NLP, and time-series forecasting workloads with Ray. ☆455 · Feb 13, 2024 · Updated 2 years ago
- Deprecated Browserbase Python SDK ☆10 · Nov 1, 2024 · Updated last year
- [WIP] Transformer to embed Danbooru labelsets ☆13 · Mar 31, 2024 · Updated last year
- Pretrain, finetune and serve LLMs on Intel platforms with Ray ☆130 · Sep 23, 2025 · Updated 6 months ago
- Unofficial DomoAI API ☆27 · Jul 1, 2024 · Updated last year
- Karmada APIs ☆15 · Mar 10, 2026 · Updated last week
- Codebase, data and models for the Headline Grouping paper at NAACL2021 ☆12 · Oct 2, 2022 · Updated 3 years ago
- A Rust crate offering similar functionality to the Python transformers package using Candle. ☆14 · Nov 19, 2024 · Updated last year
- ☆17 · May 5, 2022 · Updated 3 years ago
- A web interface for SleekDB written in PHP ☆11 · Jan 22, 2022 · Updated 4 years ago
- ☆19 · Jan 10, 2023 · Updated 3 years ago
- Ask Poddy: Run Open Source LLMs and Embeddings as OpenAI-Compatible Serverless Endpoints (Tutorial) ☆11 · Jul 19, 2024 · Updated last year
- QLoRA: Efficient Finetuning of Quantized LLMs ☆11 · Jul 22, 2023 · Updated 2 years ago
- Mixpost Installation with Docker Containers ☆14 · Mar 15, 2023 · Updated 3 years ago
- Build modern UIs in Jupyter with Python ☆12 · Dec 28, 2022 · Updated 3 years ago
- Open source RAG with Llama Index for Japanese LLM in low resource setting ☆10 · May 12, 2025 · Updated 10 months ago
- Explore Multiple Vector Databases and chat with documents on Multiple LLM models, private LLM models ☆48 · Jun 1, 2023 · Updated 2 years ago
- nvidia-smi xml to json ☆15 · May 29, 2024 · Updated last year
- DEPRECATED: Use provider-jet-equinix ☆16 · Feb 16, 2024 · Updated 2 years ago
- Blogging with Emacs and AI ☆11 · Jun 4, 2023 · Updated 2 years ago
- Newspaper Segmentation into images and text ☆12 · Jan 11, 2019 · Updated 7 years ago
- xVerify: Efficient Answer Verifier for Reasoning Model Evaluations ☆145 · Nov 13, 2025 · Updated 4 months ago
- ☆10 · Sep 5, 2024 · Updated last year
- My Gen AI research ☆11 · Jun 3, 2024 · Updated last year
- A simple website to manage your Hyper-V VMs and IIS sites ☆12 · Jan 19, 2023 · Updated 3 years ago
- Using Siamese LSTM to classify repeated Quora questions. Attempted pretrained BERT embeddings, Word2Vec and training own embeddings toget… ☆10 · Aug 28, 2020 · Updated 5 years ago
- A testbed for agents and environments that can automatically improve models through data generation. ☆28 · Mar 4, 2025 · Updated last year
- ☆12 · Jul 10, 2023 · Updated 2 years ago
- ☆18 · Feb 7, 2026 · Updated last month
- Agentkube - Run Kubernetes Like Never Before ☆37 · Mar 1, 2026 · Updated 3 weeks ago
- ☆28 · Jul 29, 2025 · Updated 7 months ago
- Ray - A curated list of resources: https://github.com/ray-project/ray ☆80 · Oct 21, 2025 · Updated 5 months ago
- ☆30 · Dec 12, 2025 · Updated 3 months ago
- ☆16 · Mar 23, 2023 · Updated 3 years ago
- Finetuning a codegen model with a Python instruction set using the QLoRA technique for better efficacy ☆11 · Aug 31, 2023 · Updated 2 years ago
- A vLLM proxy server to add security and multi-model management for vLLM servers ☆12 · May 30, 2024 · Updated last year