A simple service that integrates vLLM with Ray Serve for fast and scalable LLM serving.
☆79Apr 6, 2024Updated 2 years ago
Alternatives and similar repositories for ray_vllm_inference
Users that are interested in ray_vllm_inference are comparing it to the libraries listed below.
- ☆14May 25, 2023Updated 3 years ago
- RayLLM - LLMs on Ray (Archived). Read README for more info.☆1,263Mar 13, 2025Updated last year
- A suite of hands-on training materials that show how to scale CV, NLP, and time-series forecasting workloads with Ray.☆458Feb 13, 2024Updated 2 years ago
- LLM query engine to retrieve augmented responses from JSON files.☆15Oct 12, 2023Updated 2 years ago
- Pretrain, finetune and serve LLMs on Intel platforms with Ray☆130Sep 23, 2025Updated 9 months ago
- FlexOS is a Unikraft-based OS allowing users to easily specialize the safety and isolation strategy at compilation time.☆24Jun 2, 2023Updated 3 years ago
- Keyword extraction using Scake, KeyBERT, Fine-tuning Transformer BERT-like models and ChatGPT.☆12May 22, 2023Updated 3 years ago
- Karmada APIs☆15Mar 10, 2026Updated 3 months ago
- Reading comprehension based question-answering model for news articles.☆11Jun 22, 2022Updated 4 years ago
- A web interface for SleekDB written in PHP☆11Jan 22, 2022Updated 4 years ago
- ☆19Jan 10, 2023Updated 3 years ago
- Ask Poddy: Run Open Source LLMs and Embeddings as OpenAI-Compatible Serverless Endpoints (Tutorial)☆11Jul 19, 2024Updated last year
- Mixpost Installation with Docker Containers☆15Mar 15, 2023Updated 3 years ago
- End-to-end information extraction pipeline built with LayoutLMv2, a pretrained model from Hugging Face☆11Aug 15, 2023Updated 2 years ago
- Open source RAG with LlamaIndex for Japanese LLMs in a low-resource setting☆10May 12, 2025Updated last year
- Explore Multiple Vector Databases and chat with documents on Multiple LLM models, private LLM models☆48Jun 1, 2023Updated 3 years ago
- Newspaper Segmentation into images and text☆12Jan 11, 2019Updated 7 years ago
- My Gen AI research☆11Jun 3, 2024Updated 2 years ago
- ☆12Jan 20, 2023Updated 3 years ago
- A simple website to manage your Hyper-V VMs and IIS sites☆12Jan 19, 2023Updated 3 years ago
- A converter and basic tester for RWKV ONNX☆44Jan 29, 2024Updated 2 years ago
- ☆14Sep 18, 2024Updated last year
- An offline CPU-first low-resource chat application to perform RAG on your corpus of data. Powered by OpenChat and CTranslate2.☆15May 14, 2025Updated last year
- ☆11May 22, 2021Updated 5 years ago
- [ICLR 25 Spotlight] A testbed for agents and environments that can automatically improve models through data generation.☆28Mar 4, 2025Updated last year
- Using a Siamese LSTM to classify repeated Quora questions. Attempted pretrained BERT embeddings, Word2Vec and training own embeddings toget…☆10Aug 28, 2020Updated 5 years ago
- Injector trait as a webhook to inject data into Workload.☆15Apr 14, 2021Updated 5 years ago
- ☆17Feb 7, 2026Updated 4 months ago
- Expose a Qubes VM port to the public interfaces of the sys-net VM.☆13Jun 24, 2018Updated 8 years ago
- Benchmark suite for LLMs from Fireworks.ai☆107Jun 26, 2026Updated last week
- ☆17May 5, 2022Updated 4 years ago
- A minimal Helm starter☆15Apr 5, 2017Updated 9 years ago
- Host LLM via text-generation-inference☆16Dec 5, 2023Updated 2 years ago
- ☆28Jul 29, 2025Updated 11 months ago
- 3D Mesh Generation from 2D Images in Python☆13Feb 12, 2024Updated 2 years ago
- Triton backend for https://github.com/OpenNMT/CTranslate2☆11Aug 20, 2024Updated last year
- Finetuning a codegen model on a Python instruction set using the QLoRA technique for better efficacy☆11Aug 31, 2023Updated 2 years ago
- UNSUPPORTED A tool to convert and play GameMaker games in the browser☆23May 6, 2013Updated 13 years ago
- A vLLM proxy server that adds security and multi-model management for vLLM servers☆12May 30, 2024Updated 2 years ago