A simple service that integrates vLLM with Ray Serve for fast and scalable LLM serving.
☆79 · Apr 6, 2024 · Updated 2 years ago
Alternatives and similar repositories for ray_vllm_inference
Users that are interested in ray_vllm_inference are comparing it to the libraries listed below.
- ☆14 · May 25, 2023 · Updated 3 years ago
- ☆17 · Mar 24, 2023 · Updated 3 years ago
- A two part tutorial for Ray Core APIs and Ray Serve for Model Deployment ☆21 · Jun 9, 2022 · Updated 4 years ago
- RayLLM - LLMs on Ray (Archived). Read README for more info. ☆1,267 · Mar 13, 2025 · Updated last year
- A suite of hands-on training materials that show how to scale CV, NLP, and time-series forecasting workloads with Ray. ☆456 · Feb 13, 2024 · Updated 2 years ago
- LLM query engine to retrieve augmented responses from json files. ☆15 · Oct 12, 2023 · Updated 2 years ago
- [WIP] Transformer to embed Danbooru labelsets ☆13 · Mar 31, 2024 · Updated 2 years ago
- Pretrain, finetune and serve LLMs on Intel platforms with Ray ☆130 · Sep 23, 2025 · Updated 8 months ago
- Unofficial DomoAI API ☆27 · Jul 1, 2024 · Updated last year
- yolov3 prune ☆11 · Mar 31, 2020 · Updated 6 years ago
- Added functionality to the cml python package ☆14 · Apr 1, 2026 · Updated 2 months ago
- Codebase, data and models for the Headline Grouping paper at NAACL2021 ☆12 · Oct 2, 2022 · Updated 3 years ago
- A Rust crate offering similar functionality to the Python transformers package using Candle. ☆15 · Nov 19, 2024 · Updated last year
- A web interface for SleekDB written in PHP ☆11 · Jan 22, 2022 · Updated 4 years ago
- ☆19 · Jan 10, 2023 · Updated 3 years ago
- QLoRA: Efficient Finetuning of Quantized LLMs ☆11 · Jul 22, 2023 · Updated 2 years ago
- OpenTracing example ☆14 · Aug 19, 2024 · Updated last year
- Mixpost Installation with Docker Containers ☆14 · Mar 15, 2023 · Updated 3 years ago
- OS Signal Handlers in Go ☆11 · Jan 6, 2021 · Updated 5 years ago
- end-to-end information extraction pipeline built by LayoutLMV2, pretrained model from HuggingFace ☆11 · Aug 15, 2023 · Updated 2 years ago
- Explore Multiple Vector Databases and chat with documents on Multiple LLM models, private LLM models ☆48 · Jun 1, 2023 · Updated 3 years ago
- Research that compiles. ☆85 · Apr 19, 2026 · Updated last month
- automated insights for tabular data ☆10 · Feb 10, 2025 · Updated last year
- Github repo for Peifeng's internship project ☆13 · Nov 7, 2023 · Updated 2 years ago
- A demo and tutorial for Council that implements a financial analyst agent. ☆11 · Jun 21, 2024 · Updated last year
- ☆10 · Sep 5, 2024 · Updated last year
- My Gen AI research ☆11 · Jun 3, 2024 · Updated 2 years ago
- ☆91 · Oct 2, 2023 · Updated 2 years ago
- A simple website to manage your Hyper-V VMs and IIS sites ☆12 · Jan 19, 2023 · Updated 3 years ago
- A converter and basic tester for rwkv onnx ☆44 · Jan 29, 2024 · Updated 2 years ago
- ☆14 · Sep 18, 2024 · Updated last year
- ☆11 · May 22, 2021 · Updated 5 years ago
- [ICLR 25 Spotlight] A testbed for agents and environments that can automatically improve models through data generation. ☆28 · Mar 4, 2025 · Updated last year
- Using Siamese LSTM to classify repeated quora questions. Attempted pretrained bert embeddings, Word2Vec and training own embeddings toget… ☆10 · Aug 28, 2020 · Updated 5 years ago
- 4 bits quantization of LLaMa using GPTQ ☆12 · Jun 2, 2023 · Updated 3 years ago
- Benchmark suite for LLMs from Fireworks.ai ☆104 · Jun 6, 2026 · Updated last week
- Host LLM via text-generation-inference ☆16 · Dec 5, 2023 · Updated 2 years ago
- Ray - A curated list of resources: https://github.com/ray-project/ray ☆82 · Oct 21, 2025 · Updated 7 months ago
- This repo contains the code for the tutorial for using the CrewAI agent framework to generate Sales Reports based on Salesforce data ☆13 · Mar 16, 2024 · Updated 2 years ago