asprenger / ray_vllm_inference
A simple service that integrates vLLM with Ray Serve for fast and scalable LLM serving.
☆60 · Updated 9 months ago
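For context, a minimal sketch of the pattern this project implements: wrapping a vLLM engine in a Ray Serve deployment so generation can be scaled across replicas and served over HTTP. The class name, model, and request format below are illustrative assumptions, not the repository's actual code.

```python
# Hedged sketch: a Ray Serve deployment wrapping a vLLM engine.
# Model name, GPU count, and JSON schema are placeholders.
from ray import serve
from starlette.requests import Request
from vllm import LLM, SamplingParams


@serve.deployment(ray_actor_options={"num_gpus": 1})
class VLLMDeployment:
    def __init__(self, model: str = "facebook/opt-125m"):
        # Load the model once per replica; replicas scale independently.
        self.llm = LLM(model=model)

    async def __call__(self, request: Request) -> dict:
        body = await request.json()
        params = SamplingParams(max_tokens=body.get("max_tokens", 128))
        outputs = self.llm.generate([body["prompt"]], params)
        return {"text": outputs[0].outputs[0].text}


app = VLLMDeployment.bind()
# Launch with: serve run my_module:app
# then POST {"prompt": "...", "max_tokens": 64} to the Serve HTTP endpoint.
```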
Alternatives and similar repositories for ray_vllm_inference:
Users interested in ray_vllm_inference are comparing it to the libraries listed below.
- Pretrain, fine-tune, and serve LLMs on Intel platforms with Ray ☆108 · Updated 2 months ago
- OpenAI-compatible API for the TensorRT-LLM Triton backend (see the client sketch after this list) ☆186 · Updated 5 months ago
- Evaluate and enhance your LLM deployments for real-world inference needs ☆183 · Updated last month
- A collection of available inference solutions for LLMs
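As referenced in the TensorRT-LLM entry above, any OpenAI-compatible server can be queried with the standard `openai` client by pointing `base_url` at the local endpoint. The host, port, model name, and API key below are placeholders, not values taken from that project.

```python
# Hedged sketch: calling an OpenAI-compatible serving endpoint.
from openai import OpenAI

# The server URL and model name are assumptions for illustration.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="not-needed")

resp = client.chat.completions.create(
    model="my-model",
    messages=[{"role": "user", "content": "Hello!"}],
)
print(resp.choices[0].message.content)
```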