asprenger / ray_vllm_inference
A simple service that integrates vLLM with Ray Serve for fast and scalable LLM serving.
☆67 · Updated last year
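For context, here is a minimal sketch of the kind of integration the repo describes: wrapping a vLLM engine inside a Ray Serve deployment so the model is served over HTTP with Ray handling replicas and GPU placement. This is illustrative only, assuming Ray 2.x and vLLM's synchronous `LLM` API; the class name, model, and request schema are hypothetical and not taken from the repo's actual code (a production service would typically use vLLM's async engine for streaming).

```python
from ray import serve
from starlette.requests import Request
from vllm import LLM, SamplingParams


@serve.deployment(ray_actor_options={"num_gpus": 1})
class VLLMDeployment:
    """Hypothetical Ray Serve deployment wrapping a vLLM engine."""

    def __init__(self, model: str):
        # Load the model once per replica; vLLM batches requests on the GPU.
        self.llm = LLM(model=model)

    async def __call__(self, request: Request) -> dict:
        # Expected body (assumed schema): {"prompt": "...", "max_tokens": 128}
        body = await request.json()
        params = SamplingParams(max_tokens=body.get("max_tokens", 128))
        # Note: llm.generate() is blocking; fine for a sketch, not for production.
        outputs = self.llm.generate([body["prompt"]], params)
        return {"text": outputs[0].outputs[0].text}


# Bind constructor args and start the HTTP endpoint (default: localhost:8000):
app = VLLMDeployment.bind(model="facebook/opt-125m")
# serve.run(app)
# Then query it, e.g.:
#   curl -X POST localhost:8000 -d '{"prompt": "Hello", "max_tokens": 32}'
```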
Alternatives and similar repositories for ray_vllm_inference
Users interested in ray_vllm_inference are comparing it to the libraries listed below.
- Pretrain, fine-tune, and serve LLMs on Intel platforms with Ray (☆129 · Updated last month)
- vLLM Router (☆29 · Updated last year)
- OpenAI-compatible API for the TensorRT-LLM Triton backend (☆209 · Updated 10 months ago)
- Benchmark suite for LLMs from Fireworks.ai (☆76 · Updated 2 weeks ago)