asprenger / ray_vllm_inference

A simple service that integrates vLLM with Ray Serve for fast and scalable LLM serving.
54 · Updated 7 months ago
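The listing above only describes the project, so here is a minimal sketch, not the repository's actual code, of what wrapping vLLM in a Ray Serve deployment typically looks like. The model name, request fields, and deployment options are illustrative assumptions.

```python
# Hypothetical sketch: serve a vLLM model behind a Ray Serve HTTP endpoint.
from ray import serve
from vllm import LLM, SamplingParams


@serve.deployment(num_replicas=1, ray_actor_options={"num_gpus": 1})
class VLLMDeployment:
    def __init__(self, model: str = "facebook/opt-125m"):  # model choice is an assumption
        # Load the model once per replica; vLLM handles GPU memory and batching.
        self.llm = LLM(model=model)

    async def __call__(self, request):
        # Expect a JSON body such as {"prompt": "...", "max_tokens": 64}.
        body = await request.json()
        params = SamplingParams(max_tokens=body.get("max_tokens", 64))
        outputs = self.llm.generate([body["prompt"]], params)
        return {"text": outputs[0].outputs[0].text}


# Bind the deployment; serve.run(app) starts the HTTP endpoint on port 8000.
app = VLLMDeployment.bind()
```

Scaling out is then a matter of raising `num_replicas` (and GPU resources per replica), which Ray Serve distributes across the cluster.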

Related projects

Alternatives and complementary repositories for ray_vllm_inference