Pretrain, finetune and serve LLMs on Intel platforms with Ray
☆130 · Sep 23, 2025 · Updated 8 months ago
Alternatives and similar repositories for llm-on-ray
Users who are interested in llm-on-ray are comparing it to the libraries listed below.
- DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective. ☆14 · Jan 8, 2026 · Updated 4 months ago
- ☆15 · Mar 3, 2025 · Updated last year
- RayDP provides simple APIs for running Spark on Ray and integrating Spark with AI libraries. ☆371 · Apr 10, 2026 · Updated last month
- RayLLM - LLMs on Ray (Archived). Read README for more info. ☆1,267 · Mar 13, 2025 · Updated last year
- Intel® End-to-End AI Optimization Kit ☆31 · Jul 18, 2024 · Updated last year
- Spark* shuffle plugin for support shuffling data through a remote Hadoop-compatible file system, as opposed to vanilla Spark's local-dis… ☆21 · Mar 15, 2024 · Updated 2 years ago
- Implementation of algorithms for memory optimized deep neural network training ☆10 · Jul 23, 2020 · Updated 5 years ago
- AutoML 2024: HPOD: Hyperparameter Optimization for Unsupervised Outlier Detection ☆13 · Jul 12, 2024 · Updated last year
- A modular acceleration toolkit for big data analytic engines ☆66 · May 6, 2024 · Updated 2 years ago
- oneCCL Bindings for Pytorch* (deprecated) ☆104 · Dec 31, 2025 · Updated 4 months ago
- GLake: optimizing GPU memory management and IO transmission. ☆501 · Mar 24, 2025 · Updated last year
- ☆48 · Jun 27, 2024 · Updated last year
- ☆128 · Dec 24, 2024 · Updated last year
- A suite of representative serverless cloud-agnostic (i.e., dockerized) benchmarks ☆62 · May 17, 2026 · Updated last week
- [OSDI'24] Serving LLM-based Applications Efficiently with Semantic Variable ☆218 · Sep 21, 2024 · Updated last year
- SpotServe: Serving Generative Large Language Models on Preemptible Instances ☆134 · Feb 22, 2024 · Updated 2 years ago
- Serving multiple LoRA finetuned LLM as one ☆1,160 · May 8, 2024 · Updated 2 years ago
- Master the essential steps of pretraining large language models (LLMs). Learn to create high-quality datasets, configure model architectu… ☆26 · Aug 7, 2024 · Updated last year
- Large language model fine-tuning capabilities based on cloud native and distributed computing. ☆92 · Feb 22, 2024 · Updated 2 years ago
- oneAPI Collective Communications Library (oneCCL) ☆264 · May 13, 2026 · Updated last week
- A toolkit to run Ray applications on Kubernetes ☆2,508 · Updated this week
- ☆10 · Mar 13, 2023 · Updated 3 years ago
- Machine Learning Inference Graph Spec ☆21 · Jul 27, 2019 · Updated 6 years ago
- ☆17 · Oct 9, 2023 · Updated 2 years ago
- ☆13 · Jan 7, 2025 · Updated last year
- A simple service that integrates vLLM with Ray Serve for fast and scalable LLM serving. ☆78 · Apr 6, 2024 · Updated 2 years ago
- GenCoG: A DSL-Based Approach to Generating Computation Graphs for TVM Testing (ISSTA'23) ☆17 · Jul 19, 2023 · Updated 2 years ago
- ⚡ Build your chatbot within minutes on your favorite device; offer SOTA compression techniques for LLMs; run LLMs efficiently on Intel Pl… ☆2,179 · Oct 8, 2024 · Updated last year
- Large Language Model Text Generation Inference on Habana Gaudi ☆34 · Mar 20, 2025 · Updated last year
- Resources regarding evML (edge verified machine learning) ☆23 · Jan 4, 2025 · Updated last year
- Compare different hardware platforms via the Roofline Model for LLM inference tasks. ☆120 · Mar 13, 2024 · Updated 2 years ago
- Paper and code for New Directions in Cloud Programming, CIDR 2021 ☆11 · Feb 17, 2021 · Updated 5 years ago
- ☆65 · Dec 30, 2024 · Updated last year
- Easy, fast, and cheap pretrain, finetune, serving for everyone ☆313 · Jul 16, 2025 · Updated 10 months ago
- SYCL* Templates for Linear Algebra (SYCL*TLA) - SYCL based CUTLASS implementation for Intel GPUs ☆76 · May 20, 2026 · Updated last week
- Perplexity GPU Kernels ☆576 · Nov 7, 2025 · Updated 6 months ago
- wirefisher: eBPF-powered traffic monitoring and control with precise per-process, IP, and port-level filtering, plus built-in rate limiti… ☆38 · Dec 26, 2025 · Updated 5 months ago
- FlashInfer: Kernel Library for LLM Serving ☆5,666 · Updated this week
- Mirror of Plan 9 4th Edition from p9f ☆14 · Mar 23, 2021 · Updated 5 years ago