AdrianBZG / LLM-distributed-finetune
Efficiently fine-tune any Hugging Face LLM using distributed training (multiple GPUs) and DeepSpeed. Uses Ray AIR to orchestrate training across multiple AWS GPU instances.
☆59 · Updated 2 years ago
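The repository pairs Ray for orchestration with DeepSpeed for memory-efficient data parallelism. A minimal sketch of the kind of ZeRO configuration such a setup hands to the DeepSpeed engine; the field names are standard DeepSpeed options, but every value here is an illustrative assumption, not the project's actual settings:

```python
# Illustrative sketch only: builds the kind of DeepSpeed ZeRO config a
# multi-GPU finetuning job passes to the engine. Field names are standard
# DeepSpeed options; the values are assumptions, not this repo's settings.
def make_deepspeed_config(micro_batch: int = 4, grad_accum: int = 8,
                          world_size: int = 4) -> dict:
    return {
        # Global batch = micro batch * accumulation steps * number of GPUs
        "train_batch_size": micro_batch * grad_accum * world_size,
        "train_micro_batch_size_per_gpu": micro_batch,
        "gradient_accumulation_steps": grad_accum,
        "fp16": {"enabled": True},
        # ZeRO stage 2 shards optimizer state and gradients across workers
        "zero_optimization": {"stage": 2, "overlap_comm": True},
    }

config = make_deepspeed_config()
print(config["train_batch_size"])  # 128
```

In a Ray-based setup, the trainer typically launches one worker per GPU and passes a config like this to DeepSpeed, which then shards optimizer state and gradients across the workers.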
Alternatives and similar repositories for LLM-distributed-finetune
Users interested in LLM-distributed-finetune are comparing it to the libraries listed below.
- Pretrain, finetune, and serve LLMs on Intel platforms with Ray ☆131 · Updated last month
- Benchmark suite for LLMs from Fireworks.ai ☆83 · Updated 2 weeks ago
- Experiments with inference on Llama ☆103 · Updated last year
- ☆122 · Updated last year
- ☆57 · Updated last year
- Batched LoRAs ☆347 · Updated 2 years ago
- Code for the paper "ROUTERBENCH: A Benchmark for Multi-LLM Routing System" ☆149 · Updated last year
- A simple service that integrates vLLM with Ray Serve for fast and scalable LLM serving ☆78 · Updated last year
- 🏋️ A unified multi-backend utility for benchmarking Transformers, Timm, PEFT, Diffusers and Sentence-Transformers with full support of O… ☆317 · Updated last month
- Manage scalable open LLM inference endpoints in Slurm clusters ☆276 · Updated last year
- NeurIPS Large Language Model Efficiency Challenge: 1 LLM + 1 GPU + 1 Day ☆256 · Updated 2 years ago
- A suite of hands-on training materials showing how to scale CV, NLP, and time-series forecasting workloads with Ray ☆442 · Updated last year
- Ray - A curated list of resources: https://github.com/ray-project/ray ☆72 · Updated 3 weeks ago
- A high-performance inference system for large language models, designed for production environments ☆482 · Updated last week
- ☆56 · Updated 11 months ago
- ☆472 · Updated last year
- 🕹️ Performance comparison of MLOps engines, frameworks, and languages on mainstream AI models ☆139 · Updated last year
- Module, Model, and Tensor Serialization/Deserialization ☆272 · Updated 2 months ago
- A high-throughput and memory-efficient inference and serving engine for LLMs ☆266 · Updated last year
- ArcticInference: vLLM plugin for high-throughput, low-latency inference ☆299 · Updated last week
- ArcticTraining: a framework designed to simplify and accelerate post-training for large language models (LLMs) ☆245 · Updated this week
- vLLM: A high-throughput and memory-efficient inference and serving engine for LLMs ☆93 · Updated this week
- Experiments on speculative sampling with Llama models ☆126 · Updated 2 years ago
- ReLM is a Regular Expression engine for Language Models ☆107 · Updated 2 years ago
- Official code for "SWARM Parallelism: Training Large Models Can Be Surprisingly Communication-Efficient" ☆145 · Updated last year
- Website hosting the Open Foundation Models Cheat Sheet ☆268 · Updated 6 months ago
- ☆316 · Updated last year
- ☆312 · Updated this week
- Comparison of language model inference engines ☆235 · Updated 11 months ago
- Fine-tune an LLM to perform batch inference and online serving ☆113 · Updated 5 months ago