ray-project / ray-llm
RayLLM - LLMs on Ray (Archived). Read README for more info.
☆1,264 · Updated 10 months ago
Alternatives and similar repositories for ray-llm
Users who are interested in ray-llm are comparing it to the libraries listed below.
- Scale LLM Engine public repository ☆820 · Updated this week
- LLMPerf is a library for validating and benchmarking LLMs ☆1,075 · Updated last year
- MII makes low-latency and high-throughput inference possible, powered by DeepSpeed. ☆2,088 · Updated 6 months ago
- S-LoRA: Serving Thousands of Concurrent LoRA Adapters ☆1,888 · Updated last year
- Serving multiple LoRA finetuned LLM as one ☆1,134 · Updated last year
- ☆475 · Updated 2 years ago
- A tiny library for coding with large language models. ☆1,236 · Updated last year
- Extend existing LLMs way beyond the original training length with constant memory usage, without retraining ☆733 · Updated last year
- Python bindings for the Transformer models implemented in C/C++ using GGML library. ☆1,876 · Updated last year
- A high-performance inference system for large language models, designed for production environments. ☆489 · Updated 3 weeks ago
- Chat language model that can use tools and interpret the results ☆1,592 · Updated last month
- ⚡ Build your chatbot within minutes on your favorite device; offer SOTA compression techniques for LLMs; run LLMs efficiently on Intel Pl… ☆2,172 · Updated last year
- A tool for evaluating LLMs ☆428 · Updated last year
- Distribute and run AI workloads on Kubernetes magically in Python, like PyTorch for ML infra. ☆1,143 · Updated this week
- [ICML 2024] LLMCompiler: An LLM Compiler for Parallel Function Calling ☆1,811 · Updated last year
- [ICML 2024] Break the Sequential Dependency of LLM Inference Using Lookahead Decoding ☆1,314 · Updated 10 months ago
- Examples on how to use LangChain and Ray ☆232 · Updated 2 years ago
- Customizable implementation of the self-instruct paper. ☆1,050 · Updated last year
- [ACL2023] We introduce LLM-Blender, an innovative ensembling framework to attain consistently superior performance by leveraging the dive… ☆972 · Updated last year
- The Triton TensorRT-LLM Backend ☆912 · Updated last week
- A high-performance ML model serving framework, offers dynamic batching and CPU/GPU pipelines to fully exploit your compute machine ☆889 · Updated 2 weeks ago
- 🔍 LangKit: An open-source toolkit for monitoring Large Language Models (LLMs). 📚 Extracts signals from prompts & responses, ensuring sa… ☆974 · Updated last year
- Medusa: Simple Framework for Accelerating LLM Generation with Multiple Decoding Heads ☆2,689 · Updated last year
- YaRN: Efficient Context Window Extension of Large Language Models ☆1,656 · Updated last year
- CodeTF: One-stop Transformer Library for State-of-the-art Code LLM ☆1,482 · Updated 8 months ago
- ☆1,026 · Updated 11 months ago
- LongLLaMA is a large language model capable of handling long contexts. It is based on OpenLLaMA and fine-tuned with the Focused Transform… ☆1,463 · Updated 2 years ago
- ☆470 · Updated 2 years ago
- Finetuning Large Language Models on One Consumer GPU in 2 Bits ☆734 · Updated last year
- ☆1,026 · Updated 2 years ago