IBM / vllm
vLLM with support for span semantics
☆21 Updated last week
Alternatives and similar repositories for vllm
Users who are interested in vllm are comparing it to the libraries listed below.
- A collection of all available inference solutions for the LLMs ☆93 Updated 9 months ago
- AirLLM 70B inference with single 4GB GPU ☆14 Updated 5 months ago
- EXO Gym is an open-source Python toolkit that facilitates distributed AI research. ☆87 Updated 2 weeks ago
- ☆68 Updated 6 months ago
- Transformer GPU VRAM estimator ☆67 Updated last year
- Benchmarking tool for assessing LLM performance across different hardware ☆17 Updated 2 years ago
- GroqFlow provides an automated tool flow for compiling machine learning and linear algebra workloads into Groq programs and executing tho… ☆114 Updated 4 months ago
- Google TPU optimizations for transformers models ☆125 Updated 10 months ago
- Tutorial to get started with SkyPilot! ☆58 Updated last year
- LM engine is a library for pretraining/finetuning LLMs ☆77 Updated this week
- Small, simple agent task environments for training and evaluation ☆19 Updated last year
- IBM development fork of https://github.com/huggingface/text-generation-inference ☆62 Updated 3 months ago
- ReDel is a toolkit for researchers and developers to build, iterate on, and analyze recursive multi-agent systems. (EMNLP 2024 Demo) ☆89 Updated last week
- [⛔️ DEPRECATED] Friendli: the fastest serving engine for generative AI ☆49 Updated 5 months ago
- vLLM: A high-throughput and memory-efficient inference and serving engine for LLMs ☆94 Updated this week
- An HTTP service intended as a backend for an LLM that can run arbitrary pieces of Python code. ☆69 Updated 3 months ago
- Official code for "SWARM Parallelism: Training Large Models Can Be Surprisingly Communication-Efficient" ☆148 Updated 2 years ago
- Accelerating your LLM training to full speed! Made with ❤️ by ServiceNow Research ☆272 Updated this week
- Experimental Code for StructuredRAG: JSON Response Formatting with Large Language Models ☆115 Updated 8 months ago
- A Lossless Compression Library for AI pipelines ☆289 Updated 5 months ago
- This is the documentation repository for SGLang. It is auto-generated from https://github.com/sgl-project/sglang/tree/main/docs. ☆92 Updated last week
- Simple high-throughput inference library ☆152 Updated 7 months ago
- Self-host LLMs with vLLM and BentoML ☆161 Updated 3 weeks ago
- Aana SDK is a powerful framework for building AI enabled multimodal applications. ☆54 Updated 3 months ago
- ☆68 Updated last year
- Benchmark and optimize LLM inference across frameworks with ease ☆150 Updated 3 months ago
- Route LLM requests to the best model for the task at hand. ☆144 Updated this week
- Example implementation of Iteration of Thought - Give a star if you like the project ☆41 Updated 11 months ago
- A repository of Python scripts to scrape code contents of the public repositories of `huggingface`. ☆53 Updated last year
- 🦄 Unitxt is a Python library for enterprise-grade evaluation of AI performance, offering the world's largest catalog of tools and data … ☆212 Updated this week