IBM / text-generation-inference

IBM development fork of https://github.com/huggingface/text-generation-inference

☆62

Alternatives and similar repositories for text-generation-inference

Users who are interested in text-generation-inference compare it to the libraries listed below.

fw-ai / benchmark
Benchmark suite for LLMs from Fireworks.ai
☆84 Updated last week
opendatahub-io / vllm-tgis-adapter
vLLM adapter for a TGIS-compatible gRPC server.
☆45 Updated this week
foundation-model-stack / fms-hf-tuning
🚀 Collection of tuning recipes with HuggingFace SFTTrainer and PyTorch FSDP.
☆52 Updated last week
neuralmagic / nm-vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
☆267 Updated last year
vllm-project / speculators
A unified library for building, evaluating, and storing speculative decoding algorithms for LLM inference in vLLM
☆132 Updated last week
mani-kantap / llm-inference-solutions
A collection of all available inference solutions for the LLMs
☆93 Updated 9 months ago
run-ai / runai-model-streamer
☆267 Updated last week
premAI-io / benchmarks
🕹️ Performance Comparison of MLOps Engines, Frameworks, and Languages on Mainstream AI Models.
☆139 Updated last year
open-lm-engine / lm-engine
LM engine is a library for pretraining/finetuning LLMs
☆77 Updated this week
Preemo-Inc / text-generation-inference
☆198 Updated last year
guidance-ai / jsonschemabench
☆67 Updated 5 months ago
hamelsmu / llama-inference
experiments with inference on llama
☆103 Updated last year
foundation-model-stack / bamba
Train, tune, and infer Bamba model
☆136 Updated 6 months ago
huggingface / optimum-tpu
Google TPU optimizations for transformers models
☆123 Updated 10 months ago
snowflakedb / ArcticInference
ArcticInference: vLLM plugin for high-throughput, low-latency inference
☆327 Updated this week
NetEase-FuXi / EETQ
Easy and Efficient Quantization for Transformers
☆203 Updated 5 months ago
huggingface / inference-benchmarker
Inference server benchmarking tool
☆130 Updated 2 months ago
LLM360 / amber-data-prep
Data preparation code for Amber 7B LLM
☆93 Updated last year
coreweave / ml-containers
☆42 Updated last week
ServiceNow / Fast-LLM
Accelerating your LLM training to full speed! Made with ❤️ by ServiceNow Research
☆262 Updated this week
EmbeddedLLM / vllm
vLLM: A high-throughput and memory-efficient inference and serving engine for LLMs
☆93 Updated this week
Michaelvll / llm-ie-benchmarks
A collection of reproducible inference engine benchmarks
☆38 Updated 7 months ago
Narsil / bloomserver
☆39 Updated 3 years ago
LLM360 / crystalcoder-train
Pre-training code for CrystalCoder 7B LLM
☆55 Updated last year
bentoml / BentoVLLM
Self-host LLMs with vLLM and BentoML
☆161 Updated last week
IBM / unitxt
🦄 Unitxt is a Python library for enterprise-grade evaluation of AI performance, offering the world's largest catalog of tools and data …
☆212 Updated 2 weeks ago
huggingface / kernel-builder
👷 Build compute kernels
☆190 Updated this week
vllm-project / dashboard
vLLM performance dashboard
☆38 Updated last year
Snowflake-Labs / vllm
☆16 Updated last week
substratusai / vllm-docker
☆64 Updated 8 months ago