IBM / text-generation-inference
IBM development fork of https://github.com/huggingface/text-generation-inference
☆61 · Updated 2 months ago
Alternatives and similar repositories for text-generation-inference
Users that are interested in text-generation-inference are comparing it to the libraries listed below
- Benchmark suite for LLMs from Fireworks.ai ☆76 · Updated this week
- Collection of tuning recipes with HuggingFace SFTTrainer and PyTorch FSDP. ☆47 · Updated this week
- vLLM adapter for a TGIS-compatible gRPC server. ☆33 · Updated this week
- Performance Comparison of MLOps Engines, Frameworks, and Languages on Mainstream AI Models. ☆137 · Updated 11 months ago
- ☆228 · Updated this week
- A high-throughput and memory-efficient inference and serving engine for LLMs ☆264 · Updated 9 months ago
- Train, tune, and infer Bamba model ☆130 · Updated last month
- experiments with inference on llama ☆104 · Updated last year
- Inference server benchmarking tool ☆79 · Updated 2 months ago
- LM engine is a library for pretraining/finetuning LLMs ☆59 · Updated this week
- ☆15 · Updated 3 months ago
- Build compute kernels ☆74 · Updated this week
- Google TPU optimizations for transformers models ☆114 · Updated 5 months ago
- Accelerating your LLM training to full speed! Made with ❤️ by ServiceNow Research ☆211 · Updated this week
- Hugging Face Inference Toolkit used to serve transformers, sentence-transformers, and diffusers models. ☆80 · Updated last month
- ☆62 · Updated 3 months ago
- A Python wrapper around HuggingFace's TGI (text-generation-inference) and TEI (text-embedding-inference) servers. ☆33 · Updated 2 months ago
- Matrix (Multi-Agent daTa geneRation Infra and eXperimentation framework) is a versatile engine for multi-agent conversational data genera… ☆73 · Updated last week
- A collection of reproducible inference engine benchmarks ☆32 · Updated 2 months ago
- vLLM: A high-throughput and memory-efficient inference and serving engine for LLMs ☆87 · Updated this week
- The code for the paper ROUTERBENCH: A Benchmark for Multi-LLM Routing System ☆128 · Updated last year
- Unitxt is a Python library for enterprise-grade evaluation of AI performance, offering the world's largest catalog of tools and data … ☆204 · Updated this week
- ☆40 · Updated last year
- ☆37 · Updated this week
- ☆128 · Updated 3 months ago
- Pre-training code for CrystalCoder 7B LLM ☆54 · Updated last year
- Large Language Model Text Generation Inference on Habana Gaudi ☆34 · Updated 3 months ago
- Python library for Synthetic Data Generation ☆42 · Updated last week
- ☆199 · Updated last year
- ☆39 · Updated 2 years ago