IBM / text-generation-inference
IBM development fork of https://github.com/huggingface/text-generation-inference
☆60 · Updated last month
Alternatives and similar repositories for text-generation-inference
Users that are interested in text-generation-inference are comparing it to the libraries listed below
- Benchmark suite for LLMs from Fireworks.ai · ☆76 · Updated 2 weeks ago
- 🚀 Collection of tuning recipes with HuggingFace SFTTrainer and PyTorch FSDP. · ☆47 · Updated this week
- Inference server benchmarking tool · ☆73 · Updated last month
- InstructLab Training Library - Efficient Fine-Tuning with Message-Format Data · ☆43 · Updated this week
- LM engine is a library for pretraining/finetuning LLMs · ☆57 · Updated last week
- Python library for Synthetic Data Generation · ☆42 · Updated this week
- ☆55 · Updated 9 months ago
- ☆62 · Updated 2 months ago
- 🦄 Unitxt is a Python library for enterprise-grade evaluation of AI performance, offering the world's largest catalog of tools and data … · ☆199 · Updated this week
- vLLM adapter for a TGIS-compatible gRPC server. · ☆32 · Updated this week
- Large Language Model Text Generation Inference on Habana Gaudi · ☆33 · Updated 3 months ago
- ☆221 · Updated this week
- Lightweight demos for finetuning LLMs. Powered by 🤗 transformers and open-source datasets. · ☆77 · Updated 8 months ago
- ArcticTraining is a framework designed to simplify and accelerate the post-training process for large language models (LLMs) · ☆119 · Updated this week
- ☆34 · Updated last month
- ☆155 · Updated this week
- experiments with inference on llama · ☆104 · Updated last year
- Matrix (Multi-Agent daTa geneRation Infra and eXperimentation framework) is a versatile engine for multi-agent conversational data genera… · ☆69 · Updated this week
- ☆39 · Updated 11 months ago
- 🕹️ Performance Comparison of MLOps Engines, Frameworks, and Languages on Mainstream AI Models. · ☆137 · Updated 10 months ago
- Lightweight toolkit package to train and fine-tune 1.58bit Language models · ☆78 · Updated last month
- ☆66 · Updated last year
- Data preparation code for Amber 7B LLM · ☆91 · Updated last year
- A high-throughput and memory-efficient inference and serving engine for LLMs · ☆264 · Updated 8 months ago
- Easy and Efficient Quantization for Transformers · ☆199 · Updated 4 months ago
- Codebase release for EMNLP 2023 paper publication · ☆19 · Updated last month
- Train, tune, and infer Bamba model · ☆127 · Updated 2 weeks ago
- Efficient and Scalable Estimation of Tool Representations in Vector Space · ☆23 · Updated 9 months ago
- ☆41 · Updated last week
- ☆54 · Updated 7 months ago