Large Language Model Text Generation Inference on Habana Gaudi
★34 · Mar 20, 2025 · Updated last year
Alternatives and similar repositories for tgi-gaudi
Users who are interested in tgi-gaudi are comparing it to the libraries listed below.
- Easy and lightning fast training of 🤗 Transformers on Habana Gaudi processor (HPU) ★207 · Mar 16, 2026 · Updated last week
- A high-throughput and memory-efficient inference and serving engine for LLMs ★85 · Mar 18, 2026 · Updated last week
- DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective. ★14 · Jan 8, 2026 · Updated 2 months ago
- Automatically derive Python dunder methods for your Rust code ★25 · Jan 28, 2026 · Updated last month
- QuickReduce is a performant all-reduce library designed for AMD ROCm that supports inline compression. ★38 · Aug 29, 2025 · Updated 6 months ago
- Pixel Parsing. A reproduction of OCR-free end-to-end document understanding models with open data ★23 · Jul 30, 2024 · Updated last year
- Hugging Face Inference Toolkit used to serve transformers, sentence-transformers, and diffusers models. ★90 · Mar 13, 2026 · Updated 2 weeks ago
- 🤗 Optimum Intel: Accelerate inference with Intel optimization tools ★553 · Mar 20, 2026 · Updated last week
- GenAI components at micro-service level; GenAI service composer to create mega-service ★195 · Updated this week
- The project delivers a comprehensive full-stack solution for the Intel® Enterprise AI Foundation on the OpenShift platform to provision I… ★21 · Jan 29, 2026 · Updated last month
- GitHub Action to connect to Tailscale ★19 · Mar 10, 2026 · Updated 2 weeks ago
- Reference models for Intel(R) Gaudi(R) AI Accelerator ★170 · Jan 8, 2026 · Updated 2 months ago
- Nightly release store ★23 · Mar 20, 2026 · Updated last week
- A framework for few-shot evaluation of language models. ★36 · Mar 18, 2025 · Updated last year
- Chunk Dedupe Estimation ★20 · Nov 5, 2024 · Updated last year
- Mini-Engine Demonstration of Combining XeSS with VRS Tier 2. ★14 · Jan 26, 2026 · Updated 2 months ago
- Tutorials for running models on First-gen Gaudi and Gaudi2 for Training and Inference. The source files for the tutorials on https://dev… ★64 · Sep 18, 2025 · Updated 6 months ago
- SynapseAI Core is a reference implementation of the SynapseAI API running on Habana Gaudi ★42 · Feb 3, 2025 · Updated last year
- ★182 · Updated this week
- TPU inference for vLLM, with unified JAX and PyTorch support. ★266 · Updated this week
- Test-time-training on nearest neighbors for large language models ★49 · Apr 18, 2024 · Updated last year
- Software kit for Qualcomm Cloud AI 100 ★20 · Dec 15, 2025 · Updated 3 months ago
- ★20 · Oct 5, 2025 · Updated 5 months ago
- InnerEye dataset creation tool for InnerEye-DeepLearning library. Transforms DICOM data into mask for training Deep Learning models. ★21 · Mar 21, 2024 · Updated 2 years ago
- Velocity And Luminance Adaptive Rasterization ★16 · Mar 31, 2023 · Updated 2 years ago
- Repo for the IDESSAI 2024 course on modeling audio with discrete tokens. ★13 · Sep 13, 2024 · Updated last year
- A huge dataset for Document Visual Question Answering ★21 · Jul 29, 2024 · Updated last year
- A high-throughput and memory-efficient inference and serving engine for LLMs ★16 · Mar 20, 2026 · Updated last week
- ★24 · Feb 24, 2026 · Updated last month
- ★20 · Aug 1, 2024 · Updated last year
- Full End-to-End examples showing how to use First-gen Gaudi and Gaudi2 in common use cases ★13 · Dec 2, 2024 · Updated last year
- Sample Callback Server written in Node ★12 · Sep 22, 2018 · Updated 7 years ago
- Evaluate how vLLM and SGLang perform when running a small LLM model on a mid-range NVIDIA GPU ★20 · Mar 15, 2026 · Updated last week
- An innovative library for efficient LLM inference via low-bit quantization ★352 · Aug 30, 2024 · Updated last year
- ★29 · Nov 18, 2025 · Updated 4 months ago
- ★14 · Jun 25, 2025 · Updated 9 months ago
- Automatically check repositories health and quality and build reports that help us understand the current state of Sauce Labs repositorie… ★13 · Apr 10, 2023 · Updated 2 years ago
- ★15 · Mar 3, 2025 · Updated last year
- Official PyTorch implementation of CD-MOE ★12 · Mar 18, 2026 · Updated last week