Large Language Model Text Generation Inference on Habana Gaudi
★34 · Mar 20, 2025 · Updated last year
Alternatives and similar repositories for tgi-gaudi
Users who are interested in tgi-gaudi are comparing it to the libraries listed below.
- Easy and lightning fast training of 🤗 Transformers on Habana Gaudi processor (HPU) ★211 · Updated this week
- Automatically derive Python dunder methods for your Rust code ★26 · May 26, 2026 · Updated 3 weeks ago
- Pixel Parsing. A reproduction of OCR-free end-to-end document understanding models with open data ★24 · Jul 30, 2024 · Updated last year
- Hugging Face Inference Toolkit used to serve transformers, sentence-transformers, and diffusers models. ★95 · May 28, 2026 · Updated 2 weeks ago
- AI Energy Score: Initiative to establish comparable energy efficiency ratings for AI models. ★39 · Dec 2, 2025 · Updated 6 months ago
- GenAI components at micro-service level; GenAI service composer to create mega-service ★196 · Jun 4, 2026 · Updated last week
- Reference models for Intel(R) Gaudi(R) AI Accelerator ★171 · Jan 8, 2026 · Updated 5 months ago
- Nightly release store ★23 · Updated this week
- Chunk Dedupe Estimation ★20 · Nov 5, 2024 · Updated last year
- Generative AI Examples is a collection of GenAI examples such as ChatQnA, Copilot, which illustrate the pipeline capabilities of the Open… ★735 · Jun 5, 2026 · Updated last week
- Mini-Engine Demonstration of Combining XeSS with VRS Tier 2. ★14 · Jan 26, 2026 · Updated 4 months ago
- SynapseAI Core is a reference implementation of the SynapseAI API running on Habana Gaudi ★43 · Feb 3, 2025 · Updated last year
- Explainable AI Tooling (XAI). XAI is used to discover and explain a model's prediction in a way that is interpretable to the user. Releva… ★39 · Sep 22, 2025 · Updated 8 months ago
- ★216 · Updated this week
- Intel Gaudi's Megatron DeepSpeed Large Language Models for training ★18 · Dec 19, 2024 · Updated last year
- Code for reproducing our paper: LMSOC: An Approach for Socially Sensitive Pretraining ★13 · Oct 22, 2021 · Updated 4 years ago
- ★18 · Nov 23, 2016 · Updated 9 years ago
- Helper Files for IDC ★44 · Oct 23, 2023 · Updated 2 years ago
- Test-time-training on nearest neighbors for large language models ★50 · Apr 18, 2024 · Updated 2 years ago
- Peer into the Lattice ★21 · Nov 19, 2015 · Updated 10 years ago
- TPU inference for vLLM, with unified JAX and PyTorch support. ★349 · Updated this week
- Software kit for Qualcomm Cloud AI 100 ★19 · Dec 15, 2025 · Updated 6 months ago
- ★20 · Oct 5, 2025 · Updated 8 months ago
- ★11 · Jun 29, 2022 · Updated 3 years ago
- InnerEye dataset creation tool for InnerEye-DeepLearning library. Transforms DICOM data into mask for training Deep Learning models. ★21 · Mar 21, 2024 · Updated 2 years ago
- Repo for the IDESSAI 2024 course on modeling audio with discrete tokens. ★13 · Sep 13, 2024 · Updated last year
- A huge dataset for Document Visual Question Answering ★22 · Jul 29, 2024 · Updated last year
- ★27 · Jun 5, 2026 · Updated last week
- ★20 · Aug 1, 2024 · Updated last year
- Full End-to-End examples showing how to use First-gen Gaudi and Gaudi2 in common use cases ★13 · Dec 2, 2024 · Updated last year
- Sample Callback Server written in Node ★12 · Sep 22, 2018 · Updated 7 years ago
- eBPF tool to collect BOLT profile ★14 · Apr 9, 2026 · Updated 2 months ago
- Automatically check repositories health and quality and build reports that help us understand the current state of Sauce Labs repositorie… ★13 · Apr 10, 2023 · Updated 3 years ago
- A high-throughput and memory-efficient inference and serving engine for LLMs ★30 · Updated this week
- The official Node SDK for Scale AI, the data platform for AI ★16 · Jan 10, 2024 · Updated 2 years ago
- LeetCode plugin code debugging template. ★13 · Apr 11, 2025 · Updated last year
- Gets an auth token for a repo via a GitHub app installation ★16 · May 1, 2026 · Updated last month
- ★15 · Mar 3, 2025 · Updated last year
- Minimal implementation of a Byte Pair Encoding (BPE) tokenizer in Zig ★14 · Apr 7, 2025 · Updated last year