Large Language Model Text Generation Inference on Habana Gaudi
☆34 · Mar 20, 2025 · Updated last year
Alternatives and similar repositories for tgi-gaudi
Users that are interested in tgi-gaudi are comparing it to the libraries listed below.
- A high-throughput and memory-efficient inference and serving engine for LLMs ☆85 · Updated this week
- ☆22 · Apr 7, 2026 · Updated last week
- DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective. ☆14 · Jan 8, 2026 · Updated 3 months ago
- ☆162 · Updated this week
- Pixel Parsing. A reproduction of OCR-free end-to-end document understanding models with open data ☆23 · Jul 30, 2024 · Updated last year
- Hugging Face Inference Toolkit used to serve transformers, sentence-transformers, and diffusers models. ☆93 · Apr 7, 2026 · Updated last week
- 🤗 Optimum Intel: Accelerate inference with Intel optimization tools ☆565 · Updated this week
- AI Energy Score: Initiative to establish comparable energy efficiency ratings for AI models. ☆39 · Dec 2, 2025 · Updated 4 months ago
- GenAI components at micro-service level; GenAI service composer to create mega-service ☆195 · Apr 7, 2026 · Updated last week
- GitHub Action to connect to Tailscale ☆20 · Mar 10, 2026 · Updated last month
- Reference models for Intel(R) Gaudi(R) AI Accelerator ☆171 · Jan 8, 2026 · Updated 3 months ago
- Nightly release store ☆23 · Apr 9, 2026 · Updated last week
- A framework for few-shot evaluation of language models. ☆36 · Apr 3, 2026 · Updated last week
- Chunk Dedupe Estimation ☆20 · Nov 5, 2024 · Updated last year
- Generative AI Examples is a collection of GenAI examples such as ChatQnA, Copilot, which illustrate the pipeline capabilities of the Open… ☆728 · Updated this week
- Mini-Engine Demonstration of Combining XeSS with VRS Tier 2. ☆14 · Jan 26, 2026 · Updated 2 months ago
- Tutorials for running models on First-gen Gaudi and Gaudi2 for Training and Inference. The source files for the tutorials on https://dev… ☆63 · Sep 18, 2025 · Updated 6 months ago
- Explainable AI Tooling (XAI). XAI is used to discover and explain a model's prediction in a way that is interpretable to the user. Releva… ☆39 · Sep 22, 2025 · Updated 6 months ago
- ☆190 · Updated this week
- Intel Gaudi's Megatron DeepSpeed Large Language Models for training ☆18 · Dec 19, 2024 · Updated last year
- ☆14 · Jan 21, 2025 · Updated last year
- Code for reproducing our paper: LMSOC: An Approach for Socially Sensitive Pretraining ☆13 · Oct 22, 2021 · Updated 4 years ago
- ☆13 · Feb 13, 2021 · Updated 5 years ago
- Test-time-training on nearest neighbors for large language models ☆50 · Apr 18, 2024 · Updated last year
- ☆11 · Nov 20, 2024 · Updated last year
- ☆20 · Oct 5, 2025 · Updated 6 months ago
- Repo for the IDESSAI 2024 course on modeling audio with discrete tokens. ☆13 · Sep 13, 2024 · Updated last year
- Nebula: Deep Neural Network Benchmarks in C++ ☆13 · Jan 2, 2025 · Updated last year
- A huge dataset for Document Visual Question Answering ☆21 · Jul 29, 2024 · Updated last year
- A high-throughput and memory-efficient inference and serving engine for LLMs ☆17 · Apr 10, 2026 · Updated last week
- Sample Callback Server written in Node ☆12 · Sep 22, 2018 · Updated 7 years ago
- Evaluate how vLLM and SGLang perform when running a small LLM model on a mid-range NVIDIA GPU ☆21 · Updated this week
- An innovative library for efficient LLM inference via low-bit quantization ☆352 · Aug 30, 2024 · Updated last year
- ☆29 · Nov 18, 2025 · Updated 4 months ago
- Automatically check repositories health and quality and build reports that help us understand the current state of Sauce Labs repositorie… ☆13 · Apr 10, 2023 · Updated 3 years ago
- ☆59 · Mar 6, 2026 · Updated last month
- The official Node SDK for Scale AI, the data platform for AI ☆16 · Jan 10, 2024 · Updated 2 years ago
- LeetCode plugin code debugging template. ☆13 · Apr 11, 2025 · Updated last year
- A Prot paper related materials ☆11 · Sep 5, 2022 · Updated 3 years ago