google / genc
☆63Updated 4 months ago
Alternatives and similar repositories for genc
Users who are interested in genc are comparing it to the libraries listed below:
- Recipes for reproducing training and serving benchmarks for large machine learning models using GPUs on Google Cloud.☆109Updated last week
- JetStream is a throughput and memory optimized engine for LLM inference on XLA devices, starting with TPUs (and GPUs in future -- PRs wel…☆398Updated 6 months ago
- xpk (Accelerated Processing Kit, pronounced x-p-k) is a software tool to help Cloud developers to orchestrate training jobs on accelerat…☆158Updated this week
- ☆149Updated this week
- Google TPU optimizations for transformers models☆133Updated 3 weeks ago
- ☆186Updated 2 years ago
- Hugging Face Deep Learning Containers (DLCs) for Google Cloud☆161Updated last month
- Accelerate your Gen AI with NVIDIA NIM and NVIDIA AI Workbench☆198Updated 8 months ago
- Easy and lightning fast training of 🤗 Transformers on Habana Gaudi processor (HPU)☆204Updated this week
- A simplified and automated orchestration workflow to perform ML end-to-end (E2E) model tests and benchmarking on Cloud VMs across differe…☆57Updated last week
- Transformer GPU VRAM estimator☆67Updated last year
- Tutorial to get started with SkyPilot!☆58Updated last year
- PyTorch/XLA integration with JetStream (https://github.com/google/JetStream) for LLM inference☆79Updated 3 weeks ago
- GroqFlow provides an automated tool flow for compiling machine learning and linear algebra workloads into Groq programs and executing tho…☆114Updated 5 months ago
- ☆475Updated 2 years ago
- Repository of model demos using TT-Buda☆63Updated 9 months ago
- xet client tech, used in huggingface_hub☆372Updated 2 weeks ago
- Fine-tune an LLM to perform batch inference and online serving.☆115Updated 7 months ago
- Generative AI Language (PaLM2 + Langchain) Workshop sample codes☆78Updated last year
- vLLM: A high-throughput and memory-efficient inference and serving engine for LLMs☆94Updated last week
- ☆68Updated last week
- IBM development fork of https://github.com/huggingface/text-generation-inference☆62Updated 3 months ago
- ScalarLM - a unified training and inference stack☆94Updated last month
- Write a fast kernel and run it on Discord. See how you compare against the best!☆66Updated 2 weeks ago
- Self-host LLMs with vLLM and BentoML☆163Updated last month
- 🦅🔗 Building FlyteGPT on Flyte with LangChain☆30Updated last year
- A minimalistic C++ Jinja templating engine for LLM chat templates☆202Updated 3 months ago
- Home for OctoML PyTorch Profiler☆114Updated 2 years ago
- 🏋️ A unified multi-backend utility for benchmarking Transformers, Timm, PEFT, Diffusers and Sentence-Transformers with full support of O…☆325Updated 3 months ago
- This repository hosts code that supports the testing infrastructure for the PyTorch organization. For example, this repo hosts the logic …☆104Updated this week