google / genc
☆63Updated last month
Alternatives and similar repositories for genc
Users who are interested in genc are comparing it to the libraries listed below
- JetStream is a throughput and memory optimized engine for LLM inference on XLA devices, starting with TPUs (and GPUs in future -- PRs wel…☆385Updated 4 months ago
- Recipes for reproducing training and serving benchmarks for large machine learning models using GPUs on Google Cloud.☆93Updated this week
- xpk (Accelerated Processing Kit, pronounced x-p-k) is a software tool to help Cloud developers to orchestrate training jobs on accelerat…☆147Updated this week
- The source of LMSYS website and blogs☆66Updated this week
- Accelerate your Gen AI with NVIDIA NIM and NVIDIA AI Workbench☆182Updated 5 months ago
- Testing framework for Deep Learning models (TensorFlow and PyTorch) on Google Cloud hardware accelerators (TPU and GPU)☆65Updated 4 months ago
- ☆177Updated last year
- Generative AI Language (PaLM 2 + LangChain) Workshop sample code☆77Updated last year
- A simplified and automated orchestration workflow to perform ML end-to-end (E2E) model tests and benchmarking on Cloud VMs across differe…☆53Updated this week
- vLLM: A high-throughput and memory-efficient inference and serving engine for LLMs☆90Updated this week
- ☆244Updated 8 months ago
- Google TPU optimizations for transformers models☆121Updated 9 months ago
- ☆471Updated last year
- A collection of YAML files, Helm Charts, Operator code, and guides to act as an example reference implementation for NVIDIA NIM deploymen…☆196Updated this week
- Easy and lightning fast training of 🤗 Transformers on Habana Gaudi processor (HPU)☆199Updated this week
- Hugging Face Deep Learning Containers (DLCs) for Google Cloud☆153Updated 5 months ago
- ☆145Updated 2 weeks ago
- Self-host LLMs with vLLM and BentoML☆152Updated 3 weeks ago
- ☆58Updated last year
- A high-throughput and memory-efficient inference and serving engine for LLMs☆52Updated last year
- 🦅🔗 Building FlyteGPT on Flyte with LangChain☆30Updated last year
- Tutorial for building LLM router☆231Updated last year
- Recipes and resources for building, deploying, and fine-tuning generative AI with Fireworks.☆124Updated last week
- Transformer GPU VRAM estimator☆67Updated last year
- PyTorch/XLA integration with JetStream (https://github.com/google/JetStream) for LLM inference☆76Updated last month
- Tutorial to get started with SkyPilot!☆57Updated last year
- ☆37Updated last week
- Fine-tune an LLM to perform batch inference and online serving.☆113Updated 4 months ago
- AI on GKE is a collection of examples, best-practices, and prebuilt solutions to help build, deploy, and scale AI Platforms on Google Kub…☆324Updated 4 months ago
- Pipeline is an open-source Python SDK for building AI/ML workflows☆138Updated last year