A high-throughput and memory-efficient inference and serving engine for LLMs
☆25 · Mar 5, 2026 · Updated 2 weeks ago
Alternatives and similar repositories for upstreaming-to-vllm
Users that are interested in upstreaming-to-vllm are comparing it to the libraries listed below.
- AWS Neuron Deep Learning Containers (DLCs) are a set of Docker images for training and serving models on AWS Trainium and Inferentia inst… ☆21 · Feb 27, 2026 · Updated 3 weeks ago
- This repository features Amazon SageMaker Ground Truth and explains how to ingest raw 3D point cloud data, label it, train a 3D object de… ☆13 · Jun 23, 2022 · Updated 3 years ago
- Training and inference on AWS Trainium and Inferentia chips. ☆264 · Updated this week
- ☆63 · Updated this week
- Smart commit messages ☆18 · Oct 25, 2024 · Updated last year
- ☆13 · Dec 19, 2025 · Updated 3 months ago
- ☆18 · Nov 4, 2024 · Updated last year
- This repository will soon contain all scripts and links to the annotated corpora of Tibetan. ☆14 · Feb 4, 2025 · Updated last year
- Comprehensive, scalable ML inference architecture using Amazon EKS, leveraging Graviton processors for cost-effective CPU-based inference… ☆21 · Mar 12, 2026 · Updated last week
- AutoQASM is an experimental module offering a quantum-imperative programming experience in Python for developing quantum programs. ☆22 · Mar 16, 2026 · Updated last week
- Repository used to maintain group ACLs used by Kubeflow developers ☆18 · Mar 18, 2026 · Updated last week
- ☆12 · Dec 20, 2025 · Updated 3 months ago
- Run Haystack Pipelines on Ray ☆20 · Oct 16, 2024 · Updated last year
- Experimental Managed Delivery plugin to enable deployment of Kubernetes resources via Spinnaker's keel microservice. ☆13 · Apr 11, 2022 · Updated 3 years ago
- ☆17 · Apr 9, 2024 · Updated last year
- A high-throughput and memory-efficient inference and serving engine for LLMs ☆27 · Updated this week
- AWS Observability Accelerator ☆21 · Nov 14, 2025 · Updated 4 months ago
- vLLM performance dashboard ☆43 · Apr 26, 2024 · Updated last year
- ☆20 · Apr 24, 2022 · Updated 3 years ago
- ☆18 · May 9, 2024 · Updated last year
- Question Answering Generative AI application with Large Language Models (LLMs) and Amazon OpenSearch Service ☆29 · Dec 3, 2024 · Updated last year
- ☆27 · Dec 27, 2023 · Updated 2 years ago
- This repository aims to showcase how to fine-tune an FM in an Amazon EKS cluster, using JupyterHub to provision notebooks and craft both… ☆52 · Jun 17, 2025 · Updated 9 months ago
- ☆14 · Feb 24, 2023 · Updated 3 years ago
- Integrating SSE with NVIDIA Triton Inference Server using a Python backend and Zephyr model. There is very little documentation on how to use … ☆10 · May 29, 2024 · Updated last year
- Autocomp: AI-Driven Code Optimizer for Tensor Accelerators ☆89 · Updated this week
- AI based singing voice synthesis ☆37 · Jun 10, 2024 · Updated last year
- Backstage plugin for Argo Workflows ☆21 · Oct 3, 2023 · Updated 2 years ago
- Example code for AWS Neuron SDK developers building inference and training applications ☆158 · Updated this week
- ☆38 · Feb 16, 2025 · Updated last year
- Artifact evaluation for HPCA'24 paper Lightening-Transformer: A Dynamically-operated Optically-interconnected Photonic Transformer Accele… ☆11 · Mar 3, 2024 · Updated 2 years ago
- Helm Chart for deploying Spark history server in Amazon EKS for S3 Spark Event Logs ☆29 · Feb 9, 2026 · Updated last month
- ☆73 · Jun 26, 2024 · Updated last year
- Create, List, Update, Delete Amazon EKS clusters. Deploy and manage software on EKS. Run distributed model training and inference example… ☆66 · Updated this week
- Supporting material for https://arxiv.org/abs/1907.04769 ☆12 · Sep 20, 2021 · Updated 4 years ago
- ☆32 · Jan 30, 2026 · Updated last month
- ☆26 · Mar 12, 2024 · Updated 2 years ago
- ZINDI GIZ NLP Agricultural Keyword Spotter 3rd place solution, Audio Classification ☆11 · Sep 8, 2021 · Updated 4 years ago
- ☆32 · Jan 12, 2026 · Updated 2 months ago