A high-throughput and memory-efficient inference and serving engine for LLMs
☆25Nov 7, 2025Updated 3 months ago
Alternatives and similar repositories for upstreaming-to-vllm
Users who are interested in upstreaming-to-vllm are comparing it to the libraries listed below
- Project showing how to develop NKI kernels for Llama 3.2 1B inference☆21May 29, 2025Updated 9 months ago
- ☆110Jan 16, 2025Updated last year
- A concise guide for new members of BLCU Lab 246☆10May 30, 2022Updated 3 years ago
- ☆22Feb 11, 2026Updated 3 weeks ago
- vLLM performance dashboard☆42Apr 26, 2024Updated last year
- This repository will soon contain all scripts and links to the annotated corpora of Tibetan.☆13Feb 4, 2025Updated last year
- Forex Fair Value Gap Indicator for MT5☆13Dec 11, 2024Updated last year
- Training and inference on AWS Trainium and Inferentia chips.☆261Updated this week
- ☆17Aug 5, 2025Updated 6 months ago
- A fairy-tale creation service for children, My AI Fairy-Tale☆11Apr 7, 2023Updated 2 years ago
- Face mask detection using OpenCV, with Caffe for face detection☆12Jun 28, 2020Updated 5 years ago
- A proof-of-concept implementation of suspend time memory encryption.☆10Feb 26, 2020Updated 6 years ago
- ☆11Feb 19, 2026Updated last week
- decontamination☆26Dec 3, 2025Updated 3 months ago
- Hands-on material for building the fictional web app Xenn on Google Cloud's Cloud Run☆12Dec 6, 2024Updated last year
- Tools for controlling full disk encryption☆14Jan 30, 2026Updated last month
- Comprehensive, scalable ML inference architecture using Amazon EKS, leveraging Graviton processors for cost-effective CPU-based inference…☆21Feb 14, 2026Updated 2 weeks ago
- 🚀 Collection of libraries used with fms-hf-tuning to accelerate fine-tuning and training of large models.☆13Jan 30, 2026Updated last month
- ☆16Jun 6, 2023Updated 2 years ago
- BERT score for text generation☆12Jan 15, 2025Updated last year
- MCP Server for the Taiwan Stock Exchange OpenAPI☆40Feb 8, 2026Updated 3 weeks ago
- ☆11Oct 11, 2023Updated 2 years ago
- This repository features Amazon SageMaker Ground Truth and explains how to ingest raw 3D point cloud data, label it, train a 3D object de…☆13Jun 23, 2022Updated 3 years ago
- ZINDI GIZ NLP Agricultural Keyword Spotter 3rd place solution, Audio Classification☆11Sep 8, 2021Updated 4 years ago
- The new software behind openSUSE Paste☆22Oct 2, 2025Updated 5 months ago
- Kubernetes Gateway API implementation in Rust☆23Updated this week
- ☆13Aug 12, 2024Updated last year
- Real-time speech recognition using next-gen Kaldi with onnxruntime without Internet connection. Support embedded systems, Android, iOS, R…☆10Jan 29, 2026Updated last month
- A course on building Large Language Models☆11Mar 24, 2025Updated 11 months ago
- Supporting material for https://arxiv.org/abs/1907.04769☆12Sep 20, 2021Updated 4 years ago
- ☆13Apr 30, 2024Updated last year
- Repository for opt-out requests.☆10Mar 25, 2024Updated last year
- This repo explores how AMR can address tasks that are difficult for LLMs☆13Jan 15, 2024Updated 2 years ago
- Accelerated deployment of BERT models with TensorRT☆10Apr 1, 2022Updated 3 years ago
- ☆20Feb 5, 2026Updated 3 weeks ago
- secureblue's static website☆18Updated this week
- Pytorch implementation of [Feudal Net](https://arxiv.org/abs/1703.01161). ([Tensorflow version](https://github.com/dmakian/feudal_networ…☆17Jun 25, 2019Updated 6 years ago
- Deep-Learning-to-find-Superconductors☆12Jan 13, 2021Updated 5 years ago
- Physics-guided Deep Markov Models☆13May 24, 2022Updated 3 years ago