vLLM plugin for RBLN NPU
☆42Mar 7, 2026Updated this week
Alternatives and similar repositories for vllm-rbln
Users who are interested in vllm-rbln are comparing it to the libraries listed below.
- ⚡ A seamless integration of HuggingFace Transformers & Diffusers with RBLN SDK for efficient inference on RBLN NPUs.☆15Feb 27, 2026Updated last week
- ☆27Jan 8, 2024Updated 2 years ago
- RBLN Model Zoo — Compile once. Deploy anywhere.☆30Updated this week
- ☆11Aug 23, 2023Updated 2 years ago
- Repository for the DPP'23 course☆11May 2, 2024Updated last year
- Generic library for neural collapse and several derivative works on the phenomenon.☆18Apr 14, 2025Updated 10 months ago
- Ditto is an open-source framework that enables direct conversion of HuggingFace PreTrainedModels into TensorRT-LLM engines.☆55Jul 16, 2025Updated 7 months ago
- Parallel Self-Adjusting Computation☆15Jul 5, 2021Updated 4 years ago
- API serving for your diffusers models☆11Jan 19, 2024Updated 2 years ago
- [CVPR 2023] "TrojViT: Trojan Insertion in Vision Transformers" by Mengxin Zheng, Qian Lou, Lei Jiang☆14Jan 5, 2024Updated 2 years ago
- [ICIP 2021] PyTorch code for "The Mind's Eye: Visualizing Class-Agnostic Features of CNNs" for generation of kernel features.☆12Sep 12, 2021Updated 4 years ago
- RKOMORAN is KOMORAN wrapper for R users☆16Feb 9, 2022Updated 4 years ago
- Voyager is a C++ non-blocking network library which can run on Linux, Mac OS X, FreeBSD, etc.☆12Sep 8, 2022Updated 3 years ago
- Website for CSE 234, Winter 2025☆13Mar 24, 2025Updated 11 months ago
- Only in native python & numpy☆11Apr 7, 2018Updated 7 years ago
- ☆13May 11, 2023Updated 2 years ago
- ☆10Apr 24, 2023Updated 2 years ago
- ohyecloudy dotfiles☆14Feb 20, 2026Updated 2 weeks ago
- Contributions to Playwright for .NET 🎭🧪☆12Nov 20, 2023Updated 2 years ago
- ☆11Sep 20, 2024Updated last year
- A Rust crate for easy serving of OpenAI's API with rate limiting and token use tracking out of the box☆12Feb 26, 2021Updated 5 years ago
- AgentBar is a macOS menu bar app that tracks AI coding assistant usage in one place.☆33Feb 22, 2026Updated 2 weeks ago
- This hands-on walks you through fine-tuning an open source LLM on Azure and serving the fine-tuned model on Azure. It is intended for Dat…☆12Jun 23, 2024Updated last year
- QUICK: Quantization-aware Interleaving and Conflict-free Kernel for efficient LLM inference☆120Mar 6, 2024Updated 2 years ago
- Check the latest items from gumtree and message it☆12Feb 1, 2016Updated 10 years ago
- SpInfer: Leveraging Low-Level Sparsity for Efficient Large Language Model Inference on GPUs☆60Mar 25, 2025Updated 11 months ago
- Hal Daume's hbc☆20Jan 23, 2010Updated 16 years ago
- TransPimLib is a library for transcendental (and other hard-to-calculate) functions in general-purpose PIM systems, TransPimLib provides …☆15Apr 21, 2023Updated 2 years ago
- Komoran 3 in Python☆11Dec 10, 2018Updated 7 years ago
- FMO (Friendli Model Optimizer)☆13Jan 8, 2025Updated last year
- A curated list of blockchain resources for embedded developers☆13Nov 29, 2021Updated 4 years ago
- CentOS docker images, build weekly with latest security updates☆11Updated this week
- Python for Informatics: Exploring Information (Korean)☆29Sep 14, 2015Updated 10 years ago
- A Collection of Parallel Algorithms for Computational Geometry☆12Mar 10, 2022Updated 3 years ago
- reflect's backend - determine intent validity☆12Aug 2, 2024Updated last year
- C++ Library for Quantum State Preparation (QSP)☆12Jan 5, 2023Updated 3 years ago
- 🚀 Launching Bento in a Kubernetes cluster☆17Mar 16, 2025Updated 11 months ago
- A collection of resources for MCP☆12Jan 17, 2020Updated 6 years ago
- Triangle in practice! Triton☆16Feb 15, 2024Updated 2 years ago