☆107May 11, 2026Updated 3 weeks ago
Alternatives and similar repositories for vllm-mlu
Users who are interested in vllm-mlu are comparing it to the libraries listed below.
- This fork of BVLC/Caffe is dedicated to supporting Cambricon deep learning processor and improving performance of this deep learning fram…☆40May 15, 2020Updated 6 years ago
- ☆16Nov 28, 2023Updated 2 years ago
- A TensorFlow version of Faster R-CNN for ICPR2018 text detection☆13May 28, 2018Updated 8 years ago
- ☆11Jan 21, 2021Updated 5 years ago
- SDA: Low-Bit Stable Diffusion Acceleration on Edge FPGAs☆19May 23, 2024Updated 2 years ago
- Mathematical expression evaluator with just in time code generation.☆12Apr 7, 2013Updated 13 years ago
- A user-friendly & efficient knowledge distillation framework for LLMs, supporting off-policy, on-policy (OPD), cross-tokenizer, multimoda…☆186Jun 1, 2026Updated last week
- [ICLR 2024 Spotlight] 🚀 The official repository of Self-Supervised Learning method "ROPIM", "Pre-training with Random Orthogonal Project…☆10Jan 15, 2025Updated last year
- Sample that converts the MobileSAM encoder/decoder to ONNX and runs inference☆12Apr 11, 2024Updated 2 years ago
- Distributed SDDMM Kernel☆12Jul 8, 2022Updated 3 years ago
- Flight connections map done with D3.js data visualization library.☆12Dec 5, 2019Updated 6 years ago
- ☆11Aug 4, 2020Updated 5 years ago
- 📥 🎯 (1,4/4) an MLIR-based toolchain with Vitis HLS LLVM input/output targeting FPGAs.☆15Nov 15, 2022Updated 3 years ago
- c++ version of ViT☆12Nov 13, 2022Updated 3 years ago
- Kratos: An FPGA Benchmark for Unrolled Deep Neural Networks with Fine-Grained Sparsity and Mixed Precision☆12Jan 19, 2026Updated 4 months ago
- Example of Matrix Multiplication using Map Reduce paradigm in python☆10Oct 25, 2016Updated 9 years ago
- Binary Neural Network-based COVID-19 Face-Mask Wear and Positioning Predictor on Edge Devices☆12Jul 1, 2021Updated 4 years ago
- ☆12Mar 31, 2021Updated 5 years ago
- Code for paper "Spider: Any-to-Many Multimodal LLM"☆16Apr 26, 2025Updated last year
- Chinese Guide for Alveo Getting Started☆12May 18, 2020Updated 6 years ago
- An optimized Merkle Patricia Trie implementation on GPU, fully compatible with and integrable into Ethereum. The paper is published on VL…☆14Apr 15, 2024Updated 2 years ago
- [DATE'2025, TCAD'2025] Terafly : A Multi-Node FPGA Based Accelerator Design for Efficient Cooperative Inference in LLMs☆37Nov 13, 2025Updated 6 months ago
- ☆12Jun 3, 2019Updated 7 years ago
- Example of applying CUDA graphs to LLaMA-v2☆11Aug 25, 2023Updated 2 years ago
- Intel RenderKit common C++/CMake infrastructure☆20Nov 13, 2025Updated 6 months ago
- C++ Library for Quantum State Preparation (QSP)☆12Jan 5, 2023Updated 3 years ago
- In this programming assignment you will implement a streaming video server and client that communicate control commands via the Real-Time…☆11Dec 29, 2012Updated 13 years ago
- ☆13Jan 7, 2025Updated last year
- Python Implementation of Mini DFS☆15Jun 24, 2018Updated 7 years ago
- Using TVM to deploy Transformer on CPU and GPU☆11Aug 25, 2021Updated 4 years ago
- a game framework. warning: wip, dev, unstable, radiation hazard, defcon 3☆24May 10, 2015Updated 11 years ago
- Includes the SVD-based approximation algorithms for compressing deep learning models and the FPGA accelerators exploiting such approximat…☆16Mar 3, 2023Updated 3 years ago
- inference on tvm runtime using c++ with gpu enabled☆10Apr 25, 2018Updated 8 years ago
- ☆12Jan 25, 2023Updated 3 years ago
- ☆15Apr 28, 2023Updated 3 years ago
- GPGPU-SIM usage guide☆14Nov 12, 2022Updated 3 years ago
- Real-time panorama and image stitching using c++ and openCV CUDA☆12Sep 8, 2021Updated 4 years ago
- Simple starter CMake project that uses NVBench.☆15May 6, 2025Updated last year
- Baidu Hook☆13Jan 7, 2016Updated 10 years ago