☆93Apr 24, 2026Updated this week
Alternatives and similar repositories for vllm-mlu
Users that are interested in vllm-mlu are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.
Sorting:
- Cloud Native Distributed Nearest Neighbour Search☆15Jun 9, 2020Updated 5 years ago
- FLOPS counter for all your GPU benchmarking needs☆13Aug 8, 2024Updated last year
- ☆13Sep 19, 2023Updated 2 years ago
- This fork of BVLC/Caffe is dedicated to supporting Cambricon deep learning processor and improving performance of this deep learning fram…☆40May 15, 2020Updated 5 years ago
- ☆15Nov 28, 2023Updated 2 years ago
- Deploy to Railway using AI coding agents - Free Credits Offer • AdUse Claude Code, Codex, OpenCode, and more. Autonomous software development now has the infrastructure to match with Railway.
- ☆16Jan 5, 2021Updated 5 years ago
- ☆11Jan 21, 2021Updated 5 years ago
- [ICLR 2024 Spotlight] 🚀 The official repository of Self-Supervised Learning method "ROPIM", "Pre-training with Random Orthogonal Project…☆10Jan 15, 2025Updated last year
- MobileSAM のエンコーダー/デコーダーをONNXに変換し、推論するサンプル☆12Apr 11, 2024Updated 2 years ago
- [ACL2025 Oral🔥]Turning Trash into Treasure: Accelerating Inference of Large Language Models with Token Recycling☆28Nov 11, 2025Updated 5 months ago
- Flight connections map done with D3.js data visualization library.☆12Dec 5, 2019Updated 6 years ago
- ☆11Aug 4, 2020Updated 5 years ago
- 📥 🎯 (1,4/4) an MLIR-based toolchain with Vitis HLS LLVM input/output targeting FPGAs.☆15Nov 15, 2022Updated 3 years ago
- Release repo for SONIC and TAILS☆21Jun 12, 2020Updated 5 years ago
- 1-Click AI Models by DigitalOcean Gradient • AdDeploy popular AI models on DigitalOcean Gradient GPU virtual machines with just a single click. Zero configuration with optimized deployments.
- Kratos: An FPGA Benchmark for Unrolled Deep Neural Networks with Fine-Grained Sparsity and Mixed Precision☆12Jan 19, 2026Updated 3 months ago
- Example of Matrix Multiplication using Map Reduce paradigm in python☆10Oct 25, 2016Updated 9 years ago
- Binary Neural Network-based COVID-19 Face-Mask Wear and Positioning Predictor on Edge Devices☆12Jul 1, 2021Updated 4 years ago
- ☆12Mar 31, 2021Updated 5 years ago
- Includes the SVD-based approximation algorithms for compressing deep learning models and the FPGA accelerators exploiting such approximat…☆16Mar 3, 2023Updated 3 years ago
- ☆29Oct 7, 2024Updated last year
- Chinese Guide for Alveo Getting Started☆12May 18, 2020Updated 5 years ago
- The code repository of "MBQ: Modality-Balanced Quantization for Large Vision-Language Models"☆88Mar 17, 2025Updated last year
- An optimized Merkle Patricia Trie implementation on GPU, fully compatible with and integrable into Ethereum. The paper is published on VL…☆14Apr 15, 2024Updated 2 years ago
- Deploy open-source AI quickly and easily - Special Bonus Offer • AdRunpod Hub is built for open source. One-click deployment and autoscaling endpoints without provisioning your own infrastructure.
- [DATE'2025, TCAD'2025] Terafly : A Multi-Node FPGA Based Accelerator Design for Efficient Cooperative Inference in LLMs☆35Nov 13, 2025Updated 5 months ago
- ☆13Jun 3, 2019Updated 6 years ago
- [ICCV'25] The official code of paper "Combining Similarity and Importance for Video Token Reduction on Large Visual Language Models"☆74Jan 13, 2026Updated 3 months ago
- ☆10Jul 18, 2024Updated last year
- ☆18Sep 17, 2015Updated 10 years ago
- Multi Layer Perceptron by Vivado HLS for Xilinx FPGA implementation☆12Dec 26, 2016Updated 9 years ago
- Example of applying CUDA graphs to LLaMA-v2☆11Aug 25, 2023Updated 2 years ago
- ☆13Jan 7, 2025Updated last year
- Python Implementation of Mini DFS☆15Jun 24, 2018Updated 7 years ago
- End-to-end encrypted email - Proton Mail • AdSpecial offer: 40% Off Yearly / 80% Off First Month. All Proton services are open source and independently audited for security.
- Spacemacs configuration layer for elpy☆18Jun 14, 2015Updated 10 years ago
- Using TVM to depoly Transformer on CPU and GPU☆11Aug 25, 2021Updated 4 years ago
- ☆12Jan 25, 2023Updated 3 years ago
- CUDA C simple application for Nvidia's GPU☆11Jun 7, 2022Updated 3 years ago
- ☆15Apr 28, 2023Updated 3 years ago
- a simple pingpong buffer test☆12Feb 11, 2015Updated 11 years ago
- A simple tool for parsing the profile.json file of mxnet☆14Aug 1, 2018Updated 7 years ago