☆66Feb 4, 2026Updated last month
Alternatives and similar repositories for vllm-mlu
Users that are interested in vllm-mlu are comparing it to the libraries listed below
- Cloud Native Distributed Nearest Neighbour Search☆15Jun 9, 2020Updated 5 years ago
- ☆14Nov 6, 2019Updated 6 years ago
- A one-page WebUI integrating VITS inference, training, and output in Sherpa-Onnx format.☆12Feb 2, 2025Updated last year
- ☆14Mar 29, 2020Updated 5 years ago
- Is it difficult to develop C++ high-concurrency server applications? Come and use XServer☆10Jun 13, 2024Updated last year
- Mathematical expression evaluator with just in time code generation.☆12Apr 7, 2013Updated 12 years ago
- Translations of some good papers☆16Sep 15, 2016Updated 9 years ago
- TMMA: A Tiled Matrix Multiplication Accelerator for Self-Attention Projections in Transformer Models, optimized for edge deployment on Xi…☆27Mar 24, 2025Updated 11 months ago
- Distributed SDDMM Kernel☆12Jul 8, 2022Updated 3 years ago
- [ICIP 2021] PyTorch code for "The Mind's Eye: Visualizing Class-Agnostic Features of CNNs" for generation of kernel features.☆12Sep 12, 2021Updated 4 years ago
- Flight connections map done with D3.js data visualization library.☆13Dec 5, 2019Updated 6 years ago
- ☆11Aug 4, 2020Updated 5 years ago
- 📥 🎯 (1,4/4) an MLIR-based toolchain with Vitis HLS LLVM input/output targeting FPGAs.☆14Nov 15, 2022Updated 3 years ago
- Example of Matrix Multiplication using Map Reduce paradigm in python☆10Oct 25, 2016Updated 9 years ago
- The code repository of "MBQ: Modality-Balanced Quantization for Large Vision-Language Models"☆81Mar 17, 2025Updated last year
- [DATE'2025, TCAD'2025] Terafly : A Multi-Node FPGA Based Accelerator Design for Efficient Cooperative Inference in LLMs☆29Nov 13, 2025Updated 4 months ago
- Binary Neural Network-based COVID-19 Face-Mask Wear and Positioning Predictor on Edge Devices☆12Jul 1, 2021Updated 4 years ago
- ☆19May 30, 2019Updated 6 years ago
- A curated list of blockchain resources for embedded developers☆13Nov 29, 2021Updated 4 years ago
- Code for paper "Spider: Any-to-Many Multimodal LLM"☆14Apr 26, 2025Updated 10 months ago
- Chinese Guide for Alveo Getting Started☆12May 18, 2020Updated 5 years ago
- ☆12Jun 3, 2019Updated 6 years ago
- Public domain forcefields for viparr☆18Jun 16, 2022Updated 3 years ago
- Example of applying CUDA graphs to LLaMA-v2☆12Aug 25, 2023Updated 2 years ago
- Intel RenderKit common C++/CMake infrastructure☆20Nov 13, 2025Updated 4 months ago
- A library for constructing allocators and memory pools. It also contains broadly useful abstractions and utilities for memory management.…☆88Updated this week
- In this programming assignment you will implement a streaming video server and client that communicate control commands via the Real-Time…☆11Dec 29, 2012Updated 13 years ago
- Python Implementation of Mini DFS☆15Jun 24, 2018Updated 7 years ago
- ☆13Jan 7, 2025Updated last year
- a game framework. warning: wip, dev, unstable, radiation hazard, defcon 3☆24May 10, 2015Updated 10 years ago
- Using TVM to deploy Transformer on CPU and GPU☆11Aug 25, 2021Updated 4 years ago
- inference on tvm runtime using c++ with gpu enabled☆10Apr 25, 2018Updated 7 years ago
- xeCJK usage examples, explained and analyzed☆14Feb 27, 2020Updated 6 years ago
- ☆12Jan 25, 2023Updated 3 years ago
- CUDA C simple application for Nvidia's GPU☆11Jun 7, 2022Updated 3 years ago
- ☆15Apr 28, 2023Updated 2 years ago
- a simple pingpong buffer test☆12Feb 11, 2015Updated 11 years ago
- GPGPU-SIM usage guide☆14Nov 12, 2022Updated 3 years ago
- Real-time panorama and image stitching using C++ and OpenCV CUDA☆13Sep 8, 2021Updated 4 years ago