☆104May 11, 2026Updated last week
Alternatives and similar repositories for vllm-mlu
Users who are interested in vllm-mlu are comparing it to the libraries listed below.
- Cloud Native Distributed Nearest Neighbour Search☆15Jun 9, 2020Updated 5 years ago
- This fork of BVLC/Caffe is dedicated to supporting Cambricon deep learning processor and improving performance of this deep learning fram…☆40May 15, 2020Updated 6 years ago
- Implementation of ERFNet for Real-Time Semantic Segmentation using Caffe☆15Sep 11, 2018Updated 7 years ago
- ☆14Nov 6, 2019Updated 6 years ago
- A TensorFlow version of Faster R-CNN for ICPR2018 text detection☆13May 28, 2018Updated 7 years ago
- DeepRec Extension is an easy-to-use, stable and efficient large-scale distributed training system based on DeepRec.☆13May 17, 2024Updated 2 years ago
- ☆11Jan 21, 2021Updated 5 years ago
- Automated deployment of large-scale sharding services on kubernetes.☆15Feb 27, 2024Updated 2 years ago
- SDA: Low-Bit Stable Diffusion Acceleration on Edge FPGAs☆19May 23, 2024Updated last year
- Build domain AI assistants with annotated dialogue examples - build reliable agents at low cost from annotated dialogue examples☆30Jan 7, 2026Updated 4 months ago
- Mathematical expression evaluator with just in time code generation.☆12Apr 7, 2013Updated 13 years ago
- [ICLR 2024 Spotlight] 🚀 The official repository of Self-Supervised Learning method "ROPIM", "Pre-training with Random Orthogonal Project…☆10Jan 15, 2025Updated last year
- Sample that converts the MobileSAM encoder/decoder to ONNX and runs inference☆12Apr 11, 2024Updated 2 years ago
- TMMA: A Tiled Matrix Multiplication Accelerator for Self-Attention Projections in Transformer Models, optimized for edge deployment on Xi…☆33Apr 7, 2026Updated last month
- [ACL2025 Oral🔥]Turning Trash into Treasure: Accelerating Inference of Large Language Models with Token Recycling☆28Nov 11, 2025Updated 6 months ago
- Flight connections map done with D3.js data visualization library.☆12Dec 5, 2019Updated 6 years ago
- ☆11Aug 4, 2020Updated 5 years ago
- Release repo for SONIC and TAILS☆21Jun 12, 2020Updated 5 years ago
- C++ version of ViT☆12Nov 13, 2022Updated 3 years ago
- Kratos: An FPGA Benchmark for Unrolled Deep Neural Networks with Fine-Grained Sparsity and Mixed Precision☆12Jan 19, 2026Updated 4 months ago
- Example of matrix multiplication using the MapReduce paradigm in Python☆10Oct 25, 2016Updated 9 years ago
- ☆93May 16, 2025Updated last year
- ☆12Mar 31, 2021Updated 5 years ago
- ☆56May 19, 2025Updated last year
- Code for paper "Spider: Any-to-Many Multimodal LLM"☆15Apr 26, 2025Updated last year
- Chinese Guide for Alveo Getting Started☆12May 18, 2020Updated 6 years ago
- The code repository of "MBQ: Modality-Balanced Quantization for Large Vision-Language Models"☆89Mar 17, 2025Updated last year
- ☆12Jun 3, 2019Updated 6 years ago
- [ICCV'25] The official code of paper "Combining Similarity and Importance for Video Token Reduction on Large Visual Language Models"☆74Jan 13, 2026Updated 4 months ago
- Multi Layer Perceptron by Vivado HLS for Xilinx FPGA implementation☆12Dec 26, 2016Updated 9 years ago
- Example of applying CUDA graphs to LLaMA-v2☆11Aug 25, 2023Updated 2 years ago
- Intel RenderKit common C++/CMake infrastructure☆20Nov 13, 2025Updated 6 months ago
- A library for constructing allocators and memory pools. It also contains broadly useful abstractions and utilities for memory management.…☆91Updated this week
- C++ Library for Quantum State Preparation (QSP)☆12Jan 5, 2023Updated 3 years ago
- In this programming assignment you will implement a streaming video server and client that communicate control commands via the Real-Time…☆11Dec 29, 2012Updated 13 years ago
- ☆13Jan 7, 2025Updated last year
- Spacemacs configuration layer for elpy☆18Jun 14, 2015Updated 10 years ago
- Using TVM to deploy Transformer on CPU and GPU☆11Aug 25, 2021Updated 4 years ago
- a game framework. warning: wip, dev, unstable, radiation hazard, defcon 3☆24May 10, 2015Updated 11 years ago