Lightning-fast LLM inference engine - Built with Rust (inspiration from https://github.com/GeeeekExplorer/nano-vllm)
☆36 · Jun 24, 2025 · Updated 11 months ago
Alternatives and similar repositories for nano-vllm-rs
Users that are interested in nano-vllm-rs are comparing it to the libraries listed below.
- [ICLR2026] The first W4A4KV4 quantized + 50% sparse LLMs! ☆32 · Jan 26, 2026 · Updated 4 months ago
- OriGen: Enhancing RTL Code Generation with Code-to-Code Augmentation and Self-Reflection (ICCAD 2024) ☆32 · Oct 20, 2024 · Updated last year
- [MobiCom 24] Adaptive DNN inference under memory constraints ☆57 · Jan 22, 2025 · Updated last year
- 🦙🦙.🦀 ☆28 · Sep 24, 2023 · Updated 2 years ago
- To better understand the ggml library ☆27 · Jun 13, 2025 · Updated last year
- A concise Hindley-Milner type inferencer (algorithm W) implemented with Scala ☆16 · May 13, 2013 · Updated 13 years ago
- ECE408 (Applied Parallel Programming) Fall 2022 MP ☆21 · Mar 24, 2023 · Updated 3 years ago
- A set of tools that make working with the Scala ecosystem even better. ☆13 · Jun 13, 2026 · Updated last week
- Train a tiny LLaMA model from scratch to repeat your words using Reinforcement Learning from Human Feedback (RLHF) ☆18 · May 23, 2024 · Updated 2 years ago
- Spark-Python study notes ☆11 · Sep 25, 2018 · Updated 7 years ago
- Basic paxos ☆29 · Apr 10, 2014 · Updated 12 years ago
- A simple dictionary translation component ☆10 · Mar 18, 2024 · Updated 2 years ago
- Simple, safe way to store and distribute tensors ☆15 · Jun 9, 2026 · Updated last week
- Nearest neighbor search for Ruby and S3 Vectors ☆14 · Apr 9, 2026 · Updated 2 months ago
- Postgres protocol support for finagle ☆36 · Sep 4, 2013 · Updated 12 years ago
- zero cost Apply/Applicative syntax ☆13 · Jun 14, 2026 · Updated last week
- Hands-On Scala Programming [Video], published by Packt ☆13 · Oct 31, 2022 · Updated 3 years ago
- Official Java Client for Ipregistry, a Fast, Reliable IP Geolocation and Threat Data API. ☆16 · Dec 7, 2025 · Updated 6 months ago
- SmartBuf is a cross-language serialization and deserialization framework, and it has high performance and compression ratio like … ☆11 · Dec 5, 2023 · Updated 2 years ago
- OpenAI compatible API for open source LLMs ☆17 · Oct 30, 2023 · Updated 2 years ago
- Composable data loading for Ruby ☆13 · Apr 8, 2026 · Updated 2 months ago
- Write events for TensorBoard ☆13 · Apr 27, 2026 · Updated last month
- Implementing ゼロから作るDeep Learning ❸ (Deep Learning from Scratch 3) in C++. A self-study repository. ☆17 · Aug 12, 2020 · Updated 5 years ago
- The final project of PKU course Compilers: Principles in Spring 2023, a SysY to RISC-V compiler. Document::https://pku-minic.github.io/on… ☆21 · Jun 15, 2023 · Updated 3 years ago
- Code from Learn You A Haskell book: http://learnyouahaskell.com/ ☆17 · Aug 19, 2012 · Updated 13 years ago
- VecoLuc is a scalable vector search engine that leverages Apache Lucene and the JDK's incubator vector API for high-performance vector op… ☆11 · Aug 22, 2024 · Updated last year
- sh - Super fast Alfred 3+ workflow to search through Chrome history 🕵️♀️ ☆13 · Nov 28, 2020 · Updated 5 years ago
- modern compiler implementation in c ☆14 · Oct 1, 2015 · Updated 10 years ago
- human in the loop in dify workflow by plugin ☆16 · Jan 7, 2025 · Updated last year
- Simple Rust binding for the JNI (Java Native Interface) API. ☆12 · Updated this week
- Use Rust to call Paddle OCR models via ONNX Runtime for image text recognition. ☆79 · Mar 3, 2026 · Updated 3 months ago
- special node-red node used in Distributed Node-RED ☆16 · May 14, 2019 · Updated 7 years ago
- This example identifies a tenant by sys_id + organization_id, overriding the tenant_id tenant type in mybatis-plus ☆13 · Mar 3, 2020 · Updated 6 years ago
- Liga: Let Data Dance with ML Models ☆13 · Sep 12, 2023 · Updated 2 years ago
- http server to take screenshots of websites ☆15 · Sep 13, 2012 · Updated 13 years ago
- Tensor-based Spectral LDA on Spark ☆18 · Jun 5, 2018 · Updated 8 years ago
- ☆11 · May 21, 2021 · Updated 5 years ago
- IANA IPv4 and IPv6 number database ☆17 · Jul 16, 2021 · Updated 4 years ago
- Simple HTTP serving for PyTorch 🚀 ☆10 · Oct 15, 2020 · Updated 5 years ago