a fun and educational take on vLLM
☆185 Jan 25, 2026 Updated 2 months ago
Alternatives and similar repositories for nano-vllm
Users that are interested in nano-vllm are comparing it to the libraries listed below.
- Educational WIP ☆72 Feb 16, 2026 Updated last month
- PyTorch Lightning based framework to run experiments for self-supervised learning tasks. ☆10 Feb 14, 2020 Updated 6 years ago
- Notes and code for Programming Massively Parallel Processors ☆13 Mar 29, 2025 Updated 11 months ago
- JAX bindings for the flash-attention3 kernels ☆21 Jan 2, 2026 Updated 2 months ago
- MCP server for the X (Twitter) API -- give AI agents the ability to post, search, read, and engage on X ☆35 Updated this week
- Using RAG to generate data for model fine-tuning. ☆13 Apr 16, 2025 Updated 11 months ago
- Structured Generation Evals ☆14 Sep 25, 2024 Updated last year
- Two implementations of ZeRO-1 optimizer sharding in JAX ☆14 Jun 11, 2023 Updated 2 years ago
- Serverless RAG application with LlamaIndex and code interpreter on Azure Container Apps ☆12 Jan 30, 2026 Updated last month
- How to quickly serve an LLM using FastAPI, Celery, and Redis ☆17 Aug 29, 2023 Updated 2 years ago
- Distributed SDDMM Kernel ☆12 Jul 8, 2022 Updated 3 years ago
- fmchisel: Efficient Compression and Training Algorithms for Foundation Models ☆85 Oct 23, 2025 Updated 5 months ago
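- A multimodal language model for Chinese document understanding, supporting multimodal document information extraction and document embedding ☆12 Jun 26, 2022 Updated 3 years ago
- Script for batch-registering AugmentCode accounts ☆23 Aug 19, 2025 Updated 7 months ago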
- Current Alpha version of the ONTO-TRON-5000 ☆40 Dec 1, 2025 Updated 3 months ago
- another BibTeX-to-HTML (with Jinja2 templates) ☆17 Jun 23, 2020 Updated 5 years ago
- torchcomms: a modern PyTorch communications API ☆351 Updated this week
- This is the repo of "RR-Compound: RDMA-Fused gRPC for Low Latency and High Throughput With an Easy Interface" published in TPDS ☆27 Mar 12, 2025 Updated last year
- [ACL 2025] Official implementation of the "CoT-ICL Lab" framework ☆11 Oct 10, 2025 Updated 5 months ago
- Ludic – an LLM-RL library for the era of experience ☆62 Jan 9, 2026 Updated 2 months ago
- A cycle-accurate RISC-V CPU simulator + RTL modeling library in pure Python. ☆18 Aug 27, 2025 Updated 6 months ago
- A small implementation of blockchain protocol. ☆13 Oct 26, 2017 Updated 8 years ago
- Cross-GPU KV Cache Marketplace ☆22 Nov 12, 2025 Updated 4 months ago
- (Verilog) A simple convolution layer implementation with systolic array structure ☆13 May 9, 2022 Updated 3 years ago
- Programming framework for serverless compute ☆15 Dec 3, 2019 Updated 6 years ago
- A simple and standalone file manager and browser for django projects. ☆17 Jun 2, 2015 Updated 10 years ago
- A repo explaining with an example how to extend the kubernetes default scheduler ☆17 Jul 11, 2019 Updated 6 years ago
- To better understand the ggml library ☆28 Jun 13, 2025 Updated 9 months ago
- Circuit design of an FP16-based 2D systolic array ☆13 Feb 23, 2023 Updated 3 years ago
- ☆19 Jan 4, 2024 Updated 2 years ago
- A simulation framework for modeling efficiency of Graph Neural Network Dataflows ☆23 Feb 14, 2025 Updated last year
- An HBM FPGA based SpMV Accelerator ☆17 Aug 29, 2024 Updated last year
- Agentic Virtual Lab ☆19 Nov 30, 2025 Updated 3 months ago
- ☆15 Feb 27, 2024 Updated 2 years ago
- Source code of paper "Machine Learning for Load Balancing in the Linux Kernel" ☆24 Sep 21, 2020 Updated 5 years ago
- GPGPU-SIM usage guide ☆14 Nov 12, 2022 Updated 3 years ago
- Universal atomic embedding based on CrystalTransformer ☆25 Jan 11, 2025 Updated last year
- Code Pairing Task for Frontend Developer profile for 'Codebrahma' ☆10 Mar 21, 2023 Updated 3 years ago
- gnosis: signifying knowing through observation or experience ☆22 Jul 19, 2020 Updated 5 years ago