Preview Code for Continuum Paper
☆67 · Apr 13, 2026 · Updated this week
Alternatives and similar repositories for vllm-continuum
Users that are interested in vllm-continuum are comparing it to the libraries listed below.
- ☆26 · Apr 7, 2026 · Updated last week
- a distributed computation platform for running Python and Bash computation tasks on multiple nodes ☆12 · Mar 19, 2025 · Updated last year
- ☆158 · Oct 9, 2024 · Updated last year
- Systematic and comprehensive benchmarks for LLM systems. ☆55 · Jan 28, 2026 · Updated 2 months ago
- ☆12 · Apr 9, 2025 · Updated last year
- [ICML 2025] Improving Planning of Agents for Long-Horizon Tasks ☆30 · Oct 2, 2025 · Updated 6 months ago
- The first range filter to simultaneously offer dynamicity, fast operations, and a robust false positive rate for any workload. ☆12 · Jul 15, 2025 · Updated 9 months ago
- ☆11 · Jan 19, 2025 · Updated last year
- An efficient spatial accelerator enabling hybrid sparse attention mechanisms for long sequences ☆32 · Mar 7, 2024 · Updated 2 years ago
- PoC for "SpecReason: Fast and Accurate Inference-Time Compute via Speculative Reasoning" [NeurIPS '25] ☆67 · Oct 2, 2025 · Updated 6 months ago
- A benchmark for evaluating LLMs on open-ended CS problems. Exploring the Next Frontier of Computer Science. ☆183 · Updated this week
- The open source implementation of the multi grouped query attention by the paper "GQA: Training Generalized Multi-Query Transformer Model… ☆16 · Dec 11, 2023 · Updated 2 years ago
- A ChatGPT(GPT-3.5) & GPT-4 Workload Trace to Optimize LLM Serving Systems ☆252 · Mar 19, 2026 · Updated last month
- Paper reading and discussion notes, covering AI frameworks, distributed systems, cluster management, etc. ☆59 · Mar 4, 2026 · Updated last month
- VSS: A Storage System for Video Analytics ☆13 · Jul 9, 2021 · Updated 4 years ago
- The repo of "BugLens" ☆39 · Nov 12, 2025 · Updated 5 months ago
- ☆17 · Dec 2, 2025 · Updated 4 months ago
- ☆11 · Mar 9, 2022 · Updated 4 years ago
- NVSHMEM‑Tutorial: Build a DeepEP‑like GPU Buffer ☆177 · Feb 11, 2026 · Updated 2 months ago
- Advancing the frontier of efficient AI ☆58 · Apr 6, 2026 · Updated last week
- [COLM 2025] Official code for "When To Solve, When To Verify: Compute-Optimal Problem Solving and Generative Verification for LLM Reasoni… ☆15 · Oct 31, 2025 · Updated 5 months ago
- ☆20 · Feb 2, 2026 · Updated 2 months ago
- A simple SQL parser based on Apache Calcite. ☆14 · Jan 17, 2026 · Updated 3 months ago
- A lightweight tool for detecting bugs on Graph Database Management Systems ☆15 · Jan 9, 2024 · Updated 2 years ago
- A PyTorch model profiler with information about MACs, energy, etc. ☆17 · Feb 24, 2024 · Updated 2 years ago
- A Rust version of the NVIDIA BlueField DOCA kit. ☆14 · Jun 11, 2023 · Updated 2 years ago
- ☆19 · Feb 18, 2025 · Updated last year
- A collection of GPU experiments and benchmarks for my personal understanding and research. ☆28 · Apr 9, 2026 · Updated last week
- Datalog Engines OPtimization Tester. ☆13 · Jan 18, 2024 · Updated 2 years ago
- Nex Venus Communication Library ☆73 · Nov 17, 2025 · Updated 5 months ago
- Automated High-Performance GPU Kernel Generation ☆95 · Apr 11, 2026 · Updated last week
- ☆34 · Jan 12, 2026 · Updated 3 months ago
- ☆12 · Sep 18, 2024 · Updated last year
- An RDMA skew-aware key-value store, which implements the Scale-Out ccNUMA design, to exploit skew in order to increase performance of dat… ☆19 · Jul 1, 2021 · Updated 4 years ago
- [ICLR 2025] "GraphRouter: A Graph-based Router for LLM Selections", Tao Feng, Yanzhen Shen, Jiaxuan You ☆66 · Dec 30, 2025 · Updated 3 months ago
- ☆13 · Feb 16, 2023 · Updated 3 years ago
- DS SERVE: The Largest Open Vector Store over Pretrain Data; A Framework for Efficient and Scalable Neural Retrieval ☆49 · Jan 28, 2026 · Updated 2 months ago
- ☆27 · Jun 22, 2025 · Updated 9 months ago
- ☆72 · Oct 10, 2025 · Updated 6 months ago