ISCS-ZJU / Decentralized-inference-based-on-vLLM
☆30 Updated 10 months ago
Alternatives and similar repositories for Decentralized-inference-based-on-vLLM
Users who are interested in Decentralized-inference-based-on-vLLM are comparing it to the libraries listed below.
- Source code for CSWAP-CLUSTER'21 and CSWAP+-TPDS'22 ☆25 Updated 3 years ago
- Source code for PHAST-TPDS'22 ☆29 Updated last year
- ☆30 Updated 7 months ago
- Source code for XPGraph-MICRO'22 ☆26 Updated 3 years ago
- ☆31 Updated 7 months ago
- ☆41 Updated 11 months ago
- Source code for GMBE-SC'23 ☆35 Updated 2 years ago
- Source code for AdaMBE-SC'24 ☆25 Updated last year
- Source code for ChunkGraph-ATC'24 ☆28 Updated last year
- Applying asynchronous tensor swapping to the PyTorch framework. ☆30 Updated 2 years ago
- Source code for AMBEA-TC'24 ☆29 Updated last year
- Source code for CCLBTree-EuroSys'24 ☆43 Updated last year
- Zplot demos ☆21 Updated 4 years ago
- Source code for NVAlloc-ASPLOS'22 ☆60 Updated 3 years ago
- Source code for iCache-HPCA'23 ☆50 Updated 2 years ago
- ☆15 Updated 6 months ago
- ☆10 Updated 11 months ago
- For a paper at ASPLOS'25 ☆17 Updated 8 months ago
- A persistent learned index for non-volatile memory with high read/write performance. ☆21 Updated 3 years ago
- Nap - NUMA-Aware Persistent Indexes ☆41 Updated 4 years ago
- A collection of awesome researchers and papers about disaggregated memory. ☆175 Updated 2 months ago
- A low-latency, billion-scale, and updatable graph-based vector store on SSD. ☆80 Updated this week
- ☆14 Updated 4 months ago
- WIPE implementation ☆13 Updated 2 years ago
- GPU-accelerated vector query processing system that supports large vector datasets beyond GPU memory. ☆37 Updated last year
- ☆17 Updated 3 years ago
- ☆75 Updated 2 years ago
- ScalaAFA: Constructing User-Space All-Flash Array Engine with Holistic Designs (USENIX ATC 2024). ☆11 Updated last year
- FlashMob is a shared-memory random walk system. ☆32 Updated 2 years ago
- Tiered memory management ☆84 Updated 3 months ago