Deferred Continuous Batching in Resource-Efficient Large Language Model Serving (EuroMLSys 2024)
☆19 · May 28, 2024 · Updated 2 years ago
Alternatives and similar repositories for FineInfer
Users who are interested in FineInfer are comparing it to the libraries listed below.
- ☆27 · Aug 25, 2023 · Updated 2 years ago
- Code accompanying the NeurIPS 2019 paper AutoAssist: A Framework to Accelerate Training of Deep Neural Networks. ☆14 · Oct 3, 2022 · Updated 3 years ago
- Deft: A Scalable Tree Index for Disaggregated Memory. ☆22 · Apr 23, 2025 · Updated last year
- ☆12 · Apr 23, 2026 · Updated last month
- Federated Few-shot Learning for Mobile NLP. Conditionally accepted by MobiCom'23.