duowuyms / OpenCATP-LLM
The official repository of ICCV 2025 paper "CATP-LLM: Empowering Large Language Models for Cost-Aware Tool Planning".
☆17 Updated 3 weeks ago
Alternatives and similar repositories for OpenCATP-LLM
Users who are interested in OpenCATP-LLM are comparing it to the libraries listed below
- ☆42 Updated last year
- [ASPLOS'25] Towards End-to-End Optimization of LLM-based Applications with Ayo ☆56 Updated 4 months ago
- ☆142 Updated last year
- AI model training on heterogeneous, geo-distributed resources ☆26 Updated 3 weeks ago
- [NeurIPS 2024] Efficient LLM Scheduling by Learning to Rank ☆66 Updated last year
- ☆10 Updated 2 years ago
- PacketGame: Multi-Stream Packet Gating for Concurrent Video Inference at Scale ☆14 Updated 2 years ago
- Official Repo for "SplitQuant / LLM-PQ: Resource-Efficient LLM Offline Serving on Heterogeneous GPUs via Phase-Aware Model Partition and … ☆35 Updated 3 months ago
- Open-source implementation for "Helix: Serving Large Language Models over Heterogeneous GPUs and Network via Max-Flow" ☆74 Updated 2 months ago
- Artifacts for our SIGCOMM'23 paper Ditto ☆15 Updated 2 years ago
- ☆23 Updated last year
- GPU-accelerated LLM Training Simulator ☆47 Updated 5 months ago
- ☆20 Updated last year
- "How to Do Great Research" Course for Ph.D. Students ☆131 Updated last month
- [TBD] "m4: A Learned Flow-level Network Simulator" by Chenning Li, Anton A. Zabreyko, Om Chabra, Arash Nasr-Esfahany, Kevin Zhao, Pratees… ☆15 Updated last month
- Enabling High Quality Real-Time Communications with Adaptive Frame-Rate (USENIX NSDI 2023) ☆22 Updated last year
- Primo: Practical Learning-Augmented Systems with Interpretable Models ☆19 Updated last year
- ☆46 Updated last year
- ☆20 Updated 3 years ago
- ☆171 Updated last year
- ☆37 Updated 2 years ago
- ☆63 Updated last year
- [ACL'25] Code for ACL'25 paper "IRT-Router: Effective and Interpretable Multi-LLM Routing via Item Response Theory" ☆22 Updated 10 months ago
- A record of reading list on some MLsys popular topic ☆17 Updated 9 months ago
- Must-read papers on improving efficiency for LLM serving clusters ☆32 Updated 6 months ago
- Efficient Interactive LLM Serving with Proxy Model-based Sequence Length Prediction | A tiny BERT model can tell you the verbosity of an … ☆49 Updated last year
- A Cluster-Wide Model Manager to Accelerate DNN Training via Automated Training Warmup ☆35 Updated 2 years ago
- ☆47 Updated 7 months ago
- [OSDI'24] Serving LLM-based Applications Efficiently with Semantic Variable ☆203 Updated last year
- ☆156 Updated 5 months ago