duowuyms / OpenCATP-LLM
The official repository of ICCV 2025 paper "CATP-LLM: Empowering Large Language Models for Cost-Aware Tool Planning".
☆17 Updated last month
Alternatives and similar repositories for OpenCATP-LLM
Users who are interested in OpenCATP-LLM are comparing it to the libraries listed below.
- [ASPLOS'25] Towards End-to-End Optimization of LLM-based Applications with Ayo ☆57 Updated 5 months ago
- PacketGame: Multi-Stream Packet Gating for Concurrent Video Inference at Scale ☆14 Updated 2 years ago
- ☆145 Updated last year
- Artifacts for our SIGCOMM'23 paper Ditto ☆15 Updated 2 years ago
- [NeurIPS 2024] Efficient LLM Scheduling by Learning to Rank ☆66 Updated last year
- ☆43 Updated last year
- ☆10 Updated 2 years ago
- [ACL'25] Code for ACL'25 paper "IRT-Router: Effective and Interpretable Multi-LLM Routing via Item Response Theory" ☆24 Updated 10 months ago
- Open-source implementation for "Helix: Serving Large Language Models over Heterogeneous GPUs and Network via Max-Flow" ☆75 Updated 2 months ago
- ☆20 Updated 3 years ago
- Enabling High Quality Real-Time Communications with Adaptive Frame-Rate (USENIX NSDI 2023) ☆23 Updated 2 years ago
- ☆21 Updated last year
- ☆64 Updated last year
- ☆47 Updated last year
- Official Repo for "SplitQuant / LLM-PQ: Resource-Efficient LLM Offline Serving on Heterogeneous GPUs via Phase-Aware Model Partition and … ☆36 Updated 4 months ago
- ☆16 Updated last year
- GPU-accelerated LLM Training Simulator ☆47 Updated 6 months ago
- Official implementation of MASS: Multi-Agent Simulation Scaling for Portfolio Construction ☆157 Updated last month
- ☆54 Updated 3 months ago
- Primo: Practical Learning-Augmented Systems with Interpretable Models ☆19 Updated 2 years ago
- AI model training on heterogeneous, geo-distributed resources ☆32 Updated last month
- Burstable Cloud Scheduler ☆16 Updated last year
- "How to Do Great Research" Course for Ph.D. Students ☆133 Updated 2 months ago
- ☆102 Updated last year
- Efficient Interactive LLM Serving with Proxy Model-based Sequence Length Prediction | A tiny BERT model can tell you the verbosity of an … ☆48 Updated last year
- 🔥🔥🔥 Latest works on video streaming/processing/analysis ☆111 Updated 2 years ago
- A record of reading list on some MLsys popular topic ☆17 Updated 9 months ago
- PyTorch implementation of paper "Response Length Perception and Sequence Scheduling: An LLM-Empowered LLM Inference Pipeline". ☆93 Updated 2 years ago
- Survey on LLM Inference via Search (TMLR 2025) ☆14 Updated 8 months ago
- ☆26 Updated 2 years ago