tyler-griggs / melange-release
☆47 · Jun 27, 2024 · Updated last year
Alternatives and similar repositories for melange-release
Users who are interested in melange-release are comparing it to the libraries listed below
- ☆13 · Feb 22, 2023 · Updated 2 years ago
- Official Repo for "SplitQuant / LLM-PQ: Resource-Efficient LLM Offline Serving on Heterogeneous GPUs via Phase-Aware Model Partition and … ☆36 · Aug 29, 2025 · Updated 5 months ago
- ☆12 · Oct 16, 2022 · Updated 3 years ago
- SpotServe: Serving Generative Large Language Models on Preemptible Instances ☆135 · Feb 22, 2024 · Updated last year
- LLM Serving Performance Evaluation Harness ☆83 · Feb 25, 2025 · Updated 11 months ago
- ☆66 · Nov 4, 2024 · Updated last year
- Stateful LLM Serving ☆95 · Mar 11, 2025 · Updated 11 months ago
- A universal workflow system for exactly-once DAGs ☆23 · Jun 1, 2023 · Updated 2 years ago
- A benchmark suite for evaluating FaaS schedulers. ☆23 · Nov 5, 2022 · Updated 3 years ago
- ☆151 · Oct 9, 2024 · Updated last year
- Using Fourier interpolation to merge large language models ☆11 · Jan 6, 2026 · Updated last month
- ☆15 · May 2, 2023 · Updated 2 years ago
- Code for our ICLR Trustworthy ML 2020 workshop paper "Improved Image Wasserstein Attacks and Defenses" ☆14 · Apr 28, 2020 · Updated 5 years ago
- A throughput-oriented high-performance serving framework for LLMs ☆945 · Oct 29, 2025 · Updated 3 months ago
- Disaggregated serving system for Large Language Models (LLMs). ☆776 · Apr 6, 2025 · Updated 10 months ago
- CausIL is an approach to estimate the causal graph for a cloud microservice system, where the nodes are the service-specific metrics whil… ☆13 · Jul 3, 2023 · Updated 2 years ago
- A low-latency & high-throughput serving engine for LLMs ☆474 · Jan 8, 2026 · Updated last month
- [ICLR2025] Breaking Throughput-Latency Trade-off for Long Sequences with Speculative Decoding ☆142 · Dec 4, 2024 · Updated last year
- High-Speed Stateful Packet Processor for Programmable Switches ☆14 · Dec 18, 2022 · Updated 3 years ago
- An Attention Superoptimizer ☆22 · Jan 20, 2025 · Updated last year
- Efficient and easy multi-instance LLM serving ☆527 · Sep 3, 2025 · Updated 5 months ago
- An Open-Source SCAlable Interface for ISA Extensions for RISC-V Processors. New Version: ☆17 · Feb 29, 2024 · Updated last year
- ☆20 · Jun 9, 2025 · Updated 8 months ago
- ☆20 · May 14, 2025 · Updated 9 months ago
- Artifacts for our SIGCOMM'23 paper Ditto ☆15 · Oct 17, 2023 · Updated 2 years ago
- ☆17 · May 10, 2024 · Updated last year
- [OSDI'24] Serving LLM-based Applications Efficiently with Semantic Variable ☆209 · Sep 21, 2024 · Updated last year
- Backdraft: a Lossless Virtual Switch that Prevents the Slow Receiver Problem. USENIX NSDI 2022 ☆15 · Mar 1, 2023 · Updated 2 years ago
- This is the course project for CSCE585: ML Systems. Students will build their machine learning systems based on the provided infrastructu… ☆12 · Dec 15, 2020 · Updated 5 years ago
- ddl-benchmarks: Benchmarks for Distributed Deep Learning ☆36 · May 29, 2020 · Updated 5 years ago
- [ICML 2024] Serving LLMs on heterogeneous decentralized clusters. ☆34 · May 6, 2024 · Updated last year
- ☆18 · Jan 10, 2023 · Updated 3 years ago
- A parallelized VAE that avoids OOM for high-resolution image generation ☆85 · Aug 4, 2025 · Updated 6 months ago
- Example code to create and train a PyTorch model using the new C++ frontend. ☆17 · Mar 19, 2019 · Updated 6 years ago
- ☆19 · May 15, 2024 · Updated last year
- This repository stores the source code for the Mistral Hackathon 2024 in Paris ☆16 · Aug 23, 2024 · Updated last year
- Dynamic Context Selection for Efficient Long-Context LLMs ☆56 · May 20, 2025 · Updated 8 months ago
- PyTorch library for cost-effective, fast and easy serving of MoE models. ☆280 · Feb 2, 2026 · Updated 2 weeks ago
- ☆131 · Nov 11, 2024 · Updated last year