☆32 · Jan 16, 2025 · Updated last year
Alternatives and similar repositories for QLM
Users that are interested in QLM are comparing it to the libraries listed below.
- Proteus: A High-Throughput Inference-Serving System with Accuracy Scaling ☆12 · Mar 7, 2024 · Updated 2 years ago
- ☆23 · Oct 10, 2025 · Updated 6 months ago
- A language for video analytics ☆12 · Jan 26, 2023 · Updated 3 years ago
- [ICML'25] Official code for paper "Occult: Optimizing Collaborative Communication across Experts for Accelerated Parallel MoE Training an… ☆13 · Apr 17, 2025 · Updated last year
- An experimental framework for temporal verification based on first-order linear-time temporal logic. Our goal is to express transition sy… ☆22 · Mar 29, 2026 · Updated last month
- High-performance GEMM implementation optimized for NVIDIA H100 GPUs, leveraging Hopper architecture's TMA, WGMMA, and Thread Block Cluste… ☆10 · Dec 4, 2024 · Updated last year
- Tutorial Exercises and Code for GPU Communications Tutorial at HOT Interconnects 2025 ☆31 · Oct 22, 2025 · Updated 6 months ago
- ☆21 · Jun 9, 2025 · Updated 10 months ago
- ☆11 · Mar 15, 2026 · Updated last month
- ☆89 · Oct 17, 2025 · Updated 6 months ago
- [NeurIPS 2025] Official Implementation of ViSpec: Accelerating Vision-Language Models with Vision-Aware Speculative Decoding. ☆53 · Jan 28, 2026 · Updated 3 months ago
- Kubernetes operator for local LLM inference with llama.cpp, vLLM, and TGI - multi-GPU, autoscaling, air-gapped, production-ready ☆69 · Updated this week
- ☆20 · Apr 25, 2026 · Updated last week
- Code Repository for the NeurIPS 2024 Paper "Toward Efficient Inference for Mixture of Experts". ☆19 · Oct 30, 2024 · Updated last year
- Differentiable non-uniform interpolation: https://arxiv.org/abs/2012.13257 ☆11 · Oct 3, 2021 · Updated 4 years ago
- ☆60 · May 4, 2024 · Updated 2 years ago
- Simulation tool for CDN replication in large low-earth orbit satellite access networks. ☆13 · May 17, 2021 · Updated 4 years ago
- Visualize expert firing frequencies across sentences in the Mixtral MoE model ☆18 · Dec 22, 2023 · Updated 2 years ago
- ☆18 · Jan 27, 2025 · Updated last year
- ☆35 · Jul 21, 2025 · Updated 9 months ago
- A Python program that simulates a satellite network using pygame, allowing users to create, configure, and visualize the network state ov… ☆11 · Apr 25, 2023 · Updated 3 years ago
- LEO Satellite vs. Cellular Networks: Exploring the Potential for Synergistic Integration (CoNEXT '23) ☆11 · Oct 26, 2023 · Updated 2 years ago
- This is the code for paper "AIHO: Enhancing Task Offloading and Reducing Latency in Serverless Multi-Edge-to-Cloud Systems". ☆12 · Feb 3, 2024 · Updated 2 years ago
- ☆36 · Feb 11, 2025 · Updated last year
- CLI for creating github gists ☆14 · Apr 20, 2017 · Updated 9 years ago
- [NeurIPS 2024] Can Language Models Learn to Skip Steps? ☆22 · Jan 25, 2025 · Updated last year
- Data and code to replicate results from "Single-blind validation of space-based point-source methane emissions detection and quantificati… ☆13 · Mar 3, 2023 · Updated 3 years ago
- UI for extracting data from pdf files using watsonx prompts ☆12 · Sep 18, 2025 · Updated 7 months ago
- ☆13 · Feb 16, 2023 · Updated 3 years ago
- Adaptive Message Quantization and Parallelization for Distributed Full-graph GNN Training ☆24 · Mar 1, 2024 · Updated 2 years ago
- Artifacts Release: A Case for Stateless Mobile Core Network Functions in Space ☆16 · Aug 16, 2022 · Updated 3 years ago
- Explore Inter-layer Expert Affinity in MoE Model Inference ☆16 · May 6, 2024 · Updated last year
- Spankchain POC implementation of generalized state channels ☆20 · Feb 27, 2018 · Updated 8 years ago
- Some tools for DN42 MeowNetwork ☆11 · Apr 3, 2025 · Updated last year
- ☆81 · Sep 15, 2025 · Updated 7 months ago
- A comprehensive and accurate emulation of Bitcoin network implementation ☆14 · Nov 1, 2022 · Updated 3 years ago
- 2023/12/22 电三 420 weekly meeting tech talk: slides and attachments for "Containers" ☆10 · Dec 22, 2023 · Updated 2 years ago
- An open-source ML system course ☆36 · Mar 18, 2025 · Updated last year
- Astrape: Anonymous Payment Channels with Boring Cryptography (extended version) ☆13 · Apr 9, 2022 · Updated 4 years ago