☆26Mar 14, 2024Updated 2 years ago
Alternatives and similar repositories for specinfer-ae
Users that are interested in specinfer-ae are comparing it to the libraries listed below.
- Multi-Candidate Speculative Decoding☆40Apr 22, 2024Updated last year
- ☆15Aug 19, 2024Updated last year
- ☆35Nov 28, 2024Updated last year
- ☆28May 24, 2025Updated 10 months ago
- ☆14Jun 4, 2024Updated last year
- Fast inference from large language models via speculative decoding☆904Aug 22, 2024Updated last year
- ☆19May 10, 2025Updated 10 months ago
- Residual vector quantization for KV cache compression in large language models☆12Oct 22, 2024Updated last year
- PipeInfer: Accelerating LLM Inference using Asynchronous Pipelined Speculation☆32Nov 16, 2024Updated last year
- ☆17Feb 13, 2021Updated 5 years ago
- ☆46Nov 10, 2023Updated 2 years ago
- [ACL 2025 main] FR-Spec: Frequency-Ranked Speculative Sampling☆54Jul 15, 2025Updated 8 months ago
- ☆15Apr 11, 2024Updated last year
- [ICLR 2025] SWIFT: On-the-Fly Self-Speculative Decoding for LLM Inference Acceleration☆65Feb 21, 2025Updated last year
- Nanjing University Problem Solving (IV) course project "Overcooked"☆15Aug 3, 2024Updated last year
- Official repository for the paper DynaPipe: Optimizing Multi-task Training through Dynamic Pipelines☆19Dec 8, 2023Updated 2 years ago
- ☆13Jan 28, 2026Updated 2 months ago
- ☆10Sep 2, 2023Updated 2 years ago
- Spec-Bench: A Comprehensive Benchmark and Unified Evaluation Platform for Speculative Decoding (ACL 2024 Findings)☆376Apr 22, 2025Updated 11 months ago
- DLL injection tool☆12Nov 9, 2020Updated 5 years ago
- [ICLR 2024] Jaiswal, A., Gan, Z., Du, X., Zhang, B., Wang, Z., & Yang, Y. Compressing llms: The truth is rarely pure and never simple.☆27Apr 21, 2025Updated 11 months ago
- Custom Python Scheduler for Kubernetes☆15Jan 25, 2020Updated 6 years ago
- A shared collection of courseware, papers, books, OJ websites, and exercises on competitive programming and data structures.☆14May 21, 2020Updated 5 years ago
- ☆37Jan 20, 2022Updated 4 years ago
- Google DeepMind: Mixture of Depths Unofficial Implementation.☆12May 29, 2024Updated last year
- Neural Network Quantization With Fractional Bit-widths☆11Feb 19, 2021Updated 5 years ago
- [TVLSI 2025] ACiM Inference Simulation Framework in "ASiM: Modeling and Analyzing Inference Accuracy of SRAM-Based Analog CiM Circuits"☆27Sep 9, 2025Updated 6 months ago
- ☆10Sep 26, 2024Updated last year
- Code associated with the paper **Draft & Verify: Lossless Large Language Model Acceleration via Self-Speculative Decoding**☆221Feb 13, 2025Updated last year
- ☆20Dec 24, 2024Updated last year
- Eyeriss chip simulator☆39Mar 6, 2020Updated 6 years ago
- Simulation, multi-path estimation, and CBR parsing code of SIGCOMM2023 BeamSense CBR-Sensing☆10Jan 14, 2024Updated 2 years ago
- [ICML 2024] Serving LLMs on heterogeneous decentralized clusters.☆35May 6, 2024Updated last year
- ☆15Jun 26, 2024Updated last year
- Llama causal LM fully recreated in LibTorch. Designed to be used in Unreal Engine 5☆16Sep 19, 2024Updated last year
- Unreal Engine 5 3D Platformer game prototype☆17May 27, 2024Updated last year
- EECS 151/251A FPGA Project Skeleton for Spring 2020☆12May 6, 2020Updated 5 years ago
- ☆16Nov 14, 2022Updated 3 years ago
- LongSpec: Long-Context Lossless Speculative Decoding with Efficient Drafting and Verification☆76Jul 14, 2025Updated 8 months ago