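Experimentally compares throughput in the Prefill and Decoding phases of LLM inference, exposes the performance bottleneck, and explains the principle behind the PD (Prefill/Decode) separation optimization. Includes test scripts for CUDA and Apple MPS (M-series chips).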
☆21 · May 22, 2025 · Updated 11 months ago
Alternatives and similar repositories for LLM-Prefill-Decode-Benchmark
Users that are interested in LLM-Prefill-Decode-Benchmark are comparing it to the libraries listed below.
- Official code for Guiding Language Model Math Reasoning with Planning Tokens ☆19 · Feb 29, 2024 · Updated 2 years ago
- Official PyTorch implementation for the ICML 2023 paper "Out-of-Distribution Generalization of Federated Learning via Implicit Invariant … ☆13 · Oct 31, 2023 · Updated 2 years ago
- original 8bit CPU of ICF3-Z ☆12 · Feb 20, 2020 · Updated 6 years ago
- https://bbuf.github.io/gpu-glossary-zh/ ☆27 · Nov 7, 2025 · Updated 5 months ago
- ☆15 · May 18, 2023 · Updated 2 years ago
- ☆11 · Nov 19, 2024 · Updated last year
- ☆14 · Feb 12, 2024 · Updated 2 years ago
- Made a CPU in Logisim when I was 14 (2009), and wrote a naive assembler and compiler for it in Flash. The CPU's design is inspired by Don… ☆10 · Sep 30, 2016 · Updated 9 years ago
- An Empirical Study of Memorization in NLP (ACL 2022) ☆13 · Jun 22, 2022 · Updated 3 years ago
- 👩‍💻 Code for the ACL paper "Detecting Edit Failures in LLMs: An Improved Specificity Benchmark" ☆20 · Jan 19, 2024 · Updated 2 years ago
- ROCm Driver RDMA Peer to Peer Support ☆22 · Mar 21, 2019 · Updated 7 years ago
- Code for "Interpreting Word Embeddings with Eigenvector Analysis" https://openreview.net/forum?id=rJfJiR5ooX. ☆16 · Oct 16, 2019 · Updated 6 years ago
- Distributed SDDMM Kernel ☆12 · Jul 8, 2022 · Updated 3 years ago
- Minimal FPGA Processor Core for Stack-based CPU for CPLDs Using Bit-Serial Architecture ☆18 · Sep 6, 2013 · Updated 12 years ago
- ☆20 · Apr 18, 2024 · Updated 2 years ago
- a simple new ISA nnISA and nnSOC nnCPU nnAs nnCc ☆10 · Mar 15, 2020 · Updated 6 years ago
- Beijing Institute of Technology, senior-year short term: Computer Organization Principles portion ☆10 · Sep 24, 2020 · Updated 5 years ago
- ☆11 · Dec 31, 2020 · Updated 5 years ago
- ☆12 · Mar 31, 2021 · Updated 5 years ago
- Convex Formulation of Multiple Instance Learning from Positive and Unlabeled Bags ☆10 · Apr 28, 2018 · Updated 8 years ago
- Unifew: Unified Fewshot Learning Model ☆18 · Sep 10, 2021 · Updated 4 years ago
- ☆12 · Jun 3, 2019 · Updated 6 years ago
- Some microbenchmarks and design docs before commencement ☆11 · Feb 1, 2021 · Updated 5 years ago
- ☆17 · Apr 15, 2025 · Updated last year
- Repo for "AlphaResearch: Accelerating New Algorithm Discovery with Language Models" ☆55 · Nov 12, 2025 · Updated 5 months ago
- [AAAI 2025] Assessing the Creativity of LLMs in Proposing Novel Solutions to Mathematical Problems ☆13 · May 5, 2025 · Updated last year
- OK:iCE40Pro, based on iCE40up5k FPGA Core, is an Open Source Educational FPGA GamePad Console. ☆15 · May 6, 2023 · Updated 2 years ago
- A feature-incomplete peekahole (pahole) clone that doesn't rely on libdwarves (and doesn't choke on Clang output) ☆24 · Oct 23, 2017 · Updated 8 years ago
- Large-scale Exploration of Neural Relation Classification Architectures ☆12 · Nov 15, 2018 · Updated 7 years ago
- From Llama to Deepseek, grpo/mtp implemented. With pt/sft/lora/qlora included ☆30 · Apr 21, 2025 · Updated last year
- CUDA C simple application for Nvidia's GPU ☆11 · Jun 7, 2022 · Updated 3 years ago
- Code for modeling attention network for distant supervised relation extraction (CoNLL 2019). ☆15 · Feb 28, 2020 · Updated 6 years ago
- Source code for Findings of EMNLP 2021 paper "Keyphrase Generation with Fine-Grained Evaluation-Guided Reinforcement Learning" ☆13 · Nov 9, 2021 · Updated 4 years ago
- Very simple Cortex-M1 SoC design based on ARM DesignStart ☆18 · Jan 25, 2022 · Updated 4 years ago
- A dev board based on RaspberryPi RP2040 MCU ☆23 · Oct 27, 2021 · Updated 4 years ago
- A wishbone controlled FM transmitter hack ☆24 · Jan 16, 2024 · Updated 2 years ago
- Deep Learning model to tackle the Fake News Challenge ☆13 · Nov 6, 2018 · Updated 7 years ago
- 🔥Keywords and URLs Censored on the Chinese Internet ☆13 · Feb 22, 2020 · Updated 6 years ago
- A list of research resources that I've appreciated. ☆12 · Dec 10, 2019 · Updated 6 years ago