WesKwong / FLMMS
Federated Learning Multi-Machine Simulator: A Docker-based federated learning framework for simulating multi-machine training
☆9 · Updated last year
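The core loop that a multi-machine federated learning simulator like FLMMS orchestrates can be sketched as federated averaging (FedAvg): each simulated machine trains locally on its own data, and a server averages the resulting weights. This is a minimal toy sketch, not FLMMS's actual API; the function names and the scalar model are illustrative assumptions.

```python
# Hedged sketch of a FedAvg round, the pattern a multi-machine FL
# simulator coordinates. Names and the toy model y = w * x are
# assumptions for illustration, not taken from FLMMS.
import random

def local_train(w, data, lr=0.1, steps=10):
    """One client's local SGD on the scalar model y = w * x (squared loss)."""
    for _ in range(steps):
        x, y = random.choice(data)
        grad = 2 * (w * x - y) * x  # d/dw of (w*x - y)^2
        w -= lr * grad
    return w

def fed_avg(global_w, client_datasets, rounds=20):
    """Server loop: broadcast weights, train on each client, average results."""
    for _ in range(rounds):
        local_ws = [local_train(global_w, d) for d in client_datasets]
        global_w = sum(local_ws) / len(local_ws)  # FedAvg aggregation
    return global_w

random.seed(0)
# Three simulated "machines", each holding samples from y = 3 * x.
clients = [[(x, 3 * x) for x in (1, 2, 3)] for _ in range(3)]
w = fed_avg(0.0, clients)
```

In a Docker-based simulator each client would run in its own container and exchange weights over the network; here the clients are plain function calls, which is the essence of simulating multi-machine training on one host.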
Alternatives and similar repositories for FLMMS
Users interested in FLMMS are comparing it to the repositories listed below.
- ☆12 · Updated 4 months ago
- My personal solutions, study notes, and takeaways for the MIT 6.5940 course assignments. ☆14 · Updated last year
- Paper list, tutorials, and nano code snippets for Diffusion Large Language Models. ☆75 · Updated this week
- Paper reading and discussion notes, covering AI frameworks, distributed systems, cluster management, etc. ☆13 · Updated 3 months ago
- Official PyTorch implementation of the paper "dLLM-Cache: Accelerating Diffusion Large Language Models with Adaptive Caching" (dLLM-Cache… ☆113 · Updated last week
- PoC for "SpecReason: Fast and Accurate Inference-Time Compute via Speculative Reasoning" [arXiv '25]. ☆39 · Updated last month
- A tiny web app for rating papers. ☆38 · Updated 3 months ago
- All-in-one benchmarking platform for evaluating LLMs. ☆15 · Updated this week
- ☆92 · Updated 3 years ago
- This is the official Python version of CoreInfer: Accelerating Large Language Model Inference with Semantics-Inspired Adaptive Sparse Act… ☆16 · Updated 8 months ago
- Efficient 2:4 sparse training algorithms and implementations. ☆54 · Updated 6 months ago
- ☆52 · Updated 6 months ago
- ☆20 · Updated last month
- Official implementation of "Fast-dLLM: Training-free Acceleration of Diffusion LLM by Enabling KV Cache and Parallel Decoding". ☆252 · Updated this week
- 📚 Collection of awesome generation acceleration resources. ☆275 · Updated 2 months ago
- USTC-Computer Science-Resources. ☆45 · Updated 3 years ago
- XAttention: Block Sparse Attention with Antidiagonal Scoring. ☆166 · Updated this week
- [ICLR 2025🔥] D2O: Dynamic Discriminative Operations for Efficient Long-Context Inference of Large Language Models. ☆16 · Updated 3 months ago
- ☆84 · Updated last month
- A Brief Review for Computer Architecture. ☆19 · Updated 2 months ago
- A sparse attention kernel supporting mixed sparse patterns. ☆238 · Updated 4 months ago
- DuoDecoding: Hardware-aware Heterogeneous Speculative Decoding with Dynamic Multi-Sequence Drafting. ☆15 · Updated 3 months ago
- [ICML'25] Our study systematically investigates massive values in LLMs' attention mechanisms. First, we observe massive values are concen… ☆73 · Updated last week
- Code release for VTW (AAAI 2025) Oral. ☆43 · Updated 5 months ago
- The Official Implementation of Ada-KV: Optimizing KV Cache Eviction by Adaptive Budget Allocation for Efficient LLM Inference. ☆79 · Updated 5 months ago
- [EMNLP 2024 Findings🔥] Official implementation of ": LOOK-M: Look-Once Optimization in KV Cache for Efficient Multimodal Long-Context In… ☆96 · Updated 7 months ago
- ☆41 · Updated 6 months ago
- A brief repo of paper research notes. ☆15 · Updated 9 months ago
- [NeurIPS 2024] The official implementation of ZipCache: Accurate and Efficient KV Cache Quantization with Salient Token Identification. ☆22 · Updated 2 months ago
- [ICML 2024] Unveiling and Harnessing Hidden Attention Sinks: Enhancing Large Language Models without Training through Attention Calibrati… ☆41 · Updated 11 months ago