qingkelab / qingketalk
青稞Talk (Qingke Talk)
☆68 · Updated this week

Alternatives and similar repositories for qingketalk
Users interested in qingketalk are comparing it to the libraries listed below.
- siiRL: Shanghai Innovation Institute RL Framework for Advanced LLMs and Multi-Agent Systems · ☆160 · Updated this week
- [ICLR 2025] COAT: Compressing Optimizer States and Activation for Memory-Efficient FP8 Training · ☆222 · Updated last week
- ☆389 · Updated this week
- qwen-nsa · ☆71 · Updated 4 months ago
- Efficient Mixture of Experts for LLM Paper List · ☆90 · Updated 8 months ago
- mllm-npu: training multimodal large language models on Ascend NPUs · ☆91 · Updated 11 months ago
- Official implementation of the ICML 2024 paper "ExCP: Extreme LLM Checkpoint Compression via Weight-Momentum Joint Shrinking" · ☆48 · Updated last year
- Official implementation of "Fast-dLLM: Training-free Acceleration of Diffusion LLM by Enabling KV Cache and Parallel Decoding" · ☆346 · Updated 2 weeks ago
- ☆41 · Updated 2 months ago
- Code for the paper "FlexPrefill: A Context-Aware Sparse Attention Mechanism for Efficient Long-Sequence Inference" (ICLR 2025 Oral) · ☆130 · Updated 2 months ago
- ☆114 · Updated 2 months ago
- A sparse attention kernel supporting mixed sparse patterns · ☆269 · Updated 6 months ago
- [ICML 2025] XAttention: Block Sparse Attention with Antidiagonal Scoring · ☆218 · Updated last month
- Accelerate LLM preference tuning via prefix sharing with a single line of code · ☆43 · Updated last month
- Official PyTorch implementation of the paper "dLLM-Cache: Accelerating Diffusion Large Language Models with Adaptive Caching" · ☆134 · Updated last week
- ☆78 · Updated 3 months ago
- ☆145 · Updated 5 months ago
- Skywork-MoE: A Deep Dive into Training Techniques for Mixture-of-Experts Language Models · ☆136 · Updated last year
- SeerAttention: Learning Intrinsic Sparse Attention in Your LLMs · ☆145 · Updated last week
- DeepSeek Native Sparse Attention PyTorch implementation · ☆88 · Updated last week
- ☆123 · Updated 2 months ago
- Rethinking RL Scaling for Vision Language Models: A Transparent, From-Scratch Framework and Comprehensive Evaluation Scheme · ☆138 · Updated 4 months ago
- Implementation of FP8/INT8 rollout for RL training without performance drop · ☆83 · Updated this week
- 16-fold memory access reduction with nearly no loss · ☆104 · Updated 4 months ago
- [NeurIPS 2024] Official implementation of ZipCache: Accurate and Efficient KV Cache Quantization with Salient Token Identification · ☆23 · Updated 4 months ago
- ByteCheckpoint: A Unified Checkpointing Library for LFMs · ☆234 · Updated last month
- [ICLR 2025] PEARL: Parallel Speculative Decoding with Adaptive Draft Length · ☆105 · Updated 4 months ago
- ☆141 · Updated last month
- ☆43 · Updated last year
- ☆92 · Updated 4 months ago