PKU-DAIR / Hetu-Galvatron
Galvatron is an automatic distributed training system designed for Transformer models, including Large Language Models (LLMs).
☆174 · Updated 3 weeks ago
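To make the description above concrete, the sketch below illustrates the general idea behind "automatic distributed training": searching over hybrid data/tensor/pipeline parallelism degrees under a memory budget and a simple cost model. This is a minimal, self-contained illustration; the GPU count, memory figures, and cost model are assumptions, and it does not reflect Galvatron's actual API or search algorithm.

```python
# Illustrative sketch only (not Galvatron's API): brute-force search for a
# hybrid parallelism strategy using a toy cost model and memory constraint.
from itertools import product

WORLD_SIZE = 8          # total number of GPUs (assumed)
MODEL_MEM_GB = 80.0     # memory for model states in GB (assumed)
GPU_MEM_GB = 40.0       # per-GPU memory budget in GB (assumed)

def candidate_strategies(world_size):
    """Yield (data, tensor, pipeline) parallel degrees whose product uses all GPUs."""
    for dp, tp, pp in product([1, 2, 4, 8], repeat=3):
        if dp * tp * pp == world_size:
            yield dp, tp, pp

def fits_memory(dp, tp, pp):
    """Assume model states are sharded across tensor- and pipeline-parallel groups."""
    return MODEL_MEM_GB / (tp * pp) <= GPU_MEM_GB

def estimated_step_time(dp, tp, pp):
    """Toy cost model: compute time shrinks with parallelism, communication grows."""
    compute = 1.0 / (dp * tp * pp)
    comm = 0.05 * tp + 0.02 * pp + 0.01 * dp
    return compute + comm

best = min(
    (s for s in candidate_strategies(WORLD_SIZE) if fits_memory(*s)),
    key=lambda s: estimated_step_time(*s),
)
print(f"chosen (dp, tp, pp) = {best}")
```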
Alternatives and similar repositories for Hetu-Galvatron
Users interested in Hetu-Galvatron are comparing it to the libraries listed below.
- [ICLR 2025] Train Small, Infer Large: Memory-Efficient LoRA Training for Large Language Models ☆72 · Updated 9 months ago
- [ICML 2024] JSQ: Compressing Large Language Models by Joint Sparsification and Quantization ☆145 · Updated last year
- [ICLR 2025] MoE++: Accelerating Mixture-of-Experts Methods with Zero-Computation Experts ☆261 · Updated last year
- Official code for ACL 2025 "🔍 Retrieval Models Aren’t Tool-Savvy: Benchmarking Tool Retrieval for Large Language Models" ☆207 · Updated 2 weeks ago
- Trainable fast and memory-efficient sparse attention ☆507 · Updated last week
- Adaptive Draft-Verification for Efficient Large Language Model Decoding (AAAI 2025 Oral) ☆69 · Updated 9 months ago
- ☆340 · Updated 2 years ago
- Official Repo for WWW 2025 paper "Tool Learning in the Wild: Empowering Language Models as Automatic Tool Agents" ☆192 · Updated 8 months ago
- ☆177 · Updated 8 months ago
- ☆116 · Updated 7 months ago
- Reasoning-Table: Exploring Reinforcement Learning for Table Reasoning ☆132 · Updated 7 months ago
- ☆75 · Updated 2 months ago
- [NeurIPS'24] "Membership Inference Attacks against Fine-tuned Large Language Models via Self-prompt Calibration"