PKU-DAIR / Hetu-Galvatron
Galvatron is an automatic distributed training system designed for Transformer models, including Large Language Models (LLMs).
☆176Updated last week
Alternatives and similar repositories for Hetu-Galvatron
Users who are interested in Hetu-Galvatron are comparing it to the libraries listed below.
- [ICLR 2025] Train Small, Infer Large: Memory-Efficient LoRA Training for Large Language Models☆72Updated 9 months ago
- [ICML 2024] JSQ: Compressing Large Language Models by Joint Sparsification and Quantization☆145Updated last year
- Official code for ACL2025 "🔍 Retrieval Models Aren’t Tool-Savvy: Benchmarking Tool Retrieval for Large Language Models"☆210Updated last month
- Adaptive Draft-Verification for Efficient Large Language Model Decoding (AAAI 2025 Oral)☆71Updated 9 months ago
- Trainable fast and memory-efficient sparse attention☆516Updated last week
- [ICLR 2025] MoE++: Accelerating Mixture-of-Experts Methods with Zero-Computation Experts☆263Updated last year
- Reasoning-Table: Exploring Reinforcement Learning for Table Reasoning☆129Updated 7 months ago
- Official repo for 'Large Multimodal Models Evaluation: A Survey'☆97Updated last month
- ☆117Updated 7 months ago
- ☆176Updated 9 months ago
- Official Repo for WWW 2025 paper "Tool Learning in the Wild: Empowering Language Models as Automatic Tool Agents"☆192Updated 9 months ago
- Alchemy Cat —— 🔥Config System for SOTA☆111Updated last month
- ☆341Updated 2 years ago
- [ICLR 2025] Tool-Planner: Task Planning with Clusters across Multiple Tools☆115Updated 8 months ago
- Python-based Electronic-Photonic Integrated System Architecture Modeling and Evaluation Framework (DAC 2025)☆136Updated last week
- [NeurIPS'24] "Membership Inference Attacks against Fine-tuned Large Language Models via Self-prompt Calibration"☆200Updated 10 months ago
- ai_developer is an AI-driven software engineer that turns a single-line requirement into a fully functional project.☆58Updated 11 months ago
- ARIES (ArXiv Research Intelligent Efficient Summary)☆73Updated last year
- NeoBERT is an advanced model designed specifically for predicting the binding affinity between neoantigens and HLA. It is a variant of th…☆156Updated last year
- EasyDeploy is engineered to provide users with end-to-end deployment capabilities for large-scale models.☆132Updated 9 months ago
- Navigating Model Phase Transitions to Enable Extreme Lossless Compression: A Perspective☆75Updated 5 months ago
- ☆194Updated last year
- Official code for paper "Learning to Use Tools via Cooperative and Interactive Agents"☆233Updated last year
- Logic for application☆31Updated last year
- A blog theme built on Typecho's default theme Replica, aiming to improve the reading experience on a simple, modern foundation.☆117Updated this week
- Official Repository of paper MM-HELIX: Boosting Multimodal Long-Chain Reflective Reasoning with Holistic Platform and Adaptive Hybrid Pol…☆80Updated 3 months ago
- a theme for typora☆43Updated 10 months ago
- A simple and elegant theme for Gridea (inspired by hexo-theme-matery).☆61Updated 10 months ago
- Official Repo for AAAI 2024 paper "Confucius: Iterative Tool Learning from Introspection Feedback by Easy-to-Difficult Curriculum"☆158Updated 10 months ago
- FastSFile is a Python-based command-line file transfer tool that makes it easy to download files on a phone or other devices.☆44Updated 11 months ago