HFAiLab / hfai-models
HFAI deep learning models
☆148 · Updated last year
Alternatives and similar repositories for hfai-models:
Users interested in hfai-models are comparing it to the libraries listed below.
- A flexible and efficient training framework for large-scale alignment tasks ☆333 · Updated last month
- Skywork-MoE: A Deep Dive into Training Techniques for Mixture-of-Experts Language Models ☆130 · Updated 9 months ago
- ☆78 · Updated last year
- A high-performance deep learning training platform with task-level time-sharing scheduling of GPU compute ☆611 · Updated last year
- Mixture-of-Experts (MoE) Language Model ☆185 · Updated 6 months ago
- A visualization tool for deeper understanding and easier debugging of RLHF training ☆180 · Updated last month
- FireFlyer Record file format, writer and reader for DL training samples ☆206 · Updated 2 years ago
- ☆143 · Updated 2 weeks ago
- Super-Efficient RLHF Training of LLMs with Parameter Reallocation ☆255 · Updated 2 months ago
- ☆214 · Updated last year
- USP: Unified (a.k.a. Hybrid, 2D) Sequence Parallel Attention for Long-Context Transformer Model Training and Inference ☆463 · Updated 2 weeks ago
- A Telegram bot to recommend arXiv papers ☆261 · Updated last month
- LiBai(李白): A Toolbox for Large-Scale Distributed Parallel Training ☆398 · Updated last week
- PyTorch bindings for CUTLASS grouped GEMM ☆111 · Updated 3 months ago
- 🐳 Efficient Triton implementations for "Native Sparse Attention: Hardware-Aligned and Natively Trainable Sparse Attention" ☆601 · Updated 2 weeks ago
- ☆105 · Updated 4 months ago
- RLHF experiments on a single A100 40G GPU. Supports PPO, GRPO, REINFORCE, RAFT, RLOO, ReMax, and DeepSeek R1-Zero reproduction ☆50 · Updated last month
- Tests of different distributed-training methods on High-Flyer AIHPC ☆24 · Updated 2 years ago
- Distributed RL System for LLM Reasoning ☆201 · Updated 3 weeks ago
- Easy Parallel Library (EPL) is a general and efficient deep learning framework for distributed model training ☆267 · Updated 2 years ago
- FlagScale is a large-model toolkit built on open-source projects ☆257 · Updated last week
- Exploring the Limit of Outcome Reward for Learning Mathematical Reasoning ☆161 · Updated 2 weeks ago
- ☆128 · Updated 3 weeks ago
- ☆29 · Updated 7 months ago
- Ring attention implementation with flash attention ☆721 · Updated last month
- [ICLR 2025] COAT: Compressing Optimizer States and Activations for Memory-Efficient FP8 Training ☆170 · Updated last week
- InternEvo is an open-source lightweight training framework that aims to support model pre-training without extensive dependencies ☆370 · Updated last week
- The RedStone repository includes code for preparing extensive datasets used in training large language models ☆125 · Updated last month
- Efficient AI Inference & Serving ☆469 · Updated last year
- DeepSeek Native Sparse Attention PyTorch implementation ☆54 · Updated last month