HFAiLab / hfai-models
HFAI deep learning models
☆99Updated last year
Alternatives and similar repositories for hfai-models:
Users who are interested in hfai-models are comparing it to the libraries listed below
- FireFlyer Record file format, writer and reader for DL training samples.☆121Updated 2 years ago
- ☆76Updated last year
- A Telegram bot to recommend arXiv papers☆221Updated last week
- ☆72Updated 5 months ago
- A visualization tool for deeper understanding and easier debugging of RLHF training.☆106Updated last week
- A MoE impl for PyTorch, [ATC'23] SmartMoE☆61Updated last year
- Super-Efficient RLHF Training of LLMs with Parameter Reallocation☆198Updated last week
- Skywork-MoE: A Deep Dive into Training Techniques for Mixture-of-Experts Language Models☆128Updated 7 months ago
- LiBai(李白): A Toolbox for Large-Scale Distributed Parallel Training☆397Updated 2 months ago
- ATC23 AE☆44Updated last year
- Tutorial for Ray☆17Updated 9 months ago
- ☆95Updated 2 months ago
- A flexible and efficient training framework for large-scale alignment tasks☆276Updated this week
- A collection of phenomena observed during the scaling of big foundation models, which may be developed into consensus, principles, or l…☆276Updated last year
- GPT-Fathom is an open-source and reproducible LLM evaluation suite, benchmarking 10+ leading open-source and closed-source LLMs as well a…☆348Updated 9 months ago
- ☆61Updated 7 months ago
- USP: Unified (a.k.a. Hybrid, 2D) Sequence Parallel Attention for Long Context Transformers Model Training and Inference☆403Updated 3 weeks ago
- Forest-of-Thought: Scaling Test-Time Compute for Enhancing LLM Reasoning☆14Updated last week
- Related works and background techniques for OpenAI o1☆192Updated last week
- ☆92Updated 9 months ago
- Odysseus: Playground of LLM Sequence Parallelism☆64Updated 7 months ago
- ☆50Updated last month
- Official implementation of ICML 2024 paper "ExCP: Extreme LLM Checkpoint Compression via Weight-Momentum Joint Shrinking".☆44Updated 6 months ago
- Compare different hardware platforms via the Roofline Model for LLM inference tasks.☆88Updated 10 months ago
- AI Alignment: A Comprehensive Survey☆133Updated last year
- ☆317Updated 6 months ago
- A prototype repo for hybrid training of pipeline parallel and distributed data parallel with comments on core code snippets. Feel free to…☆53Updated last year
- Low-bit optimizers for PyTorch☆125Updated last year
- Code for a New Loss for Mitigating the Bias of Learning Difficulties in Generative Language Models☆57Updated 7 months ago
- A unified tokenization tool for Images, Chinese and English.☆151Updated last year