SparkJiao / llama-pipeline-parallel
A prototype repo for hybrid training that combines pipeline parallelism with distributed data parallelism, with comments on the core code snippets. Feel free to copy the code and open discussions about any problems you have encountered.
☆55Updated last year
Alternatives and similar repositories for llama-pipeline-parallel:
Users who are interested in llama-pipeline-parallel are comparing it to the repositories listed below.
- [ICLR 2024] CLEX: Continuous Length Extrapolation for Large Language Models☆76Updated last year
- ☆98Updated 7 months ago
- [ICLR 2025] 🧬 RegMix: Data Mixture as Regression for Language Model Pre-training (Spotlight)☆131Updated 2 months ago
- [ACL 2024] Long-Context Language Modeling with Parallel Encodings☆154Updated 10 months ago
- [ACL'24] Superfiltering: Weak-to-Strong Data Filtering for Fast Instruction-Tuning☆149Updated 7 months ago
- Train LLMs (bloom, llama, baichuan2-7b, chatglm3-6b) with DeepSpeed pipeline mode. Faster than ZeRO/ZeRO++/FSDP.☆95Updated last year
- Repository of LV-Eval Benchmark☆65Updated 8 months ago
- [ICML'24] The official implementation of “Rethinking Optimization and Architecture for Tiny Language Models”☆121Updated 3 months ago
- We introduce ScaleQuest, a scalable, novel and cost-effective data synthesis method to unleash the reasoning capability of LLMs.☆62Updated 6 months ago
- [ICML 2024] Selecting High-Quality Data for Training Language Models☆169Updated 10 months ago
- Homepage for ProLong (Princeton long-context language models) and paper "How to Train Long-Context Language Models (Effectively)"☆177Updated last month
- Trinity-RFT is a general-purpose, flexible and scalable framework designed for reinforcement fine-tuning (RFT) of large language models (…☆71Updated this week
- [EMNLP 2024] LongAlign: A Recipe for Long Context Alignment of LLMs☆249Updated 4 months ago
- LongRecipe: Recipe for Efficient Long Context Generalization in Large Language Models☆76Updated 6 months ago
- Code for "Scaling Laws of RoPE-based Extrapolation"☆73Updated last year
- Intuitive Fine-Tuning: Towards Simplifying Alignment into a Single Process☆27Updated 9 months ago
- CLongEval: A Chinese Benchmark for Evaluating Long-Context Large Language Models☆40Updated last year
- ☆18Updated 5 months ago
- ☆63Updated 5 months ago
- ACL 2024 | LooGLE: Long Context Evaluation for Long-Context Language Models☆182Updated 6 months ago
- ☆46Updated 10 months ago
- [ACL 2024] MT-Bench-101: A Fine-Grained Benchmark for Evaluating Large Language Models in Multi-Turn Dialogues☆85Updated 9 months ago
- [NeurIPS 2024] Fast Best-of-N Decoding via Speculative Rejection☆42Updated 6 months ago
- ☆77Updated 2 weeks ago
- [EMNLP 2024] Source code for the paper "Learning Planning-based Reasoning with Trajectory Collection and Process Rewards Synthesizing".☆76Updated 3 months ago
- [ACL 2024] The official codebase for the paper "Self-Distillation Bridges Distribution Gap in Language Model Fine-tuning".☆119Updated 6 months ago
- The official repository of the Omni-MATH benchmark.☆83Updated 4 months ago
- CFBench: A Comprehensive Constraints-Following Benchmark for LLMs☆31Updated 8 months ago
- ☆150Updated 4 months ago
- Repo for the EMNLP'24 Paper "Dual-Space Knowledge Distillation for Large Language Models". A general white-box KD framework for both same…☆49Updated 6 months ago