SparkJiao / llama-pipeline-parallelLinks
A prototype repo for hybrid training of pipeline parallel and distributed data parallel with comments on core code snippets. Feel free to copy code and launch discussions about the problems you have encoured.
☆56Updated 2 years ago
Alternatives and similar repositories for llama-pipeline-parallel
Users that are interested in llama-pipeline-parallel are comparing it to the libraries listed below
Sorting:
- [ICLR 2024] CLEX: Continuous Length Extrapolation for Large Language Models☆78Updated last year
- [ACL 2024] Long-Context Language Modeling with Parallel Encodings☆162Updated last year
- [EMNLP 2024] LongAlign: A Recipe for Long Context Alignment of LLMs☆254Updated 9 months ago
- ☆104Updated 2 months ago
- [ICLR 2025] 🧬 RegMix: Data Mixture as Regression for Language Model Pre-training (Spotlight)☆168Updated 7 months ago
- Train llm (bloom, llama, baichuan2-7b, chatglm3-6b) with deepspeed pipeline mode. Faster than zero/zero++/fsdp.☆98Updated last year
- [ACL 2025, Main Conference, Oral] Intuitive Fine-Tuning: Towards Simplifying Alignment into a Single Process☆30Updated last year
- ☆114Updated last year
- [ICML 2024] Selecting High-Quality Data for Training Language Models☆187Updated last year
- [ACL'24] Superfiltering: Weak-to-Strong Data Filtering for Fast Instruction-Tuning☆174Updated 2 months ago
- Counting-Stars (★)☆83Updated 3 months ago
- ACL 2024 | LooGLE: Long Context Evaluation for Long-Context Language Models☆184Updated 11 months ago
- [ACL 2024] MT-Bench-101: A Fine-Grained Benchmark for Evaluating Large Language Models in Multi-Turn Dialogues☆112Updated last year
- Homepage for ProLong (Princeton long-context language models) and paper "How to Train Long-Context Language Models (Effectively)"☆225Updated this week
- [ICML'24] The official implementation of “Rethinking Optimization and Architecture for Tiny Language Models”☆123Updated 8 months ago
- Codes and Data for Scaling Relationship on Learning Mathematical Reasoning with Large Language Models☆266Updated last year
- Rectified Rotary Position Embeddings☆381Updated last year
- ☆104Updated 2 years ago
- train llama on a single A100 80G node using 🤗 transformers and 🚀 Deepspeed Pipeline Parallelism☆224Updated last year
- InsTag: A Tool for Data Analysis in LLM Supervised Fine-tuning☆272Updated 2 years ago
- Implementation of NAACL 2024 Outstanding Paper "LM-Infinite: Simple On-the-Fly Length Generalization for Large Language Models"☆149Updated 6 months ago
- A repository sharing the literatures about long-context large language models, including the methodologies and the evaluation benchmarks☆266Updated last year
- LongRecipe: Recipe for Efficient Long Context Generalization in Large Language Models☆76Updated 11 months ago
- [ACL 2024] The official codebase for the paper "Self-Distillation Bridges Distribution Gap in Language Model Fine-tuning".☆129Updated 10 months ago
- Repository of LV-Eval Benchmark☆70Updated last year
- [ICML'2024] Can AI Assistants Know What They Don't Know?☆83Updated last year
- Codes for the paper "∞Bench: Extending Long Context Evaluation Beyond 100K Tokens": https://arxiv.org/abs/2402.13718☆345Updated 11 months ago
- [ACL-25] We introduce ScaleQuest, a scalable, novel and cost-effective data synthesis method to unleash the reasoning capability of LLMs.☆64Updated 10 months ago
- Unofficial implementation of AlpaGasus☆92Updated last year
- code for Scaling Laws of RoPE-based Extrapolation☆73Updated last year