SparkJiao / llama-pipeline-parallel
A prototype repo for hybrid pipeline-parallel and distributed data-parallel training, with comments on the core code snippets. Feel free to copy the code and open discussions about any problems you have encountered.
☆45 Updated last year
Related projects:
- ☆87 Updated 4 months ago
- [ACL 2024] Long-Context Language Modeling with Parallel Encodings ☆133 Updated 3 months ago
- [ICLR 2024] CLEX: Continuous Length Extrapolation for Large Language Models ☆72 Updated 6 months ago
- [ACL'24] Superfiltering: Weak-to-Strong Data Filtering for Fast Instruction-Tuning ☆101 Updated last week
- 🧬 RegMix: Data Mixture as Regression for Language Model Pre-training ☆79 Updated this week
- [SIGIR'24] The official implementation code of MOELoRA. ☆113 Updated last month
- LongAlign: A Recipe for Long Context Alignment Encompassing Data, Training, and Evaluation ☆194 Updated 4 months ago
- ☆71 Updated 8 months ago
- Train llm (bloom, llama, baichuan2-7b, chatglm3-6b) with deepspeed pipeline mode. Faster than zero/zero++/fsdp. ☆90 Updated 7 months ago
- Towards Systematic Measurement for Long Text Quality ☆27 Updated 2 weeks ago
- Official PyTorch implementation of DistiLLM: Towards Streamlined Distillation for Large Language Models (ICML 2024) ☆106 Updated this week
- CLongEval: A Chinese Benchmark for Evaluating Long-Context Large Language Models ☆37 Updated 6 months ago
- An Experiment on Dynamic NTK Scaling RoPE ☆59 Updated 9 months ago
- [ACL 2024] MT-Bench-101: A Fine-Grained Benchmark for Evaluating Large Language Models in Multi-Turn Dialogues ☆38 Updated last month
- [ICML 2024] Selecting High-Quality Data for Training Language Models ☆134 Updated 2 months ago
- Implementations of online merging optimizers proposed by Online Merging Optimizers for Boosting Rewards and Mitigating Tax in Alignment ☆60 Updated 3 months ago
- Counting-Stars (★) ☆70 Updated 3 weeks ago
- [ICML'24] The official implementation of “Rethinking Optimization and Architecture for Tiny Language Models” ☆114 Updated 2 months ago
- Unofficial implementation of AlpaGasus ☆83 Updated 11 months ago
- ☆82 Updated 5 months ago
- A simple toolkit for benchmarking LLMs on mathematical reasoning tasks. 🧮✨ ☆72 Updated 4 months ago
- code for Scaling Laws of RoPE-based Extrapolation ☆68 Updated 11 months ago
- Source code of "Reasons to Reject? Aligning Language Models with Judgments" ☆54 Updated 6 months ago
- [ICML'2024] Can AI Assistants Know What They Don't Know? ☆62 Updated 7 months ago
- ☆109 Updated 5 months ago
- 🐋 An unofficial implementation of Self-Alignment with Instruction Backtranslation. ☆128 Updated 2 months ago
- ☆32 Updated 3 months ago
- Fantastic Data Engineering for Large Language Models ☆38 Updated last month
- Curation of resources for LLM mathematical reasoning, most of which are screened by @tongyx361 to ensure high quality and accompanied wit… ☆61 Updated 2 months ago
- ☆125 Updated this week