crazycth / WizardLearner
Pretrain, decay, and SFT a CodeLLM from scratch 🧙‍♂️
☆32Updated 5 months ago
Related projects
Alternatives and complementary repositories for WizardLearner
- Feeling confused about super alignment? Here is a reading list☆43Updated 10 months ago
- Chinese large language model evaluation, phase 3☆24Updated 5 months ago
- SELF-GUIDE: Better Task-Specific Instruction Following via Self-Synthetic Finetuning. COLM 2024 Accepted Paper☆25Updated 5 months ago
- ☆78Updated 6 months ago
- Skywork-MoE: A Deep Dive into Training Techniques for Mixture-of-Experts Language Models☆127Updated 4 months ago
- ☆37Updated 4 months ago
- Reformatted Alignment☆112Updated last month
- A curated reading list for large language model (LLM) alignment. Take a look at our new survey "Large Language Model Alignment: A Survey"…☆71Updated last year
- ☆89Updated 7 months ago
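- This project covers dataset synthesis, model training, and evaluation for the mathematical problem-solving ability of large models, along with notes on related articles.☆52Updated last month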
- Unleashing the Power of Cognitive Dynamics on Large Language Models☆59Updated last month
- Train an LLM from scratch on a single 24 GB GPU☆49Updated 2 weeks ago
- ☆77Updated last month
- Hammer: Robust Function-Calling for On-Device Language Models via Function Masking☆30Updated last month
- [ACL'24] Superfiltering: Weak-to-Strong Data Filtering for Fast Instruction-Tuning☆123Updated 2 months ago
- [SIGIR'24] The official implementation code of MOELoRA.☆123Updated 3 months ago
- ☆50Updated 3 weeks ago
- ☆119Updated 8 months ago
- [ACL 2024] The official codebase for the paper "Self-Distillation Bridges Distribution Gap in Language Model Fine-tuning".☆96Updated last week
- Official completion of “Training on the Benchmark Is Not All You Need”.☆26Updated last month
- ToolEyes: Fine-Grained Evaluation for Tool Learning Capabilities of Large Language Models in Real-world Scenarios☆62Updated 6 months ago
- This is a personal reimplementation of Google's Infini-transformer, utilizing a small 2B model. The project includes both model and train…☆52Updated 6 months ago
- ☆118Updated 6 months ago
- Fantastic Data Engineering for Large Language Models☆49Updated 3 months ago
- The official GitHub page for the survey paper "A Survey on Data Augmentation in Large Model Era"☆108Updated 3 months ago
- Official Repo for "Programming Every Example: Lifting Pre-training Data Quality Like Experts at Scale"☆190Updated 3 weeks ago
- ☆70Updated 10 months ago
- Llama-3-SynE: A Significantly Enhanced Version of Llama-3 with Advanced Scientific Reasoning and Chinese Language Capabilities | Continual pre-training to improve …☆27Updated 2 months ago
- [NeurIPS 2024 Oral] Aligner: Efficient Alignment by Learning to Correct☆111Updated last week
- ☆56Updated 2 weeks ago