ShiZhengyan / InstructionModelling
[NeurIPS 2024 Main Track] Code for the paper titled "Instruction Tuning With Loss Over Instructions"
☆39 · Updated last year
Alternatives and similar repositories for InstructionModelling
Users that are interested in InstructionModelling are comparing it to the libraries listed below
- [NAACL 2025] A Closer Look into Mixture-of-Experts in Large Language Models ☆55 · Updated 7 months ago
- [EMNLP Findings 2024 & ACL 2024 NLRSE Oral] Enhancing Mathematical Reasonin… ☆52 · Updated last year
- Official repository for ACL 2025 paper "Model Extrapolation Expedites Alignment" ☆75 · Updated 4 months ago
- ☆86 · Updated 8 months ago
- ☆65 · Updated last year
- [ICLR 2025] LongPO: Long Context Self-Evolution of Large Language Models through Short-to-Long Preference Optimization ☆40 · Updated 7 months ago
- Code and Data for "Long-context LLMs Struggle with Long In-context Learning" [TMLR 2025] ☆108 · Updated 7 months ago
- Codebase for Instruction Following without Instruction Tuning ☆35 · Updated last year
- Long Context Extension and Generalization in LLMs ☆60 · Updated last year
- Organize the Web: Constructing Domains Enhances Pre-Training Data Curation ☆64 · Updated 5 months ago
- ☆127 · Updated 6 months ago
- General Reasoner: Advancing LLM Reasoning Across All Domains [NeurIPS25] ☆172 · Updated 3 months ago
- Exploration of automated dataset selection approaches at large scales. ☆47 · Updated 7 months ago
- Large Language Models Can Self-Improve in Long-context Reasoning ☆73 · Updated 10 months ago
- Parameter-Efficient Sparsity Crafting From Dense to Mixture-of-Experts for Instruction Tuning on General Tasks (EMNLP'24) ☆148 · Updated last year
- PyTorch implementation for "Compressed Context Memory For Online Language Model Interaction" (ICLR'24) ☆62 · Updated last year
- Code for ICLR 2025 Paper "What is Wrong with Perplexity for Long-context Language Modeling?" ☆100 · Updated 2 months ago
- [NeurIPS-2024] Scaling Laws with Vocabulary: Larger Models Deserve Larger Vocabularies https://arxiv.org/abs/2407.13623 ☆86 · Updated last year
- A scalable automated alignment method for large language models. Resources for "Aligning Large Language Models via Self-Steering Optimiza… ☆20 · Updated 10 months ago
- ☆96 · Updated last month
- The source code of "Merging Experts into One: Improving Computational Efficiency of Mixture of Experts (EMNLP 2023)" ☆38 · Updated last year
- ☆59 · Updated last year
- Official PyTorch Implementation of EMoE: Unlocking Emergent Modularity in Large Language Models [main conference @ NAACL2024] ☆35 · Updated last year
- [ICLR'25] Data and code for our paper "Why Does the Effective Context Length of LLMs Fall Short?" ☆77 · Updated 10 months ago
- ☆132 · Updated 3 weeks ago
- ☆30 · Updated 2 years ago
- [ICLR 2024] CLEX: Continuous Length Extrapolation for Large Language Models ☆78 · Updated last year
- Official code for "MAmmoTH2: Scaling Instructions from the Web" [NeurIPS 2024] ☆148 · Updated 11 months ago
- "Found in the Middle: How Language Models Use Long Contexts Better via Plug-and-Play Positional Encoding" Zhenyu Zhang, Runjin Chen, Shiw… ☆30 · Updated last year
- Code for paper "Patch-Level Training for Large Language Models" ☆88 · Updated 10 months ago