GeneZC / MiniMA
Code for the paper "Towards the Law of Capacity Gap in Distilling Language Models"
☆97Updated 6 months ago
Alternatives and similar repositories for MiniMA:
Users who are interested in MiniMA compare it to the repositories listed below.
- An Experiment on Dynamic NTK Scaling RoPE☆62Updated last year
- [ICML'24] The official implementation of “Rethinking Optimization and Architecture for Tiny Language Models”☆119Updated this week
- FuseAI Project☆75Updated last month
- Official implementation for 'Extending LLMs’ Context Window with 100 Samples'☆76Updated last year
- ☆93Updated 3 months ago
- Self-Evolved Diverse Data Sampling for Efficient Instruction Tuning☆70Updated last year
- LongEmbed: Extending Embedding Models for Long Context Retrieval (EMNLP 2024)☆126Updated 2 months ago
- Unofficial implementation of AlpaGasus☆90Updated last year
- We aim to provide the best references to search, select, and synthesize high-quality and large-quantity data for post-training your LLMs.☆48Updated 3 months ago
- Code and Data for "Long-context LLMs Struggle with Long In-context Learning"☆97Updated 6 months ago
- Flacuna was developed by fine-tuning Vicuna on Flan-mini, a comprehensive instruction collection encompassing various tasks. Vicuna is al…☆111Updated last year
- Reformatted Alignment☆113Updated 3 months ago
- Skywork-MoE: A Deep Dive into Training Techniques for Mixture-of-Experts Language Models☆128Updated 7 months ago
- Official code for "MAmmoTH2: Scaling Instructions from the Web" [NeurIPS 2024]☆129Updated 2 months ago
- Implementation of the LongRoPE: Extending LLM Context Window Beyond 2 Million Tokens Paper☆125Updated 6 months ago
- Implementations of the online merging optimizers proposed in "Online Merging Optimizers for Boosting Rewards and Mitigating Tax in Alignment"☆70Updated 7 months ago
- Positional Skip-wise Training for Efficient Context Window Extension of LLMs to Extreme Length (ICLR 2024)☆204Updated 7 months ago
- MultilingualSIFT: Multilingual Supervised Instruction Fine-tuning☆89Updated last year
- Spherically merge PyTorch/HF-format language models with minimal feature loss.☆115Updated last year
- Parameter-Efficient Sparsity Crafting From Dense to Mixture-of-Experts for Instruction Tuning on General Tasks☆139Updated 4 months ago
- LongQLoRA: Extend Context Length of LLMs Efficiently☆163Updated last year
- [ICLR 2024] CLEX: Continuous Length Extrapolation for Large Language Models☆75Updated 10 months ago
- This is the official repository for Inheritune.☆109Updated 3 months ago
- Reproducible, flexible LLM evaluations☆118Updated last month
- [ACL'24] Superfiltering: Weak-to-Strong Data Filtering for Fast Instruction-Tuning☆137Updated 4 months ago
- Code and data for CoachLM, an automatic instruction revision approach for LLM instruction tuning.☆60Updated 10 months ago
- Official implementation of paper "Autonomous Data Selection with Language Models for Mathematical Texts" (As Huggingface Daily Papers: ht…☆79Updated 2 months ago
- Official repository for paper "Weak-to-Strong Extrapolation Expedites Alignment"☆71Updated 7 months ago
- Source code of "Reasons to Reject? Aligning Language Models with Judgments"☆58Updated 10 months ago
- Leveraging passage embeddings for efficient listwise reranking with large language models.☆35Updated last month