Code for paper titled "Towards the Law of Capacity Gap in Distilling Language Models"
☆102Jul 9, 2024Updated last year
Alternatives and similar repositories for MiniMA
Users that are interested in MiniMA are comparing it to the libraries listed below
Sorting:
- Code for ACL 2023 paper titled "Lifting the Curse of Capacity Gap in Distilling Language Models"☆30Jul 14, 2023Updated 2 years ago
- **ASCM4ABSA** - Our code and proposed data for NLPCC 2022 paper titled "Aspect-specific Context Modeling for Aspect-based Sentiment Analy…☆12Mar 26, 2023Updated 2 years ago
- Code and data for COLING 2022 paper titled "Structural Bias For Aspect Sentiment Triplet Extraction"☆26May 28, 2023Updated 2 years ago
- The code and preprocessed data for ACL 2021 paper titled "Exploiting Position Bias for Robust Aspect Sentiment Classification"☆27Aug 5, 2021Updated 4 years ago
- Code and dataset for paper "End-to-end Emotion-Cause Pair Extraction via Learning to Link"☆16Jan 12, 2022Updated 4 years ago
- Official repo for EMNLP 2023 paper "Explain-then-Translate: An Analysis on Improving Program Translation with Self-generated Explanations…☆29Dec 5, 2023Updated 2 years ago
- Code for SIGIR 2019 paper titled "Syntax-Aware Aspect-Level Sentiment Classification with Proximity-Weighted Convolution Network"☆25Nov 21, 2023Updated 2 years ago
- This is the oficial repository for "Safer-Instruct: Aligning Language Models with Automated Preference Data"☆17Feb 22, 2024Updated 2 years ago
- Code of the COLING22 paper "uChecker: Masked Pretrained Language Models as Unsupervised Chinese Spelling Checkers"☆19Aug 17, 2022Updated 3 years ago
- This repository contains the code for the paper: SirLLM: Streaming Infinite Retentive LLM☆60May 28, 2024Updated last year
- Reproduction of the complete process of DeepSeek-R1 on small-scale models, including Pre-training, SFT, and RL.☆29Mar 11, 2025Updated last year
- My Implementation of Q-Sparse: All Large Language Models can be Fully Sparsely-Activated☆34Aug 14, 2024Updated last year
- Fast LLM Training CodeBase With dynamic strategy choosing [Deepspeed+Megatron+FlashAttention+CudaFusionKernel+Compiler];☆40Jan 4, 2024Updated 2 years ago
- Official implementation of "Gemini in Reasoning: Unveiling Commonsense in Multimodal Large Language Models"☆37Jan 3, 2024Updated 2 years ago
- Unofficial implementation of AlpaGasus☆95Sep 23, 2023Updated 2 years ago
- Official Code Repository for [AutoScale📈: Scale-Aware Data Mixing for Pre-Training LLMs] Published as a conference paper at **COLM 2025*…☆13Aug 8, 2025Updated 7 months ago
- [NAACL 2025] Representing Rule-based Chatbots with Transformers☆23Feb 9, 2025Updated last year
- Code for this paper "HyperRouter: Towards Efficient Training and Inference of Sparse Mixture of Experts via HyperNetwork"☆33Nov 29, 2023Updated 2 years ago
- Simple Model Similarities Analysis☆21Feb 3, 2024Updated 2 years ago
- distill large scale web page text☆12Jul 29, 2023Updated 2 years ago
- Code and data for CoachLM, an automatic instruction revision approach LLM instruction tuning.☆60Mar 20, 2024Updated 2 years ago
- A trainable user simulator☆34Jun 30, 2025Updated 8 months ago
- Odysseus: Playground of LLM Sequence Parallelism☆79Jun 17, 2024Updated last year
- ☆36Oct 10, 2024Updated last year
- The Truth Is In There: Improving Reasoning in Language Models with Layer-Selective Rank Reduction☆390Jul 9, 2024Updated last year
- [ACL 2024] LLM2LLM: Boosting LLMs with Novel Iterative Data Enhancement☆194Mar 25, 2024Updated last year
- codes for paper Combining Dynamic Local Context Focus and Dependency Cluster Attention for Aspect-level sentiment classification☆19Dec 10, 2021Updated 4 years ago
- A method for evaluating the high-level coherence of machine-generated texts. Identifies high-level coherence issues in transformer-based …☆11Mar 18, 2023Updated 3 years ago
- Co-LLM: Learning to Decode Collaboratively with Multiple Language Models☆126May 7, 2024Updated last year
- ☆99Jun 27, 2024Updated last year
- MedGen: Unlocking Medical Video Generation by Scaling Granularly-annotated Medical Videos.☆30Jul 9, 2025Updated 8 months ago
- Repository for Skill Set Optimization☆14Jul 26, 2024Updated last year
- ☆20Nov 3, 2024Updated last year
- OpenBA-V2: 3B LLM (Large Language Model) with T5 architecture, utilizing model pruning technique and continuing pretraining from OpenBA-1…☆25May 10, 2024Updated last year
- Count Tokens of Code (forked from gocloc)☆44Aug 19, 2024Updated last year
- Self-Evolved Diverse Data Sampling for Efficient Instruction Tuning☆86Dec 14, 2023Updated 2 years ago
- [ICML'24] Data and code for our paper "Training-Free Long-Context Scaling of Large Language Models"☆448Oct 16, 2024Updated last year
- This repository lists papers, codes, and datasets in Biomedical Text Summarisation based on PLM☆23Oct 4, 2022Updated 3 years ago
- 🐜🔧 A minimalistic tool to fine-tune your LLMs☆18Aug 17, 2023Updated 2 years ago