thunlp / Modularity-Analysis
[ACL 2023 Findings] Emergent Modularity in Pre-trained Transformers
☆26 · Updated 2 years ago
Alternatives and similar repositories for Modularity-Analysis
Users who are interested in Modularity-Analysis are comparing it to the libraries listed below.
- Official PyTorch Implementation of EMoE: Unlocking Emergent Modularity in Large Language Models [main conference @ NAACL2024] ☆38 · Updated last year
- [NeurIPS'24] Official code for *🎯DART-Math: Difficulty-Aware Rejection Tuning for Mathematical Problem-Solving* ☆119 · Updated last year
- Long Context Extension and Generalization in LLMs ☆62 · Updated last year
- A curated list of awesome resources dedicated to Scaling Laws for LLMs ☆80 · Updated 2 years ago
- ☆142 · Updated last year
- Code and Data for "Long-context LLMs Struggle with Long In-context Learning" [TMLR2025] ☆110 · Updated 10 months ago
- ☆103 · Updated 2 years ago
- Code for ICLR 2025 Paper "What is Wrong with Perplexity for Long-context Language Modeling?" ☆107 · Updated 3 months ago
- The source code of "Merging Experts into One: Improving Computational Efficiency of Mixture of Experts (EMNLP 2023)" ☆44 · Updated last year
- Test-time-training on nearest neighbors for large language models ☆49 · Updated last year
- LongProc: Benchmarking Long-Context Language Models on Long Procedural Generation ☆33 · Updated 3 months ago
- [NeurIPS 2024] Fast Best-of-N Decoding via Speculative Rejection ☆54 · Updated last year
- Homepage for ProLong (Princeton long-context language models) and paper "How to Train Long-Context Language Models (Effectively)" ☆243 · Updated 4 months ago
- Official repository for MATES: Model-Aware Data Selection for Efficient Pretraining with Data Influence Models [NeurIPS 2024] ☆77 · Updated last year
- Official github repo for the paper "Compression Represents Intelligence Linearly" [COLM 2024] ☆147 · Updated last year
- GenRM-CoT: Data release for verification rationales ☆68 · Updated last year
- ☆39 · Updated last year
- Easy-to-Hard Generalization: Scalable Alignment Beyond Human Supervision ☆124 · Updated last year
- [NeurIPS 2024 Spotlight] Code and data for the paper "Finding Transformer Circuits with Edge Pruning". ☆63 · Updated 4 months ago
- open-source code for paper: Retrieval Head Mechanistically Explains Long-Context Factuality ☆226 · Updated last year
- Official code for Guiding Language Model Math Reasoning with Planning Tokens ☆18 · Updated last year
- ☆78 · Updated last year
- ☆71 · Updated last year
- Official repository for ACL 2025 paper "Model Extrapolation Expedites Alignment" ☆76 · Updated 7 months ago
- [ICML 2025] Teaching Language Models to Critique via Reinforcement Learning ☆119 · Updated 8 months ago
- [AAAI 2025 oral] Evaluating Mathematical Reasoning Beyond Accuracy ☆76 · Updated 3 months ago
- LongSpec: Long-Context Lossless Speculative Decoding with Efficient Drafting and Verification ☆73 · Updated 5 months ago
- One Network, Many Masks: Towards More Parameter-Efficient Transfer Learning ☆40 · Updated 2 years ago
- [ICLR 2025] SWIFT: On-the-Fly Self-Speculative Decoding for LLM Inference Acceleration ☆61 · Updated 10 months ago
- [NeurIPS'24 Spotlight] Observational Scaling Laws ☆59 · Updated last year