This paper list focuses on the theoretical and empirical analysis of language models, especially large language models (LLMs). The papers in this list investigate the learning behavior, generalization ability, and other properties of language models through theoretical analysis, empirical analysis, or a combination of both.
☆98 · Dec 2, 2024 · Updated last year
Alternatives and similar repositories for awesome-language-model-analysis
Users that are interested in awesome-language-model-analysis are comparing it to the libraries listed below.
- [SIGIR'24] Generative Retrieval as Multi-Vector Dense Retrieval ☆36 · Oct 18, 2024 · Updated last year
- Welcome to the 'In Context Learning Theory' Reading Group ☆30 · Nov 8, 2024 · Updated last year
- [WSDM 2024 Best Paper Honorable Mention] Debiasing Sequential Recommenders through Distributionally Robust Optimization over System Expos… ☆15 · Jun 20, 2024 · Updated last year
- This is the implementation of paper "Learning to Ask Conversational Questions by Optimizing Levenshtein Distance". ☆10 · Jul 5, 2021 · Updated 4 years ago
- Official PyTorch code for ICLR 2025 paper "Gnothi Seauton: Empowering Faithful Self-Interpretability in Black-Box Models" ☆23 · Mar 4, 2025 · Updated last year
- ☆26 · Feb 20, 2026 · Updated last month
- [NeurIPS 2025 Spotlight] A Token is Worth over 1,000 Tokens: Efficient Knowledge Distillation through Low-Rank Clone. ☆45 · Oct 29, 2025 · Updated 5 months ago
- [ICML 2024] Code release for "On the Emergence of Cross-Task Linearity in Pretraining-Finetuning Paradigm" ☆11 · Feb 20, 2025 · Updated last year
- a brief repo about paper research ☆15 · Sep 4, 2024 · Updated last year
- A curated list of papers of interesting empirical study and insight on deep learning. Continually updating... ☆397 · Jan 7, 2026 · Updated 3 months ago
- DeepRAG: Thinking to Retrieve Step by Step for Large Language Models ☆36 · Feb 17, 2026 · Updated last month
- MAIR: A Massive Benchmark for Evaluating Instructed Retrieval. Evaluate your retrieval models on 126 diverse tasks. [EMNLP 2024] ☆24 · Nov 3, 2024 · Updated last year
- An awesome repository & A comprehensive survey on interpretability of LLM attention heads. ☆404 · Mar 2, 2025 · Updated last year
- A curated list of awesome Deep Learning theories that shed light on the mysteries of DL ☆10 · Jul 20, 2018 · Updated 7 years ago
- Welcome to the Awesome Feature Learning in Deep Learning Theory Reading Group! This repository serves as a collaborative platform for sch… ☆206 · Updated this week
- codes and plots for "Active-Dormant Attention Heads: Mechanistically Demystifying Extreme-Token Phenomena in LLMs" ☆10 · Dec 30, 2024 · Updated last year
- Uncovering Selective State Space Model's Capabilities in Lifelong Sequential Recommendation ☆34 · May 8, 2024 · Updated last year
- Codes for the paper The emergence of clusters in self-attention dynamics. ☆17 · Dec 18, 2023 · Updated 2 years ago
- This repo contains papers, books, tutorials and resources on Riemannian optimization. ☆58 · Mar 18, 2026 · Updated 3 weeks ago
- This repository has been redirected into https://kuaisar.github.io/. ☆11 · Oct 12, 2023 · Updated 2 years ago
- ☆15 · Sep 21, 2022 · Updated 3 years ago
- ☆17 · Feb 26, 2024 · Updated 2 years ago
- Multi-Layer Sparse Autoencoders (ICLR 2025) ☆29 · Feb 6, 2026 · Updated 2 months ago
- SLTrain: a sparse plus low-rank approach for parameter and memory efficient pretraining (NeurIPS 2024) ☆39 · Nov 1, 2024 · Updated last year
- A curated list of awesome papers related to generative retrieval models. ☆53 · May 31, 2023 · Updated 2 years ago
- [ICLR 2025 Spotlight] Code release for "Sharpness-Aware Minimization Efficiently Selects Flatter Minima Late In Training" ☆18 · Feb 20, 2025 · Updated last year
- A curated list of LLM Interpretability related material - Tutorial, Library, Survey, Paper, Blog, etc.. ☆300 · Jan 22, 2026 · Updated 2 months ago
- ☆244 · May 10, 2024 · Updated last year
- ☆36 · Feb 26, 2024 · Updated 2 years ago
- NeurIPS22 "RankFeat: Rank-1 Feature Removal for Out-of-distribution Detection" and T-PAMI Extension ☆20 · Feb 21, 2025 · Updated last year
- [NeurIPS2024] Fast T2T: Optimization Consistency Speeds Up Diffusion-Based Training-to-Testing Solving for Combinatorial Optimization; [N… ☆21 · Jul 2, 2025 · Updated 9 months ago
- Curse-of-memory phenomenon of RNNs in sequence modelling ☆19 · May 8, 2025 · Updated 11 months ago
- [CVPR 2025] An Implementation of the paper "Pre-Instruction Data Selection for Visual Instruction Tuning" ☆17 · Jun 9, 2025 · Updated 10 months ago
- IR module for experimaestro ☆13 · Updated this week
- Official implementation of Gelina ☆26 · Mar 27, 2026 · Updated 2 weeks ago
- [NeurIPS 2023] Code release for "Going Beyond Linear Mode Connectivity: The Layerwise Linear Feature Connectivity" ☆19 · Oct 19, 2023 · Updated 2 years ago
- [ICLR'25] Geometric Problem Solving Through Unified Formalized Vision-Language Pre-training ☆47 · Jan 25, 2025 · Updated last year
- Pytorch code for experiments on Linear Transformers ☆24 · Jan 12, 2024 · Updated 2 years ago
- ☆25 · Aug 23, 2024 · Updated last year