This paper list focuses on the theoretical and empirical analysis of language models, especially large language models (LLMs). The papers in this list investigate the learning behavior, generalization ability, and other properties of language models through theoretical analysis, empirical analysis, or a combination of both.
☆100Dec 2, 2024Updated last year
Alternatives and similar repositories for awesome-language-model-analysis
Users that are interested in awesome-language-model-analysis are comparing it to the libraries listed below.
- [SIGIR'24] Generative Retrieval as Multi-Vector Dense Retrieval☆36Oct 18, 2024Updated last year
- Welcome to the 'In Context Learning Theory' Reading Group☆31Nov 8, 2024Updated last year
- [WSDM 2024 Best Paper Honorable Mention] Debiasing Sequential Recommenders through Distributionally Robust Optimization over System Expos…☆16Jun 20, 2024Updated last year
- This is the implementation of paper "Learning to Ask Conversational Questions by Optimizing Levenshtein Distance".☆10Jul 5, 2021Updated 4 years ago
- Official PyTorch code for ICLR 2025 paper "Gnothi Seauton: Empowering Faithful Self-Interpretability in Black-Box Models"☆23Mar 4, 2025Updated last year
- ☆26Feb 20, 2026Updated 3 months ago
- [NeurIPS 2025 Spotlight] A Token is Worth over 1,000 Tokens: Efficient Knowledge Distillation through Low-Rank Clone.☆47Oct 29, 2025Updated 6 months ago
- [ICML 2024] Code release for "On the Emergence of Cross-Task Linearity in Pretraining-Finetuning Paradigm"☆11Feb 20, 2025Updated last year
- a brief repo about paper research☆15Sep 4, 2024Updated last year
- A curated list of papers of interesting empirical study and insight on deep learning. Continually updating...☆401May 19, 2026Updated last week
- DeepRAG: Thinking to Retrieve Step by Step for Large Language Models☆39Feb 17, 2026Updated 3 months ago
- A curated list of awesome Deep Learning theories that shed light on the mysteries of DL☆10Jul 20, 2018Updated 7 years ago
- Welcome to the Awesome Feature Learning in Deep Learning Theory Reading Group! This repository serves as a collaborative platform for sch…☆211Apr 13, 2026Updated last month
- codes and plots for "Active-Dormant Attention Heads: Mechanistically Demystifying Extreme-Token Phenomena in LLMs"☆11Dec 30, 2024Updated last year
- Uncovering Selective State Space Model's Capabilities in Lifelong Sequential Recommendation☆35May 8, 2024Updated 2 years ago
- Codes for the paper The emergence of clusters in self-attention dynamics.☆18Dec 18, 2023Updated 2 years ago
- This repo contains papers, books, tutorials and resources on Riemannian optimization.☆60Mar 18, 2026Updated 2 months ago
- [TOIS 2023] On the User Behavior Leakage from Recommender System Exposure☆19Nov 7, 2023Updated 2 years ago
- Code for paper "Language Versatilists vs. Specialists: An Empirical Revisiting on Multilingual Transfer Ability"☆15Jun 13, 2023Updated 2 years ago
- ☆17Feb 26, 2024Updated 2 years ago
- Multi-Layer Sparse Autoencoders (ICLR 2025)☆30Feb 6, 2026Updated 3 months ago
- SLTrain: a sparse plus low-rank approach for parameter and memory efficient pretraining (NeurIPS 2024)☆39Nov 1, 2024Updated last year
- Representation Surgery for Multi-Task Model Merging. ICML, 2024.☆49Oct 10, 2024Updated last year
- A curated list of LLM Interpretability related material - Tutorial, Library, Survey, Paper, Blog, etc..☆306Jan 22, 2026Updated 4 months ago
- ☆54May 20, 2024Updated 2 years ago
- ☆245May 10, 2024Updated 2 years ago
- NeurIPS22 "RankFeat: Rank-1 Feature Removal for Out-of-distribution Detection" and T-PAMI Extension☆20Feb 21, 2025Updated last year
- [NeurIPS2024] Fast T2T: Optimization Consistency Speeds Up Diffusion-Based Training-to-Testing Solving for Combinatorial Optimization; [N…☆21Jul 2, 2025Updated 10 months ago
- Enhances Overleaf by allowing article searches and BibTeX retrieval from DBLP and Google Scholar☆127Feb 3, 2026Updated 3 months ago
- Curse-of-memory phenomenon of RNNs in sequence modelling☆19May 8, 2025Updated last year
- [CVPR 2025] An Implementation of the paper "Pre-Instruction Data Selection for Visual Instruction Tuning"☆17Jun 9, 2025Updated 11 months ago
- ☆17Jun 14, 2024Updated last year
- ☆22Jun 11, 2024Updated last year
- PyTorch code for experiments on Linear Transformers☆24Jan 12, 2024Updated 2 years ago
- [ICLR 2025] "Rethinking LLM Unlearning Objectives: A Gradient Perspective and Go Beyond"☆16Feb 27, 2025Updated last year
- ☆26Aug 23, 2024Updated last year
- [ICLR'25] Geometric Problem Solving Through Unified Formalized Vision-Language Pre-training☆48Jan 25, 2025Updated last year
- awesome papers in LLM interpretability☆619Aug 20, 2025Updated 9 months ago
- How do transformer LMs encode relations?☆57Feb 24, 2024Updated 2 years ago