This paper list focuses on the theoretical and empirical analysis of language models, especially large language models (LLMs). The papers in this list investigate the learning behavior, generalization ability, and other properties of language models through theoretical analysis, empirical analysis, or a combination of both.
☆99Dec 2, 2024Updated last year
Alternatives and similar repositories for awesome-language-model-analysis
Users who are interested in awesome-language-model-analysis are comparing it to the repositories listed below.
- [SIGIR'24] Generative Retrieval as Multi-Vector Dense Retrieval☆36Oct 18, 2024Updated last year
- Welcome to the 'In Context Learning Theory' Reading Group☆30Nov 8, 2024Updated last year
- ☆25Feb 20, 2026Updated last month
- [NeurIPS 2025 Spotlight] A Token is Worth over 1,000 Tokens: Efficient Knowledge Distillation through Low-Rank Clone.☆46Oct 29, 2025Updated 4 months ago
- [ICML 2024] Code release for "On the Emergence of Cross-Task Linearity in Pretraining-Finetuning Paradigm"☆11Feb 20, 2025Updated last year
- a brief repo about paper research☆15Sep 4, 2024Updated last year
- A curated list of papers of interesting empirical study and insight on deep learning. Continually updating...☆395Jan 7, 2026Updated 2 months ago
- DeepRAG: Thinking to Retrieve Step by Step for Large Language Models☆33Feb 17, 2026Updated last month
- An awesome repository & A comprehensive survey on interpretability of LLM attention heads.☆401Mar 2, 2025Updated last year
- A curated list of awesome Deep Learning theories that shed light on the mysteries of DL☆10Jul 20, 2018Updated 7 years ago
- Welcome to the Awesome Feature Learning in Deep Learning Theory Reading Group! This repository serves as a collaborative platform for sch…☆207Dec 27, 2024Updated last year
- codes and plots for "Active-Dormant Attention Heads: Mechanistically Demystifying Extreme-Token Phenomena in LLMs"☆10Dec 30, 2024Updated last year
- This repo contains papers, books, tutorials and resources on Riemannian optimization.☆57Mar 18, 2026Updated last week
- Codes for the paper The emergence of clusters in self-attention dynamics.☆17Dec 18, 2023Updated 2 years ago
- Code for the paper “Language Versatilists vs. Specialists: An Empirical Revisiting on Multilingual Transfer Ability”☆15Jun 13, 2023Updated 2 years ago
- ☆15Sep 21, 2022Updated 3 years ago
- Multi-Layer Sparse Autoencoders (ICLR 2025)☆29Feb 6, 2026Updated last month
- Trains Sparse Autoencoders based on outputs from language models☆11Oct 7, 2024Updated last year
- SLTrain: a sparse plus low-rank approach for parameter and memory efficient pretraining (NeurIPS 2024)☆39Nov 1, 2024Updated last year
- A curated list of awesome papers related to generative retrieval models.☆53May 31, 2023Updated 2 years ago
- [ICLR 2025 Spotlight] Code release for "Sharpness-Aware Minimization Efficiently Selects Flatter Minima Late In Training"☆18Feb 20, 2025Updated last year
- Representation Surgery for Multi-Task Model Merging. ICML, 2024.☆47Oct 10, 2024Updated last year
- A curated list of LLM Interpretability related material - Tutorial, Library, Survey, Paper, Blog, etc..☆294Jan 22, 2026Updated 2 months ago
- ☆54May 20, 2024Updated last year
- ☆243May 10, 2024Updated last year
- ☆36Feb 26, 2024Updated 2 years ago
- Enhances Overleaf by allowing article searches and BibTeX retrieval from DBLP and Google Scholar☆123Feb 3, 2026Updated last month
- [NeurIPS2024] Fast T2T: Optimization Consistency Speeds Up Diffusion-Based Training-to-Testing Solving for Combinatorial Optimization; [N…☆21Jul 2, 2025Updated 8 months ago
- Curse-of-memory phenomenon of RNNs in sequence modelling☆19May 8, 2025Updated 10 months ago
- [CVPR 2025] An Implementation of the paper "Pre-Instruction Data Selection for Visual Instruction Tuning"☆17Jun 9, 2025Updated 9 months ago
- Pytorch code for experiments on Linear Transformers☆24Jan 12, 2024Updated 2 years ago
- ☆25Aug 23, 2024Updated last year
- [ICLR 2025] "Rethinking LLM Unlearning Objectives: A Gradient Perspective and Go Beyond"☆17Feb 27, 2025Updated last year
- awesome papers in LLM interpretability☆612Aug 20, 2025Updated 7 months ago
- How do transformer LMs encode relations?☆56Feb 24, 2024Updated 2 years ago
- ☆19Dec 12, 2023Updated 2 years ago
- [ICCV 2023] Black Box Few-Shot Adaptation for Vision-Language models☆27May 14, 2024Updated last year
- PhysioNet 2019 Challenge: Early Prediction of Sepsis from Clinical Data☆12May 19, 2019Updated 6 years ago
- ☆30Nov 5, 2024Updated last year