WeiHuang05 / Awesome_Large_Foundation_Model_Theory

Welcome to the 'In Context Learning Theory' Reading Group

☆30

Alternatives and similar repositories for Awesome_Large_Foundation_Model_Theory

Users who are interested in Awesome_Large_Foundation_Model_Theory are comparing it to the repositories listed below.

WeiHuang05 / Awesome-Feature-Learning-in-Deep-Learning-Thoery
Welcome to the Awesome Feature Learning in Deep Learning Theory Reading Group! This repository serves as a collaborative platform for sch…
☆200 Updated 10 months ago
zzp1012 / SAM-in-Late-Phase
[ICLR 2025 Spotlight] Code release for "Sharpness-Aware Minimization Efficiently Selects Flatter Minima Late In Training"
☆15 Updated 8 months ago
gortizji / tangent_task_arithmetic
Source code of "Task arithmetic in the tangent space: Improved editing of pre-trained models".
☆105 Updated 2 years ago
ZO-Bench / ZO-LLM
[ICML'24] Official code for the paper "Revisiting Zeroth-Order Optimization for Memory-Efficient LLM Fine-Tuning: A Benchmark".
☆111 Updated 3 months ago
nik-dim / tall_masks
Official repository of "Localizing Task Information for Improved Model Merging and Compression" [ICML 2024]
☆51 Updated last year
tmlr-group / G-effect
[ICLR 2025] "Rethinking LLM Unlearning Objectives: A Gradient Perspective and Go Beyond"
☆12 Updated 8 months ago
MinghuiChen43 / awesome-deep-phenomena
A curated list of papers presenting interesting empirical studies and insights on deep learning. Continually updating...
☆376 Updated this week
ycjing / Awesome-Model-Merging
A curated list of Model Merging methods.
☆92 Updated last year
hanyang1999 / discrete-diffusion-papers
A collection of papers on discrete diffusion models
☆165 Updated 4 months ago
Trustworthy-ML-Lab / ThinkEdit
[EMNLP 25] An effective and interpretable weight-editing method for mitigating overly short reasoning in LLMs, and a mechanistic study un…
☆15 Updated last month
kyrie-23 / linear_task_arithmetic
☆11 Updated 3 months ago
YefanZhou / TempBalance
[NeurIPS 2023 Spotlight] Temperature Balancing, Layer-wise Weight Analysis, and Neural Network Training
☆35 Updated 6 months ago
Furyton / awesome-language-model-analysis
This paper list focuses on the theoretical and empirical analysis of language models, especially large language models (LLMs). The papers…
☆92 Updated 10 months ago
tmlr-group / NoisyRationales
[NeurIPS 2024] "Can Language Models Perform Robust Reasoning in Chain-of-thought Prompting with Noisy Rationales?"
☆37 Updated 3 months ago
xie-lab-ml / deep-learning-dynamics-paper-list
This is a list of peer-reviewed representative papers on deep learning dynamics (optimization dynamics of neural networks). The success o…
☆290 Updated last year
Guangxuan-Xiao / GSM8K-eval
☆53 Updated 2 years ago
kwignb / NeuralTangentKernel-Papers
Neural Tangent Kernel Papers
☆118 Updated 9 months ago
siyan-zhao / ICL_decision_boundary
Official code for the paper "Probing the Decision Boundaries of In-context Learning in Large Language Models". https://arxiv.org/abs/2406.11233…
☆19 Updated 3 months ago
Lingkai-Kong / RE-Control
Code for paper: Aligning Large Language Models with Representation Editing: A Control Perspective
☆34 Updated 8 months ago
osehmathias / lisa
LISA: Layerwise Importance Sampling for Memory-Efficient Large Language Model Fine-Tuning
☆35 Updated last year
sail-sg / Attention-Sink
[ICLR 2025] When Attention Sink Emerges in Language Models: An Empirical View (Spotlight)
☆131 Updated 3 months ago
Kaffaljidhmah2 / Arxiv-Recommender
☆52 Updated 2 years ago
EnnengYang / AdaMerging
AdaMerging: Adaptive Model Merging for Multi-Task Learning. ICLR, 2024.
☆94 Updated last year
ykwon0407 / DataInf
DataInf: Efficiently Estimating Data Influence in LoRA-tuned LLMs and Diffusion Models (ICLR 2024)
☆76 Updated last year
dtsip / in-context-learning
☆240 Updated last year
Model-GLUE / Model-GLUE
☆18 Updated last year
bethgelab / sober-reasoning
A Sober Look at Language Model Reasoning
☆86 Updated 3 weeks ago
TRAIS-Lab / dattri
`dattri` is a PyTorch library for developing, benchmarking, and deploying efficient data attribution algorithms.
☆90 Updated 2 weeks ago
zyushun / hessian-spectrum
Code for the paper: Why Transformers Need Adam: A Hessian Perspective
☆64 Updated 7 months ago
bansky-cl / diffusion-nlp-paper-arxiv
Automatically collects diffusion NLP papers from arXiv. More paper information can be found in another repository, "Diffusion-LM-Papers".
☆205 Updated this week