MingLiiii/Layer_Gradient

Readme badge preview -

If you own this repo, copy the snippet below and add it to your README.md

[![RelatedRepos](https://img.shields.io/badge/related-repos-yellow)](https://relatedrepos.com/gh/MingLiiii/Layer_Gradient)

MingLiiii / Layer_Gradient

[ACL'25 Oral] What Happened in LLMs Layers when Trained for Fast vs. Slow Thinking: A Gradient Perspective

☆75

Alternatives and similar repositories for Layer_Gradient

Users that are interested in Layer_Gradient are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.

Sorting:

tianyi-lab / Mosaic-IT
View on GitHub
[ACL'25] Mosaic-IT: Cost-Free Compositional Data Synthesis for Instruction Tuning
☆20Sep 27, 2025Updated 9 months ago
tianyi-lab / RuleR
View on GitHub
[NAACL'25] RuleR: Improving LLM Controllability by Rule-based Data Recycling
☆14Sep 27, 2025Updated 9 months ago
MingLiiii / Gradient_Unified
View on GitHub
How Instruction and Reasoning Data shape Post-Training: Data Quality through the Lens of Layer-wise Gradients
☆20Jun 17, 2025Updated last year
tianyi-lab / Moltbook_Socialization
View on GitHub
Does Socialization Emerge in AI Agent Society? A Case Study of Moltbook
☆18Feb 17, 2026Updated 5 months ago
TianheL / LM-Implicit-Reasoning
View on GitHub
[ACL 2025 Findings] Implicit Reasoning in Transformers is Reasoning through Shortcuts
☆18Mar 11, 2025Updated last year
Deploy on Railway without the complexity - Free Credits Offer • Ad
Connect your repo and Railway handles the rest with instant previews. Quickly provision container image services, databases, and storage volumes.
Lichang-Chen / claude2-alpaca
View on GitHub
First instruction-tuning dataset distilled from Claude2 (52k Alpaca prompts)!
☆13Oct 22, 2023Updated 2 years ago
kai-wen-yang / IDAA
View on GitHub
[ICML2022] "Identity-Disentangled Adversarial Augmentation for Self-Supervised Learning"
☆10Jul 24, 2022Updated 3 years ago
tianyi-lab / MiP-Overthinking
View on GitHub
[COLM'25] Missing Premise exacerbates Overthinking: Are Reasoning Models losing Critical Thinking Skill?
☆39Jun 5, 2025Updated last year
jiaangli / VILA
View on GitHub
[TACL/EMNLP'24] Do Vision and Language Models Share Concepts? A Vector Space Alignment Study
☆16Nov 22, 2024Updated last year
fferflo / statewide-visual-geolocalization
View on GitHub
Statewide Visual Geolocalization in the Wild (ECCV 2024)
☆75Dec 2, 2024Updated last year
CERT-Lab / lora-sb
View on GitHub
Initialization using Update Approximation is a Silver Bullet for Extremely Efficient Low-Rank Fine-Tuning
☆52Oct 17, 2025Updated 9 months ago
tianyi-lab / DEBATunE
View on GitHub
[ACL'24] Can LLMs Speak For Diverse People? Tuning LLMs via Debate to Generate Controllable Controversial Statements
☆25Sep 14, 2024Updated last year
tianyi-lab / RoMA
View on GitHub
Code for "Routing Manifold Alignment Improves Generalization of Mixture-of-Experts LLMs"
☆19Nov 6, 2025Updated 8 months ago
VILA-Lab / DELT
View on GitHub
(CVPR 2025) Official implementation to DELT: A Simple Diversity-driven EarlyLate Training for Dataset Distillation which outperforms SOTA…
☆28Aug 23, 2025Updated 10 months ago
Managed hosting for WordPress and PHP on Cloudways • Ad
Managed hosting for WordPress, Magento, Laravel, or PHP apps, on multiple cloud providers. Deploy in minutes on Cloudways by DigitalOcean.
xirui-li / MOSSBench
View on GitHub
An implementation for MLLM oversensitivity evaluation
☆18Nov 16, 2024Updated last year
Zoeyyao27 / SirLLM
View on GitHub
This repository contains the code for the paper: SirLLM: Streaming Infinite Retentive LLM
☆60May 28, 2024Updated 2 years ago
tianyi-lab / ColorBench
View on GitHub
[NeurIPS'25] ColorBench: Can VLMs See and Understand the Colorful World? A Comprehensive Benchmark for Color Perception, Reasoning, and R…
☆40Sep 27, 2025Updated 9 months ago
wuxiyang1996 / AutoHallusion
View on GitHub
AutoHallusion Codebase (EMNLP 2024)
☆23Dec 6, 2024Updated last year
Pixtella / Anagram-MTL
View on GitHub
[WACV 2025] Official implementation for the paper "Diffusion-based Visual Anagram as Multi-task Learning"
☆59May 25, 2025Updated last year
open-compass / GPassK
View on GitHub
[ACL 2025] Are Your LLMs Capable of Stable Reasoning?
☆33Aug 5, 2025Updated 11 months ago
zxiangx / LC-R1
View on GitHub
Code for paper: Optimizing Length Compression in Large Reasoning Models
☆29Oct 20, 2025Updated 9 months ago
tianyi-lab / R2-T2
View on GitHub
[ICML 2025] Code for "R2-T2: Re-Routing in Test-Time for Multimodal Mixture-of-Experts"
☆19Mar 10, 2025Updated last year
Yangyi-Chen / CoTConsistency
View on GitHub
The released data for paper "Measuring and Improving Chain-of-Thought Reasoning in Vision-Language Models".
☆34Sep 16, 2023Updated 2 years ago
Deploy to Railway using AI coding agents - Free Credits Offer • Ad
Use Claude Code, Codex, OpenCode, and more. Autonomous software development now has the infrastructure to match with Railway.
tianyi-lab / FaSTAR
View on GitHub
[ICLR 2026] Fast-Slow Toolpath Agent with Subroutine Mining for Efficient Multi-turn Image Editing
☆33May 30, 2026Updated last month
shenao-zhang / reward-augmented-preference
View on GitHub
The official implementation of Preference Data Reward-Augmentation.
☆18May 1, 2025Updated last year
mwatkins1970 / SAE_Feature_Interpretability_Tool
View on GitHub
A tool to assist in the interpretation of learned features in sparse autoencoders (in particular the four SAE's trained by Joseph Bloom o…
☆19Oct 4, 2024Updated last year
hananshafi / MedContext
View on GitHub
[MICCAI 2024] Official code for the paper "MedContext: Learning Contextual Cues for Efficient Volumetric Medical Segmentation"
☆14Nov 1, 2024Updated last year
prs-eth / LoRA-Ensemble
View on GitHub
LoRA-Ensemble: Efficient Uncertainty Modelling for Self-attention Networks
☆55Mar 7, 2026Updated 4 months ago
tianyi-lab / C3PO
View on GitHub
[COLM 2025] "C3PO: Critical-Layer, Core-Expert, Collaborative Pathway Optimization for Test-Time Expert Re-Mixing"
☆21Apr 9, 2025Updated last year
allenai / super-benchmark
View on GitHub
☆53Apr 4, 2025Updated last year
tianyi-lab / Superfiltering
View on GitHub
[ACL'24] Superfiltering: Weak-to-Strong Data Filtering for Fast Instruction-Tuning
☆189Jun 25, 2025Updated last year
pixeli99 / MixLN
View on GitHub
[ICLR 2025] Official Pytorch Implementation of "Mix-LN: Unleashing the Power of Deeper Layers by Combining Pre-LN and Post-LN" by Pengxia…
☆30Jul 24, 2025Updated 11 months ago
GPU virtual machines on DigitalOcean Gradient AI • Ad
Get to production fast with high-performance AMD and NVIDIA GPUs you can spin up in seconds. The definition of operational simplicity.
LoserCheems / WonderfulMatrices
View on GitHub
Wonderful Matrices to Build Small Language Models
☆44Feb 15, 2025Updated last year
HashmatShadab / HSAT
View on GitHub
[MICCAI 2025] Hierarchical Self-Supervised Adversarial Training for Robust Vision Models in Histopathology
☆12Jun 17, 2025Updated last year
efficientscaling / Z1
View on GitHub
[EMNLP'25 Industry] Repo for "Z1: Efficient Test-time Scaling with Code"
☆69Apr 11, 2025Updated last year
hananshafi / MTL-ViT
View on GitHub
A new multi-task learning framework using Vision Transformers
☆11Jun 19, 2024Updated 2 years ago
divyakraman / AerialDiffusion
View on GitHub
Codebase for the paper Aerial Diffusion: Text Guided Ground-to-Aerial View Translation from a Single Image using Diffusion Models
☆13Oct 3, 2023Updated 2 years ago
UKPLab / arxiv2025-inherent-limits-plms
View on GitHub
Code repository for the paper "The Inherent Limits of Pretrained LLMs: The Unexpected Convergence of Instruction Tuning and In-Context Le…
☆14Jan 16, 2025Updated last year
MadryLab / bias-transfer
View on GitHub
☆15Jul 24, 2022Updated 3 years ago