[ACL'25 Oral] What Happened in LLMs Layers when Trained for Fast vs. Slow Thinking: A Gradient Perspective
☆76Jun 25, 2025Updated 8 months ago
Alternatives and similar repositories for Layer_Gradient
Users that are interested in Layer_Gradient are comparing it to the libraries listed below
Sorting:
- [ACL'25] Mosaic-IT: Cost-Free Compositional Data Synthesis for Instruction Tuning☆20Sep 27, 2025Updated 5 months ago
- [TACL] Do Vision and Language Models Share Concepts? A Vector Space Alignment Study☆16Nov 22, 2024Updated last year
- [ACL 2025 Findings] Implicit Reasoning in Transformers is Reasoning through Shortcuts☆17Mar 11, 2025Updated 11 months ago
- [NAACL'25] RuleR: Improving LLM Controllability by Rule-based Data Recycling☆14Sep 27, 2025Updated 5 months ago
- Initialization using Update Approximation is a Silver Bullet for Extremely Efficient Low-Rank Fine-Tuning☆52Oct 17, 2025Updated 4 months ago
- The official implementation of Preference Data Reward-Augmentation.☆18May 1, 2025Updated 10 months ago
- [ACL 2025] Are Your LLMs Capable of Stable Reasoning?☆32Aug 5, 2025Updated 6 months ago
- [ICLR 2026] Fast-Slow Toolpath Agent with Subroutine Mining for Efficient Multi-turn Image Editing☆29Feb 6, 2026Updated 3 weeks ago
- ☆49Apr 4, 2025Updated 10 months ago
- [ICLR 2025] Official Pytorch Implementation of "Mix-LN: Unleashing the Power of Deeper Layers by Combining Pre-LN and Post-LN" by Pengxia…☆29Jul 24, 2025Updated 7 months ago
- (CVPR 2025) Official implementation to DELT: A Simple Diversity-driven EarlyLate Training for Dataset Distillation which outperforms SOTA…☆26Aug 23, 2025Updated 6 months ago
- ☆14Apr 14, 2025Updated 10 months ago
- An Open-source Factuality Evaluation Demo for LLMs☆32Feb 23, 2026Updated last week
- [ECCV 2024] "REVISION: Rendering Tools Enable Spatial Fidelity in Vision-Language Models"☆13Aug 6, 2024Updated last year
- A new multi-task learning framework using Vision Transformers☆11Jun 19, 2024Updated last year
- A Novel Semantic Segmentation Network using Enhanced Boundaries in Cluttered Scenes (WACV 2025)☆11Aug 11, 2025Updated 6 months ago
- [ICML 2025] LaCache: Ladder-Shaped KV Caching for Efficient Long-Context Modeling of Large Language Models☆17Nov 4, 2025Updated 3 months ago
- [MICCAI 2024] Official code for the paper "MedContext: Learning Contextual Cues for Efficient Volumetric Medical Segmentation"☆14Nov 1, 2024Updated last year
- Official repository for Trustworthy Alignment of Retrieval-Augmented Large Language Models via Reinforcement Learning☆12Sep 2, 2024Updated last year
- Code for paper: Optimizing Length Compression in Large Reasoning Models☆27Oct 20, 2025Updated 4 months ago
- [MICCAI 2025] Hierarchical Self-Supervised Adversarial Training for Robust Vision Models in Histopathology☆12Jun 17, 2025Updated 8 months ago
- LoRA-Ensemble: Efficient Uncertainty Modelling for Self-attention Networks☆54Sep 28, 2025Updated 5 months ago
- Official code repository of paper titled "Test-Time Low Rank Adaptation via Confidence Maximization for Zero-Shot Generalization of Visio…☆32May 11, 2025Updated 9 months ago
- ☆43May 6, 2024Updated last year
- [ICLR 2026] Official repo for "Spotlight on Token Perception for Multimodal Reinforcement Learning"☆49Jan 30, 2026Updated last month
- Mixture-of-Basis-Experts for Compressing MoE-based LLMs☆29Dec 24, 2025Updated 2 months ago
- AIRS-Bench: an AI Research Science benchmark for quantifying the end-to-end AI research abilities of LLM agents☆62Updated this week
- LMM for VQA, tcsvt version☆11Jul 19, 2024Updated last year
- Code repository for the paper "The Inherent Limits of Pretrained LLMs: The Unexpected Convergence of Instruction Tuning and In-Context Le…☆13Jan 16, 2025Updated last year
- ☆12Dec 4, 2024Updated last year
- ☆21Oct 22, 2025Updated 4 months ago
- [ICML2022] "Identity-Disentangled Adversarial Augmentation for Self-Supervised Learning"☆10Jul 24, 2022Updated 3 years ago
- Codes for our paper "AgentMonitor: A Plug-and-Play Framework for Predictive and Secure Multi-Agent Systems"☆13Dec 13, 2024Updated last year
- [ICCV 2025] Diffusion Curriculum (DisCL)☆18Sep 26, 2025Updated 5 months ago
- An official repository for GPTailor☆17Jun 29, 2025Updated 8 months ago
- Statewide Visual Geolocalization in the Wild (ECCV 2024)☆73Dec 2, 2024Updated last year
- Kaggle AIMO2 solution with token-efficient reasoning LLM recipes☆42Aug 7, 2025Updated 6 months ago
- [ICLR'25 Oral] MMIE: Massive Multimodal Interleaved Comprehension Benchmark for Large Vision-Language Models☆35Nov 3, 2024Updated last year
- [ECCV 2024] Soft Prompt Generation for Domain Generalization☆31Oct 1, 2024Updated last year