MingLiiii / Layer_Gradient

What Happened in LLMs Layers when Trained for Fast vs. Slow Thinking: A Gradient Perspective
36 · Updated last week

Related projects

Alternatives and complementary repositories for Layer_Gradient