MingLiiii / Layer_Gradient
What Happened in LLMs Layers when Trained for Fast vs. Slow Thinking: A Gradient Perspective
☆52 · Updated 2 months ago
Alternatives and similar repositories for Layer_Gradient:
Users interested in Layer_Gradient are comparing it to the repositories listed below:
- The official implementation of "Ada-LEval: Evaluating long-context LLMs with length-adaptable benchmarks" ☆52 · Updated 8 months ago
- Large Language Models Can Self-Improve in Long-context Reasoning ☆61 · Updated last month
- [NeurIPS 2024] The official implementation of paper: Chain of Preference Optimization: Improving Chain-of-Thought Reasoning in LLMs. ☆88 · Updated 3 months ago
- Co-LLM: Learning to Decode Collaboratively with Multiple Language Models ☆106 · Updated 8 months ago
- Code for the EMNLP 2024 paper "Detecting and Mitigating Contextual Hallucinations in Large Language Models Using Only Attention Maps" ☆117 · Updated 5 months ago
- [COLING'25] Exploring Concept Depth: How Large Language Models Acquire Knowledge at Different Layers? ☆61 · Updated last month
- HelloBench: Evaluating Long Text Generation Capabilities of Large Language Models ☆36 · Updated last month
- Official repository for Montessori-Instruct: Generate Influential Training Data Tailored for Student Learning ☆39 · Updated 2 months ago
- [NeurIPS 2024] A task generation and model evaluation system for multimodal language models. ☆61 · Updated last month
- [NeurIPS 2024] Can LLMs Learn by Teaching for Better Reasoning? A Preliminary Study ☆40 · Updated last month
- SuperCorrect: Supervising and Correcting Language Models with Error-Driven Insights ☆45 · Updated 3 months ago
- Code release for "SPIQA: A Dataset for Multimodal Question Answering on Scientific Papers" [NeurIPS D&B, 2024] ☆48 · Updated this week
- LongEmbed: Extending Embedding Models for Long Context Retrieval (EMNLP 2024) ☆126 · Updated 2 months ago
- Code for In-context Vectors: Making In Context Learning More Effective and Controllable Through Latent Space Steering ☆154 · Updated 3 months ago
- Auto Interpretation Pipeline and many other functionalities for Multimodal SAE Analysis. ☆97 · Updated 2 weeks ago
- ☆69 · Updated this week
- Code and Data for "Long-context LLMs Struggle with Long In-context Learning" ☆97 · Updated 6 months ago
- Resources for our paper: "EvoAgent: Towards Automatic Multi-Agent Generation via Evolutionary Algorithms" ☆77 · Updated 2 months ago
- ☆56 · Updated 4 months ago
- The HELMET Benchmark ☆103 · Updated this week
- Official Code Repository for LM-Steer Paper: "Word Embeddings Are Steers for Language Models" (ACL 2024 Outstanding Paper Award) ☆73 · Updated 3 months ago
- ☆131 · Updated 7 months ago
- Code for Math-LLaVA: Bootstrapping Mathematical Reasoning for Multimodal Large Language Models ☆76 · Updated 6 months ago
- Public code repo for paper "SaySelf: Teaching LLMs to Express Confidence with Self-Reflective Rationales" ☆97 · Updated 3 months ago
- [ATTRIB @ NeurIPS 2024 Oral] When Attention Sink Emerges in Language Models: An Empirical View ☆43 · Updated 3 months ago
- [NeurIPS-2024] 📈 Scaling Laws with Vocabulary: Larger Models Deserve Larger Vocabularies https://arxiv.org/abs/2407.13623 ☆75 · Updated 3 months ago
- Official repository for paper "Weak-to-Strong Extrapolation Expedites Alignment" ☆71 · Updated 7 months ago
- ☆93 · Updated 6 months ago
- [NeurIPS 2024] Knowledge Circuits in Pretrained Transformers ☆120 · Updated last month
- This is the official repository for Inheritune. ☆108 · Updated 3 months ago