MingLiiii / Layer_Gradient
What Happened in LLMs Layers when Trained for Fast vs. Slow Thinking: A Gradient Perspective
☆36 · Updated last week
Related projects
Alternatives and complementary repositories for Layer_Gradient
- LongEmbed: Extending Embedding Models for Long Context Retrieval (EMNLP 2024) ☆114 · Updated this week
- ☆62 · Updated last month
- Co-LLM: Learning to Decode Collaboratively with Multiple Language Models ☆102 · Updated 6 months ago
- Code for the EMNLP 2024 paper "Detecting and Mitigating Contextual Hallucinations in Large Language Models Using Only Attention Maps" ☆109 · Updated 2 months ago
- The official implementation of "Ada-LEval: Evaluating long-context LLMs with length-adaptable benchmarks" ☆50 · Updated 6 months ago
- Code for In-context Vectors: Making In Context Learning More Effective and Controllable Through Latent Space Steering ☆143 · Updated 3 weeks ago
- [NeurIPS 2024] A task generation and model evaluation system for multimodal language models. ☆57 · Updated last month
- ☆89 · Updated 4 months ago
- [ACL 2024] Not All Experts are Equal: Efficient Expert Pruning and Skipping for Mixture-of-Experts Large Language Models ☆66 · Updated 5 months ago
- Code and Data for "Long-context LLMs Struggle with Long In-context Learning" ☆91 · Updated 4 months ago
- Codes for Visual Sketchpad: Sketching as a Visual Chain of Thought for Multimodal Language Models ☆122 · Updated 2 weeks ago
- [NeurIPS 24 Spotlight] MaskLLM: Learnable Semi-structured Sparsity for Large Language Models ☆111 · Updated last week
- ☆126 · Updated 5 months ago
- This is the official repository for Inheritune. ☆105 · Updated last month
- Code for Math-LLaVA: Bootstrapping Mathematical Reasoning for Multimodal Large Language Models ☆67 · Updated 4 months ago
- E5-V: Universal Embeddings with Multimodal Large Language Models ☆169 · Updated 3 months ago
- Parameter-Efficient Sparsity Crafting From Dense to Mixture-of-Experts for Instruction Tuning on General Tasks ☆129 · Updated last month
- ☆150 · Updated 9 months ago
- Code and example data for the paper: Rule Based Rewards for Language Model Safety ☆153 · Updated 3 months ago
- Official repository for Montessori-Instruct: Generate Influential Training Data Tailored for Student Learning ☆32 · Updated 3 weeks ago
- A simple unified framework for evaluating LLMs ☆138 · Updated this week
- SuperCorrect: Supervising and Correcting Language Models with Error-Driven Insights ☆34 · Updated 3 weeks ago
- Conifer: Improving Complex Constrained Instruction-Following Ability of Large Language Models ☆80 · Updated 7 months ago
- Code for PHATGOOSE introduced in "Learning to Route Among Specialized Experts for Zero-Shot Generalization" ☆78 · Updated 8 months ago
- Code & Dataset for Paper: "Distill Visual Chart Reasoning Ability from LLMs to MLLMs" ☆29 · Updated 2 weeks ago
- The official implementation of Self-Exploring Language Models (SELM) ☆56 · Updated 5 months ago
- Resources for our paper: "EvoAgent: Towards Automatic Multi-Agent Generation via Evolutionary Algorithms" ☆75 · Updated 3 weeks ago
- [ICML 2024 Oral] Official code repository for MLLM-as-a-Judge. ☆54 · Updated 3 months ago
- ☆116 · Updated 5 months ago
- Official repository for paper "GTA: A Benchmark for General Tool Agents" (NeurIPS 2024 D&B Track) ☆43 · Updated this week