NuoJohnChen / JudgeLRM
☆24 Updated 3 weeks ago
Alternatives and similar repositories for JudgeLRM
Users who are interested in JudgeLRM are comparing it to the repositories listed below
- ☆63 Updated this week
- [ICLR 2025] LongPO: Long Context Self-Evolution of Large Language Models through Short-to-Long Preference Optimization ☆35 Updated 2 months ago
- ☆17 Updated 4 months ago
- What Happened in LLMs Layers when Trained for Fast vs. Slow Thinking: A Gradient Perspective ☆63 Updated 2 months ago
- [NAACL 2025 Oral] Multimodal Needle in a Haystack (MMNeedle): Benchmarking Long-Context Capability of Multimodal Large Language Models ☆42 Updated last week
- Official implementation of the paper "MMInA: Benchmarking Multihop Multimodal Internet Agents" ☆43 Updated 2 months ago
- Large Language Models Can Self-Improve in Long-context Reasoning ☆69 Updated 5 months ago
- Codebase for Instruction Following without Instruction Tuning ☆34 Updated 7 months ago
- Code for Heima ☆42 Updated 3 weeks ago
- Knowledge Unlearning for Large Language Models ☆25 Updated last week
- Repo for "Z1: Efficient Test-time Scaling with Code" ☆58 Updated last month
- ☆78 Updated 3 months ago
- The official repository for SkyLadder: Better and Faster Pretraining via Context Window Scheduling ☆29 Updated last month
- Code for paper "Unraveling Cross-Modality Knowledge Conflicts in Large Vision-Language Models." ☆42 Updated 6 months ago
- OpenVLThinker: An Early Exploration to Vision-Language Reasoning via Iterative Self-Improvement ☆83 Updated this week
- Unofficial Implementation of Chain-of-Thought Reasoning Without Prompting ☆32 Updated last year
- Exploration of automated dataset selection approaches at large scales. ☆40 Updated 2 months ago
- NoisyRollout: Reinforcing Visual Reasoning with Data Augmentation ☆53 Updated last week
- This is the official repo of "QuickLLaMA: Query-aware Inference Acceleration for Large Language Models" ☆49 Updated 9 months ago
- ☆31 Updated 4 months ago
- Official Repository of Are Your LLMs Capable of Stable Reasoning? ☆25 Updated last month
- Code for "R2-T2: Re-Routing in Test-Time for Multimodal Mixture-of-Experts" ☆15 Updated 2 months ago
- ☆99 Updated last week
- Code for "Your Mixture-of-Experts LLM Is Secretly an Embedding Model For Free" ☆68 Updated 6 months ago
- ☆22 Updated 5 months ago
- This repo contains code for the paper "Both Text and Images Leaked! A Systematic Analysis of Data Contamination in Multimodal LLM" ☆13 Updated last month
- LongWriter-V: Enabling Ultra-Long and High-Fidelity Generation in Vision-Language Models ☆17 Updated last month
- An official implementation of "Catastrophic Failure of LLM Unlearning via Quantization" (ICLR 2025) ☆26 Updated 2 months ago
- ☆20 Updated 2 months ago
- ☆97 Updated 2 months ago