waltonfuture / Diff-eRank
[NeurIPS 2024] A Novel Rank-Based Metric for Evaluating Large Language Models
☆51 · Updated 2 months ago
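The repository's title refers to a rank-based metric for model representations. As an illustration only (not the repository's own code), one common notion of "effective rank" is the exponential of the Shannon entropy of a matrix's normalized singular-value spectrum. A minimal NumPy sketch, assuming token representations are stacked row-wise in `X`:

```python
import numpy as np

def effective_rank(X):
    """Effective rank: exp of the entropy of the normalized singular-value spectrum."""
    s = np.linalg.svd(X, compute_uv=False)  # singular values of the representation matrix
    p = s / s.sum()                         # normalize the spectrum into a distribution
    p = p[p > 0]                            # drop zeros to avoid log(0)
    entropy = -(p * np.log(p)).sum()        # Shannon entropy of the spectrum
    return float(np.exp(entropy))           # ranges from 1 (rank-1) up to the true rank
```

For a full-rank matrix with equal singular values the result equals the matrix dimension, while a rank-1 matrix yields 1; how Diff-eRank aggregates or differences such quantities is defined in the repository itself.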
Alternatives and similar repositories for Diff-eRank
Users interested in Diff-eRank are comparing it to the repositories listed below.
- ☆117 · Updated 4 months ago
- CoT-Valve: Length-Compressible Chain-of-Thought Tuning ☆81 · Updated 5 months ago
- [ICLR 2025] When Attention Sink Emerges in Language Models: An Empirical View (Spotlight) ☆107 · Updated last month
- Official Implementation for EMNLP 2024 (main) "AgentReview: Exploring Academic Peer Review with LLM Agent." ☆83 · Updated 8 months ago
- [ICLR 2025] SuperCorrect: Advancing Small LLM Reasoning with Thought Template Distillation and Self-Correction ☆76 · Updated 4 months ago
- ☆140 · Updated 2 months ago
- ☆323 · Updated last week
- A Sober Look at Language Model Reasoning ☆81 · Updated last month
- RM-R1: Unleashing the Reasoning Potential of Reward Models ☆120 · Updated last month
- One-shot Entropy Minimization ☆175 · Updated last month
- ☆67 · Updated last month
- [ICML'25] Our study systematically investigates massive values in LLMs' attention mechanisms. First, we observe massive values are concen… ☆75 · Updated last month
- 📖 This is a repository for organizing papers, codes, and other resources related to Latent Reasoning. ☆171 · Updated last week
- A curated list of awesome LLM Inference-Time Self-Improvement (ITSI, pronounced "itsy") papers from our recent survey: A Survey on Large … ☆88 · Updated 7 months ago
- A comprehensive collection of learning from rewards in the post-training and test-time scaling of LLMs, with a focus on both reward model… ☆52 · Updated last month
- ☆126 · Updated 2 months ago
- [NeurIPS 2024] 📈 Scaling Laws with Vocabulary: Larger Models Deserve Larger Vocabularies https://arxiv.org/abs/2407.13623 ☆86 · Updated 10 months ago
- [NeurIPS 2024] MATH-Vision dataset and code to measure multimodal mathematical reasoning capabilities. ☆111 · Updated 2 months ago
- JudgeLRM: Large Reasoning Models as a Judge ☆32 · Updated 3 months ago
- [ACL'25] We propose a novel fine-tuning method, Separate Memory and Reasoning, which combines prompt tuning with LoRA. ☆68 · Updated 2 weeks ago
- ☆50 · Updated 5 months ago
- [ICML 2024] Unveiling and Harnessing Hidden Attention Sinks: Enhancing Large Language Models without Training through Attention Calibrati… ☆40 · Updated last year
- MiroMind-M1 is a fully open-source series of reasoning language models built on Qwen-2.5, focused on advancing mathematical reasoning. ☆106 · Updated this week
- ☆155 · Updated 2 months ago
- End-to-End Reinforcement Learning for Multi-Turn Tool-Integrated Reasoning ☆162 · Updated this week
- repo for paper https://arxiv.org/abs/2504.13837 ☆180 · Updated last month
- X-Reasoner: Towards Generalizable Reasoning Across Modalities and Domains ☆47 · Updated 3 months ago
- ☆101 · Updated last month
- 🚀 LLaMA-MoE v2: Exploring Sparsity of LLaMA from Perspective of Mixture-of-Experts with Post-Training ☆86 · Updated 8 months ago
- [ICLR 2025] Dynamic Mixture of Experts: An Auto-Tuning Approach for Efficient Transformer Models ☆121 · Updated last month