waltonfuture / Diff-eRank
[NeurIPS 2024] A Novel Rank-Based Metric for Evaluating Large Language Models
☆46Updated 3 weeks ago
Alternatives and similar repositories for Diff-eRank
Users that are interested in Diff-eRank are comparing it to the libraries listed below
- A Sober Look at Language Model Reasoning☆74Updated last week
- ☆109Updated 3 months ago
- [ICLR 2025] SuperCorrect: Advancing Small LLM Reasoning with Thought Template Distillation and Self-Correction☆72Updated 3 months ago
- ☆43Updated 3 months ago
- One-shot Entropy Minimization☆149Updated last week
- Official Implementation for EMNLP 2024 (main) "AgentReview: Exploring Academic Peer Review with LLM Agent."☆70Updated 7 months ago
- ☆62Updated last week
- ARM: Adaptive Reasoning Model☆40Updated last week
- This is the official implementation of "DAPE: Data-Adaptive Positional Encoding for Length Extrapolation"☆38Updated 8 months ago
- [ICLR 2025] LongPO: Long Context Self-Evolution of Large Language Models through Short-to-Long Preference Optimization☆37Updated 3 months ago
- Code for "CREAM: Consistency Regularized Self-Rewarding Language Models", ICLR 2025.☆22Updated 4 months ago
- GitHub repository for "Bring Reason to Vision: Understanding Perception and Reasoning through Model Merging" (ICML 2025)☆60Updated 3 weeks ago
- A comprehensive collection of learning from rewards in the post-training and test-time scaling of LLMs, with a focus on both reward model…☆47Updated last week
- Official repository for paper "DeepCritic: Deliberate Critique with Large Language Models"☆30Updated last month
- Official codebase for "GenPRM: Scaling Test-Time Compute of Process Reward Models via Generative Reasoning".☆75Updated 3 weeks ago
- ☆46Updated 4 months ago
- CoT-Valve: Length-Compressible Chain-of-Thought Tuning☆73Updated 4 months ago
- A curated list of awesome LLM Inference-Time Self-Improvement (ITSI, pronounced "itsy") papers from our recent survey: A Survey on Large …☆80Updated 6 months ago
- ☆46Updated 2 months ago
- Code for ICLR 2025 Paper "What is Wrong with Perplexity for Long-context Language Modeling?"☆88Updated last month
- [ICLR 2025] When Attention Sink Emerges in Language Models: An Empirical View (Spotlight)☆88Updated 8 months ago
- [NeurIPS 2024] Code and Data Repo for Paper "Embedding Trajectory for Out-of-Distribution Detection in Mathematical Reasoning"☆26Updated last year
- The source code of "Merging Experts into One: Improving Computational Efficiency of Mixture of Experts (EMNLP 2023)"☆39Updated last year
- Implementation for the paper "The Surprising Effectiveness of Negative Reinforcement in LLM Reasoning"☆59Updated 2 weeks ago
- This is the implementation of LeCo☆31Updated 5 months ago
- Extensive Self-Contrast Enables Feedback-Free Language Model Alignment☆21Updated last year
- [ICLR 2025 Workshop] "Landscape of Thoughts: Visualizing the Reasoning Process of Large Language Models"☆25Updated last week
- ☆29Updated 2 months ago
- This repo contains evaluation code for the paper "MileBench: Benchmarking MLLMs in Long Context"☆35Updated 11 months ago
- The code and data for the paper JiuZhang3.0☆47Updated last year