VILA-Lab / Open-LLM-Leaderboard
Open-LLM-Leaderboard: Open-Style Question Evaluation. Paper at https://arxiv.org/abs/2406.07545
☆46Updated last year
Alternatives and similar repositories for Open-LLM-Leaderboard
Users who are interested in Open-LLM-Leaderboard are comparing it to the repositories listed below
- SLED: Self Logits Evolution Decoding for Improving Factuality in Large Language Models https://arxiv.org/pdf/2411.02433 ☆28Updated 8 months ago
- [NAACL 2025] A Closer Look into Mixture-of-Experts in Large Language Models☆52Updated 6 months ago
- [NeurIPS 2024] A Novel Rank-Based Metric for Evaluating Large Language Models☆51Updated 2 months ago
- [ACL'25 Oral] What Happened in LLMs Layers when Trained for Fast vs. Slow Thinking: A Gradient Perspective☆71Updated last month
- [ICML 2025] Predictive Data Selection: The Data That Predicts Is the Data That Teaches☆53Updated 5 months ago
- Co-LLM: Learning to Decode Collaboratively with Multiple Language Models☆116Updated last year
- Code and Data for "Long-context LLMs Struggle with Long In-context Learning" [TMLR2025]☆105Updated 5 months ago
- ☆118Updated 4 months ago
- ☆65Updated last year
- ☆103Updated 8 months ago
- [NeurIPS 2024] Code and Data Repo for Paper "Embedding Trajectory for Out-of-Distribution Detection in Mathematical Reasoning"☆27Updated last year
- ☆127Updated 2 months ago
- Official PyTorch Implementation of EMoE: Unlocking Emergent Modularity in Large Language Models [main conference @ NAACL2024]☆32Updated last year
- ☆13Updated last year
- [ICLR 2025] MiniPLM: Knowledge Distillation for Pre-Training Language Models☆54Updated 8 months ago
- Test-time preference optimization (ICML 2025).☆158Updated 3 months ago
- The source code of "Merging Experts into One: Improving Computational Efficiency of Mixture of Experts (EMNLP 2023)"☆38Updated last year
- [NeurIPS 2024] CharXiv: Charting Gaps in Realistic Chart Understanding in Multimodal LLMs☆123Updated 3 months ago
- ☆142Updated last year
- [ICML 2025] Teaching Language Models to Critique via Reinforcement Learning☆108Updated 3 months ago
- Scalable Meta-Evaluation of LLMs as Evaluators☆42Updated last year
- This is the implementation for the paper "LARGE LANGUAGE MODEL CASCADES WITH MIXTURE OF THOUGHT REPRESENTATIONS FOR COST-EFFICIENT REA…☆24Updated last year
- An Easy-to-use Hallucination Detection Framework for LLMs.☆60Updated last year
- A dataset of LLM-generated chain-of-thought steps annotated with mistake location.☆81Updated last year
- Code implementation of synthetic continued pretraining☆123Updated 7 months ago
- [NeurIPS 2024] The official implementation of paper: Chain of Preference Optimization: Improving Chain-of-Thought Reasoning in LLMs.☆127Updated 4 months ago
- [NeurIPS 2024] Knowledge Circuits in Pretrained Transformers☆153Updated 5 months ago
- Codebase for Instruction Following without Instruction Tuning☆35Updated 10 months ago
- ☆50Updated 5 months ago
- [ICLR 2025] LongPO: Long Context Self-Evolution of Large Language Models through Short-to-Long Preference Optimization☆40Updated 5 months ago