pzs19 / LEMMA
☆14 · Updated 2 weeks ago
Alternatives and similar repositories for LEMMA
Users interested in LEMMA also compare it to the repositories listed below.
- ☆10 · Updated 5 months ago
- MathFusion: Enhancing Mathematical Problem-solving of LLM through Instruction Fusion (ACL 2025) ☆29 · Updated 2 months ago
- [ACL 2025 Findings] Official implementation of the paper "Unveiling the Key Factors for Distilling Chain-of-Thought Reasoning". ☆19 · Updated 6 months ago
- The Code and Script of "David's Slingshot: A Strategic Coordination Framework of Small LLMs Matches Large LLMs in Data Synthesis" ☆34 · Updated 3 months ago
- Source code for our paper: "ARIA: Training Language Agents with Intention-Driven Reward Aggregation". ☆20 · Updated last month
- RAG-RewardBench: Benchmarking Reward Models in Retrieval Augmented Generation for Preference Alignment ☆16 · Updated 9 months ago
- ☆22 · Updated last year
- ☆14 · Updated 9 months ago
- [EMNLP 2025] CompassVerifier: A Unified and Robust Verifier for LLMs Evaluation and Outcome Reward ☆47 · Updated last month
- Entropy-Driven GRPO with Guided Error Correction for Advantage Diversity ☆17 · Updated 3 weeks ago
- HelloBench: Evaluating Long Text Generation Capabilities of Large Language Models ☆51 · Updated 9 months ago
- R1-Searcher++: Incentivizing the Dynamic Knowledge Acquisition of LLMs via Reinforcement Learning ☆61 · Updated 3 months ago
- ☆21 · Updated 5 months ago
- TreeRL: LLM Reinforcement Learning with On-Policy Tree Search in ACL'25 ☆60 · Updated 3 months ago
- ☆13 · Updated 7 months ago
- ☆45 · Updated last week
- The official repository of paper "AdaR1: From Long-Cot to Hybrid-CoT via Bi-Level Adaptive Reasoning Optimization" ☆18 · Updated 4 months ago
- ☆23 · Updated 9 months ago
- The official code repository for the paper "Mirage or Method? How Model–Task Alignment Induces Divergent RL Conclusions". ☆12 · Updated 2 weeks ago
- ☆21 · Updated 4 months ago
- Leveraging Outputs of Large Language Model as Feedback for Dynamic Reranking in Retrieval-Augmented Generation ☆36 · Updated 2 months ago
- A comprehensive and efficient long-context model evaluation framework ☆18 · Updated last month
- ☆24 · Updated 4 months ago
- Official Repo for SvS: A Self-play with Variational Problem Synthesis strategy for RLVR training ☆36 · Updated 3 weeks ago
- [AAAI 2024] SciEval: A Multi-Level Large Language Model Evaluation Benchmark for Scientific Research ☆29 · Updated last year
- ☆25 · Updated 10 months ago
- [ACL 2025] Are Your LLMs Capable of Stable Reasoning? ☆30 · Updated last month
- ☆18 · Updated last month
- [🏆AAAI2025] Official Repo for ChemVLM: Exploring the Power of Multimodal Large Language Models in Chemistry Area. ☆53 · Updated 2 weeks ago
- xVerify: Efficient Answer Verifier for Reasoning Model Evaluations ☆128 · Updated 5 months ago