IAAR-Shanghai / xVerify
xVerify: Efficient Answer Verifier for Large Language Model Evaluations
☆15 Updated this week
Alternatives and similar repositories for xVerify:
Users who are interested in xVerify are comparing it to the repositories listed below.
- Controllable Text Generation for Large Language Models: A Survey ☆164 Updated 7 months ago
- Awesome-Long2short-on-LRMs is a collection of state-of-the-art, novel, exciting long2short methods on large reasoning models. It contains… ☆171 Updated this week
- SOTA RL fine-tuning solution for advanced math reasoning of LLM ☆92 Updated this week
- [ACL 2024 Main] NewsBench: A Systematic Evaluation Framework for Assessing Editorial Capabilities of Large Language Models in Chinese Jou… ☆29 Updated 9 months ago
- Large Language Models (LLMs) of Code ☆18 Updated last year
- This is the repository that contains the source code for the Self-Evaluation Guided MCTS for online DPO. ☆301 Updated 7 months ago
- ☆216 Updated this week
- Chain of Thoughts (CoT) is so hot! so long! We need short reasoning process! ☆43 Updated 2 weeks ago
- A curated list of personalized alignment resources (continually updated). ☆12 Updated this week
- Fast Memorization of Prompt Improves Context Awareness of Large Language Models (Findings of EMNLP 2024) ☆20 Updated 5 months ago
- ☆507 Updated 2 months ago
- ☆22 Updated last month
- ☆105 Updated 6 months ago
- The awesome agents in the era of large language models ☆60 Updated last year
- Implementation for the research paper "Enhancing LLM Reasoning via Critique Models with Test-Time and Training-Time Supervision". ☆52 Updated 4 months ago
- A list of awesome papers on LLM tool learning. ☆22 Updated 8 months ago
- [ACL 2024] User-friendly evaluation framework: Eval Suite & Benchmarks: UHGEval, HaluEval, HalluQA, etc. ☆161 Updated 4 months ago
- OpenRFT: Adapting Reasoning Foundation Model for Domain-specific Tasks with Reinforcement Fine-Tuning ☆125 Updated 3 months ago
- A series of technical reports on Slow Thinking with LLM ☆595 Updated last week
- The official repo for the paper "LLMs-as-Judges: A Comprehensive Survey on LLM-based Evaluation Methods". ☆321 Updated 3 months ago
- A Survey on Efficient Reasoning for LLMs ☆204 Updated this week
- The demo, code and data of FollowRAG ☆70 Updated 3 months ago
- ReST-MCTS*: LLM Self-Training via Process Reward Guided Tree Search (NeurIPS 2024) ☆597 Updated 2 months ago
- The related works and background techniques about OpenAI o1 ☆217 Updated 2 months ago
- ☆277 Updated 3 weeks ago
- Awesome RL-based LLM Reasoning ☆352 Updated this week
- [NAACL'24] Self-data filtering of LLM instruction-tuning data using a novel perplexity-based difficulty score, without using any other mo… ☆351 Updated 6 months ago
- This is my attempt to create Self-Correcting-LLM based on the paper Training Language Models to Self-Correct via Reinforcement Learning by g… ☆32 Updated 3 months ago
- [EMNLP 2024] The official GitHub repo for the survey paper "Knowledge Conflicts for LLMs: A Survey" ☆110 Updated 6 months ago
- The official implementation of the paper "AgentSquare: Automatic LLM Agent Search in Modular Design Space" ☆165 Updated 2 weeks ago