LeonDiao0427 / SEAS
We release our code and data for SEAS in this repository.
☆20 Updated 9 months ago
Alternatives and similar repositories for SEAS
Users who are interested in SEAS are comparing it to the repositories listed below.
- The reinforcement learning codes for dataset SPA-VL ☆36 Updated last year
- ☆35 Updated 11 months ago
- A comprehensive collection of process reward models. ☆108 Updated 2 months ago
- The official GitHub repository of the paper "Recent advances in large language model benchmarks against data contamination: From static t… ☆45 Updated 2 weeks ago
- 【ACL 2024】 SALAD benchmark & MD-Judge ☆161 Updated 6 months ago
- ☆22 Updated 11 months ago
- This is the repository of DEER, a Dynamic Early Exit in Reasoning method for Large Reasoning Language Models. ☆171 Updated 2 months ago
- ☆51 Updated last year
- ☆269 Updated 2 months ago
- Implementation for the research paper "Enhancing LLM Reasoning via Critique Models with Test-Time and Training-Time Supervision". ☆56 Updated 9 months ago
- [2025-TMLR] A Survey on the Honesty of Large Language Models ☆59 Updated 9 months ago
- (ACL 2025) 🔥🔥🔥Code for "Empowering Multimodal Large Language Models with Evol-Instruct" ☆17 Updated 4 months ago
- Flames is a highly adversarial benchmark in Chinese for LLM's harmlessness evaluation developed by Shanghai AI Lab and Fudan NLP Group. ☆61 Updated last year
- S-Eval: Towards Automated and Comprehensive Safety Evaluation for Large Language Models ☆96 Updated 2 months ago
- Awesome-Long2short-on-LRMs is a collection of state-of-the-art, novel, exciting long2short methods on large reasoning models. It contains… ☆246 Updated last month
- Official repository for "Safety in Large Reasoning Models: A Survey" - Exploring safety risks, attacks, and defenses for Large Reasoning … ☆70 Updated last month
- ☆49 Updated 7 months ago
- Code for "CREAM: Consistency Regularized Self-Rewarding Language Models", ICLR 2025. ☆26 Updated 7 months ago
- Chain of Thoughts (CoT) is so hot! so long! We need short reasoning process! ☆69 Updated 5 months ago
- Official code for the paper, "Stop Summation: Min-Form Credit Assignment Is All Process Reward Model Needs for Reasoning" ☆136 Updated 2 months ago
- ☆82 Updated last year
- Official codebase for "GenPRM: Scaling Test-Time Compute of Process Reward Models via Generative Reasoning". ☆81 Updated 3 months ago
- [NeurIPS 2024 Oral] Aligner: Efficient Alignment by Learning to Correct ☆185 Updated 8 months ago
- Official implementation of GUI-R1 : A Generalist R1-Style Vision-Language Action Model For GUI Agents ☆184 Updated 4 months ago
- Extrapolating RLVR to General Domains without Verifiers ☆163 Updated last month
- CoT-Valve: Length-Compressible Chain-of-Thought Tuning ☆86 Updated 7 months ago
- [ACL 2025] A Neural-Symbolic Self-Training Framework ☆113 Updated 3 months ago
- [ICML 2025] Official Implementation of GLIDER ☆57 Updated 4 months ago
- [ACL 2025] Data and Code for Paper VLSBench: Unveiling Visual Leakage in Multimodal Safety ☆50 Updated 2 months ago
- Official repository for "CODI: Compressing Chain-of-Thought into Continuous Space via Self-Distillation" ☆24 Updated 3 weeks ago