Paper Reproduction: Google SCoRe (Training Language Models to Self-Correct via Reinforcement Learning)
☆143Sep 21, 2024Updated last year
Alternatives and similar repositories for Google_SCoRe
Users that are interested in Google_SCoRe are comparing it to the libraries listed below.
- This is my attempt to create a Self-Correcting LLM based on the paper Training Language Models to Self-Correct via Reinforcement Learning by g…☆38Jul 9, 2025Updated 8 months ago
- ☆34Oct 31, 2024Updated last year
- Covers everything from the history of NLP to serving, all in a single book.☆25Dec 6, 2025Updated 3 months ago
- hllama is a library which aims to provide a set of utility tools for large language models.☆10Apr 16, 2024Updated last year
- KakaoTalk GPT☆19Apr 9, 2024Updated last year
- Repository for the paper Stream of Search: Learning to Search in Language☆154Feb 3, 2025Updated last year
- Evaluating language model responses using a reward model☆29Feb 23, 2024Updated 2 years ago
- Korean datasets available on Hugging Face☆36Oct 10, 2024Updated last year
- GUIEvalKit: Open-source Evaluation Toolkit for GUI Agents☆19Feb 26, 2026Updated 3 weeks ago
- Code for NAACL 2025 paper "AdaCAD: Adaptively Decoding to Balance Conflicts between Contextual and Parametric Knowledge"☆17Mar 2, 2026Updated 3 weeks ago
- Korean datasets for each of the 3 steps of ChatGPT's RLHF training☆41Nov 21, 2023Updated 2 years ago
- This is an implementation of the paper Improve Mathematical Reasoning in Language Models by Automated Process Supervision from google de…☆44Jul 8, 2025Updated 8 months ago
- Distilling Task-Specific Knowledge from Teacher Model into BiLSTM☆31Dec 14, 2024Updated last year
- Full Stack SolarLLM Zero to All☆169Mar 1, 2025Updated last year
- SCoRe: Training Language Models to Self-Correct via Reinforcement Learning☆16Jan 24, 2025Updated last year
- [EMNLP 2024] Source code for the paper "Learning Planning-based Reasoning with Trajectory Collection and Process Rewards Synthesizing".☆83Jan 14, 2025Updated last year
- ☆24Dec 2, 2023Updated 2 years ago
- GenRM-CoT: Data release for verification rationales☆67Oct 16, 2024Updated last year
- LCA-on-the-line (ICML 2024 Oral)☆13Feb 13, 2025Updated last year
- ☆102Dec 22, 2023Updated 2 years ago
- This repository contains the replication of the iGSM dataset generation process from the Physics of LLM paper by Zeyuan Allen-Zhu.☆17Sep 13, 2024Updated last year
- Difference-based Contrastive Learning for Korean Sentence Embeddings☆23Mar 11, 2026Updated 2 weeks ago
- The Universe of Evaluation. All about the evaluation for LLMs.☆233Jul 9, 2024Updated last year
- This is the implementation of LeCo☆32Jan 20, 2025Updated last year
- KoCommonGEN v2: A Benchmark for Navigating Korean Commonsense Reasoning Challenges in Large Language Models☆25Aug 24, 2024Updated last year
- Easy Instruction for DPO training☆24Oct 9, 2024Updated last year
- A recipe for online RLHF and online iterative DPO.☆545Dec 28, 2024Updated last year
- A series of technical reports on Slow Thinking with LLM☆761Aug 13, 2025Updated 7 months ago
- ☆130Jun 18, 2024Updated last year
- [COLM 2025] Official code for "When To Solve, When To Verify: Compute-Optimal Problem Solving and Generative Verification for LLM Reasoni…☆15Oct 31, 2025Updated 4 months ago
- ☆28Mar 5, 2024Updated 2 years ago
- The official repo for "AceCoder: Acing Coder RL via Automated Test-Case Synthesis" [ACL25]☆99Apr 9, 2025Updated 11 months ago
- This is the official implementation of the paper "S²R: Teaching LLMs to Self-verify and Self-correct via Reinforcement Learning"☆73Apr 22, 2025Updated 11 months ago
- Git for "Stepwise Self-Consistent Mathematical Reasoning with Large Language Models"☆12Nov 26, 2024Updated last year
- Official implementation of our paper "Finetuned Multimodal Language Models are High-Quality Image-Text Data Filters".☆69Apr 14, 2025Updated 11 months ago
- [NeurIPS 2024] Calibrated Self-Rewarding Vision Language Models☆87Oct 26, 2025Updated 4 months ago
- the datasets of our paper☆11Feb 26, 2024Updated 2 years ago
- I wanted to create a space for doing prompt engineering.☆32Sep 10, 2024Updated last year
- BERT score for text generation☆12Jan 15, 2025Updated last year