Paper reproduction of Google SCoRe (Training Language Models to Self-Correct via Reinforcement Learning)
☆143Sep 21, 2024Updated last year
Alternatives and similar repositories for Google_SCoRe
Users interested in Google_SCoRe are comparing it to the repositories listed below.
- This is my attempt to create a Self-Correcting-LLM based on the paper Training Language Models to Self-Correct via Reinforcement Learning by g…☆38Jul 9, 2025Updated 9 months ago
- ☆34Oct 31, 2024Updated last year
- Covers everything from the history of NLP to serving, all in a single book.☆25Dec 6, 2025Updated 4 months ago
- hllama is a library which aims to provide a set of utility tools for large language models.☆10Apr 16, 2024Updated last year
- A collection of various LLM tuning methods and data preprocessing code.☆14Feb 23, 2026Updated last month
- KakaoTalk GPT☆19Apr 9, 2024Updated 2 years ago
- Repository for the paper Stream of Search: Learning to Search in Language☆154Feb 3, 2025Updated last year
- Evaluating language model responses using a reward model☆29Feb 23, 2024Updated 2 years ago
- Korean datasets available on Hugging Face☆36Oct 10, 2024Updated last year
- GUIEvalKit: Open-source Evaluation Toolkit for GUI Agents☆19Feb 26, 2026Updated last month
- Code for NAACL 2025 paper "AdaCAD: Adaptively Decoding to Balance Conflicts between Contextual and Parametric Knowledge"☆17Mar 2, 2026Updated last month
- Korean datasets for each of the three steps of ChatGPT-style RLHF training☆41Nov 21, 2023Updated 2 years ago
- This is an implementation of the paper Improve Mathematical Reasoning in Language Models by Automated Process Supervision from google de…☆45Jul 8, 2025Updated 9 months ago
- Distilling Task-Specific Knowledge from Teacher Model into BiLSTM☆31Dec 14, 2024Updated last year
- Full Stack SolarLLM Zero to All☆169Mar 1, 2025Updated last year
- A repository that implements Google's paper Chain-of-Thought Reasoning without Prompting in code.☆65Sep 28, 2024Updated last year
- SCoRe: Training Language Models to Self-Correct via Reinforcement Learning☆16Jan 24, 2025Updated last year
- [EMNLP 2024] Source code for the paper "Learning Planning-based Reasoning with Trajectory Collection and Process Rewards Synthesizing".☆83Jan 14, 2025Updated last year
- ☆24Dec 2, 2023Updated 2 years ago
- ☆20Jul 24, 2024Updated last year
- GenRM-CoT: Data release for verification rationales☆67Oct 16, 2024Updated last year
- LCA-on-the-line (ICML 2024 Oral)☆14Feb 13, 2025Updated last year
- Python Finance Data Analysis Tutorial☆14Sep 18, 2022Updated 3 years ago
- ☆102Dec 22, 2023Updated 2 years ago
- This repository contains the replication of the iGSM dataset generation process from the Physics of LLM paper by Zeyuan Zhu.☆17Sep 13, 2024Updated last year
- Difference-based Contrastive Learning for Korean Sentence Embeddings☆23Mar 11, 2026Updated last month
- The Universe of Evaluation. All about the evaluation for LLMs.☆235Jul 9, 2024Updated last year
- This is the implementation of LeCo☆32Jan 20, 2025Updated last year
- KoCommonGEN v2: A Benchmark for Navigating Korean Commonsense Reasoning Challenges in Large Language Models☆25Aug 24, 2024Updated last year
- Q-Probe: A Lightweight Approach to Reward Maximization for Language Models☆40Jun 10, 2024Updated last year
- Easy Instruction for DPO training☆24Oct 9, 2024Updated last year
- A recipe for online RLHF and online iterative DPO.☆543Dec 28, 2024Updated last year
- A series of technical reports on Slow Thinking with LLM☆764Aug 13, 2025Updated 8 months ago
- ☆131Jun 18, 2024Updated last year
- [COLM 2025] Official code for "When To Solve, When To Verify: Compute-Optimal Problem Solving and Generative Verification for LLM Reasoni…☆15Oct 31, 2025Updated 5 months ago
- ☆28Mar 5, 2024Updated 2 years ago
- The official repo for "AceCoder: Acing Coder RL via Automated Test-Case Synthesis" [ACL25]☆100Apr 9, 2025Updated last year
- This is the official implementation of the paper "S²R: Teaching LLMs to Self-verify and Self-correct via Reinforcement Learning"☆73Apr 22, 2025Updated 11 months ago
- Git for "Stepwise Self-Consistent Mathematical Reasoning with Large Language Models"☆12Nov 26, 2024Updated last year