Paper reproduction of Google SCoRe (Training Language Models to Self-Correct via Reinforcement Learning)
☆143Sep 21, 2024Updated last year
Alternatives and similar repositories for Google_SCoRe
Users who are interested in Google_SCoRe are comparing it to the libraries listed below.
- This is my attempt to create a Self-Correcting-LLM based on the paper Training Language Models to Self-Correct via Reinforcement Learning by g…☆38Jul 9, 2025Updated 10 months ago
- ☆34Oct 31, 2024Updated last year
- Covers everything from the history of NLP to serving, all in a single book.☆25Dec 6, 2025Updated 5 months ago
- hllama is a library which aims to provide a set of utility tools for large language models.☆10Apr 16, 2024Updated 2 years ago
- A collection of various LLM tuning methods and data preprocessing code.☆14May 18, 2026Updated last week
- Repository for the paper Stream of Search: Learning to Search in Language☆153Feb 3, 2025Updated last year
- Evaluating language model responses using a reward model☆29Feb 23, 2024Updated 2 years ago
- Korean datasets available on Hugging Face☆36Oct 10, 2024Updated last year
- Code for NAACL 2025 paper "AdaCAD: Adaptively Decoding to Balance Conflicts between Contextual and Parametric Knowledge"☆17Mar 2, 2026Updated 2 months ago
- Korean datasets for each of the 3 steps of training ChatGPT's RLHF☆41Nov 21, 2023Updated 2 years ago
- This is an implementation of the paper Improve Mathematical Reasoning in Language Models by Automated Process Supervision from google de…☆47Jul 8, 2025Updated 10 months ago
- Full Stack SolarLLM Zero to All☆169Mar 1, 2025Updated last year
- A repository that implements Google's Chain-of-Thought Reasoning without Prompting in code.☆65Sep 28, 2024Updated last year
- SCoRe: Training Language Models to Self-Correct via Reinforcement Learning☆16May 14, 2026Updated last week
- [EMNLP 2024] Source code for the paper "Learning Planning-based Reasoning with Trajectory Collection and Process Rewards Synthesizing".☆83Jan 14, 2025Updated last year
- ☆24Dec 2, 2023Updated 2 years ago
- ☆20Jul 24, 2024Updated last year
- Topology Distillation for Recommender System (KDD'21)☆13Sep 2, 2021Updated 4 years ago
- GenRM-CoT: Data release for verification rationales☆68Oct 16, 2024Updated last year
- LCA-on-the-line (ICML 2024 Oral)☆14Feb 13, 2025Updated last year
- Python Finance Data Analysis Tutorial☆14Sep 18, 2022Updated 3 years ago
- This repository contains the replication of the iGSM dataset generation process from the Physics of LLM paper by Zeyuan Zhu.☆17Sep 13, 2024Updated last year
- Difference-based Contrastive Learning for Korean Sentence Embeddings☆23Mar 11, 2026Updated 2 months ago
- The Universe of Evaluation. All about the evaluation for LLMs.☆234Jul 9, 2024Updated last year
- This is the implementation of LeCo☆32Jan 20, 2025Updated last year
- KoCommonGEN v2: A Benchmark for Navigating Korean Commonsense Reasoning Challenges in Large Language Models☆25Aug 24, 2024Updated last year
- Q-Probe: A Lightweight Approach to Reward Maximization for Language Models☆40Jun 10, 2024Updated last year
- A recipe for online RLHF and online iterative DPO.☆545Dec 28, 2024Updated last year
- Easy Instruction for DPO training☆24Oct 9, 2024Updated last year
- A series of technical report on Slow Thinking with LLM☆765Aug 13, 2025Updated 9 months ago
- ☆130Jun 18, 2024Updated last year
- [COLM 2025] Official code for "When To Solve, When To Verify: Compute-Optimal Problem Solving and Generative Verification for LLM Reasoni…☆15Oct 31, 2025Updated 6 months ago
- ☆28Mar 5, 2024Updated 2 years ago
- The official repo for "AceCoder: Acing Coder RL via Automated Test-Case Synthesis" [ACL25]☆101Apr 9, 2025Updated last year
- This is the official implementation of the paper "S²R: Teaching LLMs to Self-verify and Self-correct via Reinforcement Learning"☆73Apr 22, 2025Updated last year
- Git for "Stepwise Self-Consistent Mathematical Reasoning with Large Language Models"☆12Nov 26, 2024Updated last year
- the datasets of our paper☆11Feb 26, 2024Updated 2 years ago
- I wanted to create a space for doing prompt engineering.☆32Sep 10, 2024Updated last year
- BERT score for text generation☆12Jan 15, 2025Updated last year