SCoRe: Training Language Models to Self-Correct via Reinforcement Learning
☆16 · Jan 24, 2025 · Updated last year
Alternatives and similar repositories for SCoRe
Users that are interested in SCoRe are comparing it to the libraries listed below
- Concise Reasoning via Reinforcement Learning ☆13 · Apr 16, 2025 · Updated 11 months ago
- ☆12 · May 16, 2025 · Updated 10 months ago
- ☆14 · Apr 18, 2020 · Updated 5 years ago
- The first large scale formally verified reasoning dataset for Verilog ☆21 · May 16, 2025 · Updated 10 months ago
- The official source code for HyGCL-AdT that is published to WWW 24. ☆12 · Mar 12, 2024 · Updated 2 years ago
- [NAACL 2025] MiLoRA: Harnessing Minor Singular Components for Parameter-Efficient LLM Finetuning ☆19 · May 31, 2025 · Updated 9 months ago
- Container Virtual Service ☆13 · Aug 10, 2022 · Updated 3 years ago
- A simple, elegant web tool that allows you to create custom RSS feeds for arXiv search queries. Stay up-to-date with the latest research … ☆33 · Dec 5, 2025 · Updated 3 months ago
- Fully open reproduction of DeepSeek-R1 ☆11 · Mar 24, 2025 · Updated 11 months ago
- JARVIS, a system to connect LLMs with ML community. Paper: https://arxiv.org/pdf/2303.17580.pdf ☆21 · May 5, 2025 · Updated 10 months ago
- Pretraining summarization models using a corpus of nonsense ☆13 · Sep 28, 2021 · Updated 4 years ago
- ☆16 · Feb 28, 2025 · Updated last year
- [NAACL 2025] LLM-Supported Natural Language to Bash Translation ☆16 · Jul 17, 2025 · Updated 8 months ago
- A simple utility for doing RISC-V HPM perf monitoring. ☆18 · May 8, 2017 · Updated 8 years ago
- Framework for the work "Large Language Models for Zero Touch Network Configuration Management" ☆13 · Jun 20, 2024 · Updated last year
- Unofficial Implementation of Selective Attention Transformer ☆21 · Oct 31, 2024 · Updated last year
- pdf to markdown with Python3 ☆11 · Oct 30, 2019 · Updated 6 years ago
- Collection of datasets for network research. ☆14 · Jul 26, 2020 · Updated 5 years ago
- ☆14 · Oct 12, 2024 · Updated last year
- [ACL 2023] Multi-source Semantic Graph-based Multimodal Sarcasm Explanation Generation. ☆10 · Dec 19, 2024 · Updated last year
- MPLS VPNs (VPLS, VPWS, L3VPN) on eNSP using Huawei Routers ☆11 · Feb 11, 2020 · Updated 6 years ago
- ☆14 · Apr 16, 2024 · Updated last year
- ☆11 · Jun 16, 2024 · Updated last year
- Code for the "Long Context Needs Some R&R" paper. ☆12 · Mar 11, 2024 · Updated 2 years ago
- [NeurIPS 2023] Large Language Models Are Semi-Parametric Reinforcement Learning Agents ☆40 · May 2, 2024 · Updated last year
- ☆17 · Apr 11, 2025 · Updated 11 months ago
- ⚠️ ARCHIVED - All development moved to https://github.com/itbench-hub/ITBench/tree/main/scenarios ☆15 · Feb 24, 2026 · Updated 3 weeks ago
- ☆52 · Feb 12, 2025 · Updated last year
- LoL helper that instantly picks Yasuo ☆12 · Jun 12, 2022 · Updated 3 years ago
- ☆60 · Mar 8, 2026 · Updated 2 weeks ago
- Analytical chemistry and epidemiology of street drugs ☆25 · Aug 26, 2025 · Updated 6 months ago
- ☆22 · Oct 22, 2024 · Updated last year
- ☆28 · Jan 4, 2026 · Updated 2 months ago
- ☆27 · Nov 25, 2025 · Updated 3 months ago
- This project implements a Reinforcement Learning (RL) enhanced Retrieval-Augmented Generation (RAG) system that optimizes document retrie… ☆23 · Apr 27, 2025 · Updated 10 months ago
- Learning Protein-Ligand Properties with Atomic Environment Vectors ☆10 · Apr 19, 2024 · Updated last year
- ☆11 · Oct 3, 2021 · Updated 4 years ago
- Example of Langchain-Elasticsearch integrations & RAG. ☆12 · Sep 20, 2024 · Updated last year
- Build RAG for free with local LLMs using Ollama ☆13 · Apr 22, 2024 · Updated last year