☆17Dec 23, 2025Updated 2 months ago
Alternatives and similar repositories for self-verification
Users that are interested in self-verification are comparing it to the libraries listed below
Sorting:
- ☆19Oct 27, 2025Updated 4 months ago
- Implementation of ICLR 2025 paper "Q-Adapter: Customizing Pre-trained LLMs to New Preferences with Forgetting Mitigation"☆18Oct 5, 2024Updated last year
- The implementation of IJCAI'22 paper "Multi-Agent Concentrative Coordination with Decentralized Task Representation".☆18May 1, 2022Updated 3 years ago
- Author's PyTorch implementation of ICML'23 paper "Policy Regularization with Dataset Constraint for Offline Reinforcement Learning" for D…☆18Nov 8, 2024Updated last year
- ☆24Jul 26, 2025Updated 7 months ago
- Official implementation of NeurIPS22 paper “Multi-agent Dynamic Algorithm Configuration”☆26Mar 6, 2023Updated 2 years ago
- A python module designed for agile RL algorithm developing.☆26Jul 11, 2024Updated last year
- Source code of paper: Process vs. Outcome Reward: Which is Better for Agentic RAG Reinforcement Learning☆45Jun 24, 2025Updated 8 months ago
- [ACL 2025] How Do LLMs Acquire New Knowledge? A Knowledge Circuits Perspective on Continual Pre-Training☆47Jul 18, 2025Updated 7 months ago
- A beginner-friendly repository on Deep Reinforcement Learning (RL), written in PyTorch.☆26Jan 27, 2026Updated last month
- Source code for the paper "Memory-Efficient Fine-Tuning via Low-Rank Activation Compression"☆13Aug 1, 2025Updated 7 months ago
- This is the code of a agentic rag method with dynamic workflow.☆12Jan 22, 2026Updated last month
- ☆10Sep 29, 2024Updated last year
- Official implementation of the paper "Pretraining Language Models to Ponder in Continuous Space"☆25Jul 21, 2025Updated 7 months ago
- [NeurIPS 2025] Unsupervised Post-Training for Multi-Modal LLM Reasoning via GRPO☆79Oct 29, 2025Updated 4 months ago
- [ICLR 2026] Thinking on the Fly: Test-Time Reasoning Enhancement via Latent Thought Policy Optimization☆18Feb 14, 2026Updated 2 weeks ago
- Official Repository of "Learning what reinforcement learning can't"☆79Dec 30, 2025Updated 2 months ago
- ☆10Oct 15, 2020Updated 5 years ago
- ☆18Mar 2, 2025Updated last year
- End-to-end implementation of the Social Graph Network (SGN), described in the Structural Reasoning for Image-based Social Relation Recogn…☆13Apr 3, 2024Updated last year
- Brax + Pufferlib + CARBS for gpu-accelerated robotics RL☆12Jun 12, 2025Updated 8 months ago
- ☆11Jun 12, 2024Updated last year
- MetaLadder: Ascending Mathematical Solution Quality via Analogical-Problem Reasoning Transfer (EMNLP 2025)☆11Apr 18, 2025Updated 10 months ago
- 云任务调度仿真平台☆12Mar 11, 2020Updated 5 years ago
- Format your bibtex (.bib) file to help standardize citations for conference and journal submissions☆14Nov 23, 2025Updated 3 months ago
- ☆12Mar 1, 2025Updated last year
- ☆12May 14, 2024Updated last year
- ☆10Jul 11, 2022Updated 3 years ago
- [NeurIPS 2024] Official code for the paper 'RankUp: Boosting Semi-Supervised Regression with an Auxiliary Ranking Classifier'☆14Aug 22, 2025Updated 6 months ago
- Trust Region Preference Approximation: A simple and stable reinforcement learning algorithm for LLM reasoning☆14Jun 28, 2025Updated 8 months ago
- This repository provides the code for applying Contrastive Learning Penalty Loss (CLPL) and Mixture of Experts (MoE) to the BGE-M3 text e…☆11Dec 27, 2024Updated last year
- ☆10Jan 28, 2024Updated 2 years ago
- The code for ”T-GRAG: A Dynamic GraphRAG Framework for Resolving Temporal Conflicts and Redundancy in Knowledge Retrieval“☆20Jul 30, 2025Updated 7 months ago
- Code and data release of the paper Enhancing LLM Complex Problem-Solving with Hybrid Thinking and Dynamic Workflows☆14Oct 4, 2024Updated last year
- Imitation learning from multiple experts☆13Aug 29, 2022Updated 3 years ago
- ☆29Oct 8, 2025Updated 4 months ago
- A prototype agent with the purpose of evaluating the performance of a Large Language Model within a python terminal.☆13Aug 28, 2023Updated 2 years ago
- MPI Code Generation through Domain-Specific Language Models☆14Nov 19, 2024Updated last year
- ☆16Oct 11, 2025Updated 4 months ago