☆17Nov 3, 2024Updated last year
Alternatives and similar repositories for prm
Users that are interested in prm are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.
Sorting:
- We aim to provide the best references to search, select, and synthesize high-quality and large-quantity data for post-training your LLMs.☆64Oct 3, 2024Updated last year
- This is the official implementation of ScaleBiO: Scalable Bilevel Optimization for LLM Data Reweighting☆24Jul 30, 2024Updated last year
- ☆20Dec 14, 2024Updated last year
- The Code for the EMNLP 2023 main conference paper "Prompt-based Logical Semantics Enhancement for Implicit Discourse Relation Recognition…☆13Dec 10, 2023Updated 2 years ago
- Implementations of Influential Recommender System☆12Oct 29, 2024Updated last year
- Deploy to Railway using AI coding agents - Free Credits Offer • AdUse Claude Code, Codex, OpenCode, and more. Autonomous software development now has the infrastructure to match with Railway.
- 针对最经典的表格型Q learning算法进行了复现,能够支持gym中大多数的离散动作和状态空间的环境,譬如CliffWalking-v0。☆10Jan 2, 2021Updated 5 years ago
- The official repository for paper "MLLM-Protector: Ensuring MLLM’s Safety without Hurting Performance"☆46Apr 21, 2024Updated 2 years ago
- ☆45Jun 25, 2025Updated 11 months ago
- [ICML 2025] Teaching Language Models to Critique via Reinforcement Learning☆126May 6, 2025Updated last year
- AgentOccam: A Simple Yet Strong Baseline for LLM-Based Web Agents☆57Jan 28, 2025Updated last year
- [Findings@ACL'26] LLMRouterBench: A Massive Benchmark and Unified Framework for LLM Routing☆60Apr 6, 2026Updated last month
- Official implementation of Vector-ICL: In-context Learning with Continuous Vector Representations (ICLR 2025)☆23Jun 2, 2025Updated 11 months ago
- Official implementation for DenseMixer: Improving MoE Post-Training with Precise Router Gradient☆67Aug 3, 2025Updated 9 months ago
- Code for NAACL 2025 paper "AdaCAD: Adaptively Decoding to Balance Conflicts between Contextual and Parametric Knowledge"☆17Mar 2, 2026Updated 2 months ago
- Deploy to Railway using AI coding agents - Free Credits Offer • AdUse Claude Code, Codex, OpenCode, and more. Autonomous software development now has the infrastructure to match with Railway.
- Align, a general text alignment function☆15Dec 7, 2023Updated 2 years ago
- Chain of Thoughts (CoT) is so hot! so long! We need short reasoning process!☆72Apr 1, 2025Updated last year
- Official code repository for Findings of EMNLP 2022 paper: PseudoReasoner: Leveraging Pseudo Labels for Commonsense Knowledge Base Popula…☆11Oct 18, 2022Updated 3 years ago
- Repo for EmbedLLM: Learning Compact Representations of Large Language Models☆31Sep 25, 2025Updated 8 months ago
- Neural theorem proving evaluation via the Lean REPL☆23Jul 12, 2025Updated 10 months ago
- ☆39May 2, 2024Updated 2 years ago
- [ICLR'26] Stronger-MAS: A RL Framework for multi LLM agent system; [arxiv] MetaAgent-X: End-to-End Reinforcement Learning Automatic Mult…☆170May 15, 2026Updated last week
- Code and data for the paper "Can Large Language Models Understand Real-World Complex Instructions?"(AAAI2024)☆51Apr 19, 2024Updated 2 years ago
- [NeurIPS 2024] OlympicArena: Benchmarking Multi-discipline Cognitive Reasoning for Superintelligent AI☆106Mar 6, 2025Updated last year
- Simple, predictable pricing with DigitalOcean hosting • AdAlways know what you'll pay with monthly caps and flat pricing. Enterprise-grade infrastructure trusted by 600k+ customers.
- ☆13Jun 17, 2024Updated last year
- Data and code for paper "ODSum: New Benchmarks for Open Domain Multi-Document Summarization"☆11Sep 20, 2024Updated last year
- Code Repository for the EMCL-PKDD 2021 "Multitask Recalibrated Aggregation Network for Medical Code Prediction)☆13Sep 7, 2021Updated 4 years ago
- [AAAI 2026] The Avengers: A Simple Recipe for Uniting Smaller Language Models to Challenge Proprietary Giants☆46Dec 11, 2025Updated 5 months ago
- ☆30Dec 27, 2024Updated last year
- ☆17Mar 22, 2024Updated 2 years ago
- This is a repository for paper titled, PlaSma: Making Small Language Models Better Procedural Knowledge Models for (Counterfactual) Plann…☆14Nov 3, 2023Updated 2 years ago
- Implementation of AdaCQR(COLING 2025)☆15Dec 30, 2024Updated last year
- LLM as World Models using Bayesian inference☆18May 27, 2025Updated 11 months ago
- Deploy to Railway using AI coding agents - Free Credits Offer • AdUse Claude Code, Codex, OpenCode, and more. Autonomous software development now has the infrastructure to match with Railway.
- [EMNLP 2024] Source code for the paper "Learning Planning-based Reasoning with Trajectory Collection and Process Rewards Synthesizing".☆83Jan 14, 2025Updated last year
- PyTorch implementation of experiments in the paper Aligning Language Models with Human Preferences via a Bayesian Approach☆32Nov 6, 2023Updated 2 years ago
- UI-Voyager: A Self-Evolving GUI Agent Learning via Failed Experience☆68Apr 3, 2026Updated last month
- [EMNLP 2025 Findings] Retrieval-Augmented Machine Translation with Unstructured Knowledge☆15Sep 4, 2025Updated 8 months ago
- ICDM2022大规模电商图上的风险商品检测比赛(第六名)☆19Sep 21, 2022Updated 3 years ago
- ENACT is a benchmark that evaluates embodied cognition through world modeling from egocentric interaction. It is designed to be simple an…☆50Nov 27, 2025Updated 5 months ago
- Enhancing contextual understanding in large language models through contrastive decoding☆20May 3, 2024Updated 2 years ago