☆17 · Nov 3, 2024 · Updated last year
Alternatives and similar repositories for prm
Users who are interested in prm are comparing it to the libraries listed below.
- We aim to provide the best references to search, select, and synthesize high-quality and large-quantity data for post-training your LLMs. ☆61 · Oct 3, 2024 · Updated last year
- [EMNLP Findings 2024 & ACL 2024 NLRSE Oral] Enhancing Mathematical Reasonin… ☆51 · May 4, 2024 · Updated last year
- ☆27 · Apr 11, 2023 · Updated 2 years ago
- ☆20 · Dec 14, 2024 · Updated last year
- 💻 Terminal-Agent with Human-in-the-Loop Learning ☆39 · Jan 16, 2026 · Updated 2 months ago
- Implementations of Influential Recommender System ☆11 · Oct 29, 2024 · Updated last year
- [NeurIPS'24] Official code for *🎯DART-Math: Difficulty-Aware Rejection Tuning for Mathematical Problem-Solving* ☆121 · Dec 10, 2024 · Updated last year
- A reimplementation of the classic tabular Q-learning algorithm that supports most gym environments with discrete action and state spaces, such as CliffWalking-v0. ☆10 · Jan 2, 2021 · Updated 5 years ago
- The official repository for paper "MLLM-Protector: Ensuring MLLM's Safety without Hurting Performance" ☆45 · Apr 21, 2024 · Updated last year
- An iOS and watchOS focus timer app ☆32 · Oct 27, 2024 · Updated last year
- ☆39 · Jun 25, 2025 · Updated 9 months ago
- [ICML 2025] Teaching Language Models to Critique via Reinforcement Learning ☆123 · May 6, 2025 · Updated 10 months ago
- AgentOccam: A Simple Yet Strong Baseline for LLM-Based Web Agents ☆51 · Jan 28, 2025 · Updated last year
- Fetch a random wallpaper from Konachan. ☆10 · Jun 4, 2018 · Updated 7 years ago
- [ICLR'26] Stronger-MAS: A RL Framework for multi LLM agent system ☆136 · Updated this week
- Official implementation for DenseMixer: Improving MoE Post-Training with Precise Router Gradient ☆66 · Aug 3, 2025 · Updated 7 months ago
- Code for NAACL 2025 paper "AdaCAD: Adaptively Decoding to Balance Conflicts between Contextual and Parametric Knowledge" ☆17 · Mar 2, 2026 · Updated 3 weeks ago
- Align, a general text alignment function ☆15 · Dec 7, 2023 · Updated 2 years ago
- Chain of Thoughts (CoT) is so hot! so long! We need short reasoning process! ☆72 · Apr 1, 2025 · Updated 11 months ago
- Official code repository for Findings of EMNLP 2022 paper: PseudoReasoner: Leveraging Pseudo Labels for Commonsense Knowledge Base Popula… ☆11 · Oct 18, 2022 · Updated 3 years ago
- Repo for EmbedLLM: Learning Compact Representations of Large Language Models ☆29 · Sep 25, 2025 · Updated 6 months ago
- [NAACL 2024 Outstanding Paper] Source code for the NAACL 2024 paper entitled "R-Tuning: Instructing Large Language Models to Say 'I Don't… ☆133 · Jul 10, 2024 · Updated last year
- Codes and data for CIKM 2022 paper "RuDi: Explaining Behavior Sequence Models by Automatic Statistics Generation and Rule Distillation" ☆12 · Aug 16, 2022 · Updated 3 years ago
- ☆39 · May 2, 2024 · Updated last year
- [NeurIPS 2024] OlympicArena: Benchmarking Multi-discipline Cognitive Reasoning for Superintelligent AI ☆107 · Mar 6, 2025 · Updated last year
- Code and data for the paper "Can Large Language Models Understand Real-World Complex Instructions?" (AAAI 2024) ☆50 · Apr 19, 2024 · Updated last year
- ☆16 · May 16, 2025 · Updated 10 months ago
- ☆60 · Nov 18, 2024 · Updated last year
- Repository for Skill Set Optimization ☆14 · Jul 26, 2024 · Updated last year
- ☆13 · Jun 17, 2024 · Updated last year
- Data and code for paper "ODSum: New Benchmarks for Open Domain Multi-Document Summarization" ☆11 · Sep 20, 2024 · Updated last year
- Code repository for the ECML-PKDD 2021 paper "Multitask Recalibrated Aggregation Network for Medical Code Prediction" ☆13 · Sep 7, 2021 · Updated 4 years ago
- ☆30 · Dec 27, 2024 · Updated last year
- ☆16 · Mar 22, 2024 · Updated 2 years ago
- Implementation of AdaCQR (COLING 2025) ☆13 · Dec 30, 2024 · Updated last year
- LLM as World Models using Bayesian inference ☆17 · May 27, 2025 · Updated 9 months ago
- Code and data for the paper "Steering Conversational Large Language Models for Long Emotional Support Conversations" along with a UI to v… ☆15 · Apr 14, 2025 · Updated 11 months ago
- [EMNLP 2024] Source code for the paper "Learning Planning-based Reasoning with Trajectory Collection and Process Rewards Synthesizing". ☆83 · Jan 14, 2025 · Updated last year
- PyTorch implementation of experiments in the paper Aligning Language Models with Human Preferences via a Bayesian Approach ☆32 · Nov 6, 2023 · Updated 2 years ago