David-Li0406 / SMoA
☆14Updated last year
Alternatives and similar repositories for SMoA
Users who are interested in SMoA are comparing it to the repositories listed below
- RAG-RewardBench: Benchmarking Reward Models in Retrieval Augmented Generation for Preference Alignment☆16Updated last year
- ☆23Updated last year
- [ICLR 2025] Bridging and Modeling Correlations in Pairwise Data for Direct Preference Optimization☆12Updated last year
- ☆16Updated 8 months ago
- [EMNLP 2025] WebAgent-R1: Training Web Agents via End-to-End Multi-Turn Reinforcement Learning☆70Updated 2 months ago
- [ACL 2025 Findings] Official implementation of the paper "Unveiling the Key Factors for Distilling Chain-of-Thought Reasoning".☆21Updated 11 months ago
- ☆16Updated last year
- ☆47Updated 3 months ago
- [ICML 2025] Official resources of "KBQA-o1: Agentic Knowledge Base Question Answering with Monte Carlo Tree Search".☆34Updated last month
- ☆28Updated 2 months ago
- ☆19Updated 10 months ago
- ☆59Updated 2 weeks ago
- Instruct Once, Chat Consistently in Multiple Rounds: An Efficient Tuning Framework for Dialogue (ACL 2024)☆24Updated 3 months ago
- ☆16Updated last year
- [ACL 2025] Are Your LLMs Capable of Stable Reasoning?☆32Updated 5 months ago
- Official Implementation of Flash-Searcher: Fast and Effective Web Agents via DAG-Based Parallel Execution☆64Updated last month
- ☆43Updated 5 months ago
- This is the implementation of LeCo☆31Updated last year
- ☆28Updated 9 months ago
- ☆17Updated 5 months ago
- An End-to-End Model with Adaptive Filtering for Retrieval-Augmented Generation☆16Updated last year
- Plancraft is a Minecraft environment and agent suite to test planning capabilities in LLMs☆26Updated 2 months ago
- [ACL 2025 (Findings)] DEMO: Reframing Dialogue Interaction with Fine-grained Element Modeling☆20Updated last year
- The code for "MoPE: Mixture of Prefix Experts for Zero-Shot Dialogue State Tracking"☆19Updated last year
- ☆35Updated 3 months ago
- ROUTE: Robust Multitask Tuning and Collaboration for Text-to-SQL (ICLR 2025 PyTorch Code)☆17Updated 8 months ago
- Source code of paper: Process vs. Outcome Reward: Which is Better for Agentic RAG Reinforcement Learning☆45Updated 7 months ago
- [ICLR'24 spotlight] Tool-Augmented Reward Modeling☆53Updated 7 months ago
- A comprehensive collection of learning from rewards in the post-training and test-time scaling of LLMs, with a focus on both reward model…☆60Updated 7 months ago
- Source code for our paper: "ARIA: Training Language Agents with Intention-Driven Reward Aggregation".☆25Updated 5 months ago