[ICLR 2026] Information Gain-based Policy Optimization: A Simple and Effective Approach for Multi-Turn Search Agents
☆101 · Apr 23, 2026 · Updated last month
Alternatives and similar repositories for IGPO
Users that are interested in IGPO are comparing it to the libraries listed below.
- [ICLR 2026] Adaptive Social Learning via Mode Policy Optimization for Language Agents ☆51 · Feb 2, 2026 · Updated 4 months ago
- [ACL 2025 (Findings)] DEMO: Reframing Dialogue Interaction with Fine-grained Element Modeling ☆22 · Dec 16, 2024 · Updated last year
- ☆21 · Feb 15, 2024 · Updated 2 years ago
- Repo. for RLCF. ☆15 · Apr 1, 2024 · Updated 2 years ago
- The code and datasets of our ACM MM 2024 paper "Hallu-PI: Evaluating Hallucination in Multi-modal Large Language Models within Perturbed … ☆11 · Sep 27, 2024 · Updated last year
- Source code for our paper: "ARIA: Training Language Agents with Intention-Driven Reward Aggregation". ☆30 · Aug 9, 2025 · Updated 10 months ago
- ☆21 · Dec 14, 2024 · Updated last year
- MetaAgent: Toward Self-Evolving Agent via Tool Meta-Learning ☆45 · Sep 3, 2025 · Updated 9 months ago
- Unleashing the Power of Cognitive Dynamics on Large Language Models ☆65 · Sep 24, 2024 · Updated last year
- ☆32 · Aug 21, 2025 · Updated 9 months ago
- Federated Reinforcement Learning ☆12 · Jun 20, 2019 · Updated 6 years ago
- extension for fabric to handle prompts through pexpect ☆44 · May 31, 2015 · Updated 11 years ago
- Code, Data and Model for Paper "Learning from Peers in Reasoning Models" ☆26 · May 13, 2025 · Updated last year
- 🕵 Code for our EMNLP 2025 Main paper: "FlashAdventure: A Benchmark for GUI Agents Solving Full Story Arcs in Diverse Adventure Games" ☆26 · Apr 26, 2026 · Updated last month
- ☆15 · Mar 20, 2023 · Updated 3 years ago
- [SIGGRAPH Asia 2025] CHARM: Control-point-based 3D Anime Hairstyle Auto-Regressive Modeling ☆49 · Apr 17, 2026 · Updated 2 months ago
- AdaRefiner: Refining Decisions of Language Models with Adaptive Feedback (NAACL 2024) ☆19 · Aug 9, 2024 · Updated last year
- An online federated reinforcement learning algorithm published in INFOCOM 2024 ☆16 · Dec 1, 2024 · Updated last year
- ☆43 · Jan 19, 2026 · Updated 4 months ago
- Manuscript paper (genkō yōshi); Japanese-style letter paper; UPTEX/UPLATEX vertical typesetting ☆10 · Nov 27, 2019 · Updated 6 years ago
- [ACL 2025] Adaptive Retrieval without Self-Knowledge? Bringing Uncertainty Back Home ☆19 · May 17, 2025 · Updated last year
- [TPAMI 2025] Revisiting Essential and Non-Essential Settings of Evidential Deep Learning ☆26 · Jun 24, 2025 · Updated 11 months ago
- [EMNLP 2025] Code for paper "Table-R1: Inference-Time Scaling for Table Reasoning" ☆32 · Jun 3, 2025 · Updated last year
- ☆19 · Jul 7, 2025 · Updated 11 months ago
- Lightweight code for the paper published in JMS, 'Solving task scheduling problems in cloud manufacturing via attention mechanism and… ☆20 · May 15, 2023 · Updated 3 years ago
- [EMNLP 2025] Verification Engineering for RL in Instruction Following ☆56 · Mar 30, 2026 · Updated 2 months ago
- The repo of the Doc2SoarGraph framework ☆10 · Sep 17, 2024 · Updated last year
- Code to minimize the Variational Contrastive Divergence (VCD) ☆30 · May 30, 2019 · Updated 7 years ago
- the open-source code of QAgent ☆59 · Oct 14, 2025 · Updated 8 months ago
- ☆16 · Nov 19, 2021 · Updated 4 years ago
- ☆28 · May 27, 2024 · Updated 2 years ago
- Enemies for your LLM ☆37 · Jan 20, 2026 · Updated 4 months ago
- ☆24 · Jan 16, 2025 · Updated last year
- A curated collection of research and techniques for protecting intellectual property of large language models, including watermarking, fi… ☆49 · Jun 10, 2026 · Updated last week
- Original implementation of SmartRAG: Jointly Learn RAG-Related Tasks From the Environment Feedback (ICLR 2025) ☆18 · Feb 17, 2025 · Updated last year
- Self-Knowledge Guided Retrieval Augmentation for Large Language Models (EMNLP Findings 2023) ☆27 · Dec 8, 2023 · Updated 2 years ago
- A Comprehensive Dataset for Advanced Image Generation and Editing ☆32 · Oct 2, 2025 · Updated 8 months ago
- The guideline for pod. ☆10 · Jun 19, 2020 · Updated 5 years ago
- A ComfyUI image generation service based on the Model Context Protocol (MCP) that calls a local ComfyUI instance through its API to generate images, enabling image generation from natural-language prompts ☆24 · Nov 30, 2025 · Updated 6 months ago