caixd-220529 / LifelongAgentBench
Code repo for "LifelongAgentBench: Evaluating LLM Agents as Lifelong Learners"
☆58 Updated 7 months ago
Alternatives and similar repositories for LifelongAgentBench
Users who are interested in LifelongAgentBench are comparing it to the repositories listed below
- [NeurIPS 2024] Official code of $\beta$-DPO: Direct Preference Optimization with Dynamic $\beta$ ☆50 Updated last year
- ☆73 Updated last year
- ☆153 Updated 7 months ago
- Official Implementation for EMNLP 2024 (main) "AgentReview: Exploring Academic Peer Review with LLM Agent." ☆101 Updated last year
- [arxiv: 2505.02156] Adaptive Thinking via Mode Policy Optimization for Social Language Agents ☆46 Updated 6 months ago
- ☆53 Updated 10 months ago
- ☆299 Updated 6 months ago
- This is the official GitHub repository for our survey paper "Beyond Single-Turn: A Survey on Multi-Turn Interactions with Large Language … ☆161 Updated 7 months ago
- ☆41 Updated 4 months ago
- ☆95 Updated 9 months ago
- [ICLR 2025] Released code for paper "Spurious Forgetting in Continual Learning of Language Models" ☆57 Updated 8 months ago
- A research repo for experiments about Reinforcement Finetuning ☆53 Updated 9 months ago
- ☆111 Updated 6 months ago
- A trend starts from "Chain of Thought Prompting Elicits Reasoning in Large Language Models". ☆42 Updated 2 years ago
- Official implementation of MATPO: Multi-Agent Tool-Integrated Policy Optimization. ☆66 Updated 2 months ago
- A comprehensive collection of process reward models. ☆131 Updated 3 months ago
- ☆24 Updated 9 months ago
- The implementation for ICLR 2025 Oral: From Exploration to Mastery: Enabling LLMs to Master Tools via Self-Driven Interactions. ☆52 Updated 5 months ago
- A Survey of Personalization: From RAG to Agent ☆94 Updated 5 months ago
- [NeurIPS 2024] GITA: Graph to Image-Text Integration for Vision-Language Graph Reasoning ☆53 Updated last month
- [ACL'25] We propose a novel fine-tuning method, Separate Memory and Reasoning, which combines prompt tuning with LoRA. ☆82 Updated 2 months ago
- [NeurIPS 2024] Code and Data Repo for Paper "Embedding Trajectory for Out-of-Distribution Detection in Mathematical Reasoning" ☆27 Updated last year
- VeriWeb: Verifiable Long-Chain Web Benchmark for Agentic Information-Seeking ☆83 Updated last month
- ☆22 Updated 3 months ago
- [NAACL 25 main] Awesome LLM Causal Reasoning is a collection of LLM-based causal reasoning works, including papers, code, and datasets. ☆111 Updated 3 months ago
- Official code for the paper: DRA-GRPO: Exploring Diversity-Aware Reward Adjustment for R1-Zero-Like Training of Large Language Models ☆21 Updated this week
- ☆28 Updated last year
- SeRL: Self-Play Reinforcement Learning for Large Language Models with Limited Data ☆19 Updated 4 months ago
- Official code for paper "SPA-RL: Reinforcing LLM Agent via Stepwise Progress Attribution" ☆61 Updated 3 months ago
- ☆68 Updated 11 months ago