xiaowu0162 / LongMemEval
Benchmarking Chat Assistants on Long-Term Interactive Memory (ICLR 2025)
☆43 Updated last week
Alternatives and similar repositories for LongMemEval:
Users who are interested in LongMemEval are comparing it to the repositories listed below
- Implementation of the paper: "AssistantBench: Can Web Agents Solve Realistic and Time-Consuming Tasks?" ☆47 Updated 2 months ago
- Resources for our paper: "EvoAgent: Towards Automatic Multi-Agent Generation via Evolutionary Algorithms" ☆81 Updated 3 months ago
- ☆20 Updated this week
- Scalable Meta-Evaluation of LLMs as Evaluators ☆42 Updated last year
- ☆40 Updated this week
- Framework and toolkits for building and evaluating collaborative agents that can work together with humans. ☆37 Updated this week
- ☆49 Updated 3 months ago
- [ACL'24] Code and data of paper "When is Tree Search Useful for LLM Planning? It Depends on the Discriminator" ☆54 Updated 11 months ago
- Codes and datasets for the paper Measuring and Enhancing Trustworthiness of LLMs in RAG through Grounded Attributions and Learning to Ref… ☆41 Updated this week
- Flow of Reasoning: Training LLMs for Divergent Problem Solving with Minimal Examples ☆72 Updated 3 weeks ago
- The first dense retrieval model that can be prompted like an LM ☆64 Updated 4 months ago
- EMNLP 2024 "Re-reading improves reasoning in large language models". Simply repeating the question to get bidirectional understanding for… ☆24 Updated 2 months ago
- ☆81 Updated last year
- Sotopia-π: Interactive Learning of Socially Intelligent Language Agents (ACL 2024) ☆57 Updated 9 months ago
- ☆23 Updated 4 months ago
- Gödel Agent: A Self-Referential Agent Framework for Recursive Self-Improvement ☆74 Updated this week
- Anchored Preference Optimization and Contrastive Revisions: Addressing Underspecification in Alignment ☆54 Updated 5 months ago
- ☆48 Updated 3 months ago
- ☆19 Updated 2 months ago
- ☆106 Updated 3 weeks ago
- B-STAR: Monitoring and Balancing Exploration and Exploitation in Self-Taught Reasoners ☆71 Updated last month
- ☆55 Updated 3 months ago
- Code for EMNLP 2024 paper "Learn Beyond The Answer: Training Language Models with Reflection for Mathematical Reasoning" ☆52 Updated 4 months ago
- Code and Data for "Language Modeling with Editable External Knowledge" ☆31 Updated 7 months ago
- Middleware for LLMs: Tools Are Instrumental for Language Agents in Complex Environments (EMNLP'2024) ☆35 Updated last month
- ☆66 Updated last year
- ☆20 Updated 8 months ago
- ☆116 Updated 4 months ago