allenai / chime
Repository containing dataset, models and code associated with the CHIME project
☆17Updated last year
Alternatives and similar repositories for chime
Users who are interested in chime are comparing it to the libraries listed below
- Groq-powered MAD: The first work to explore Multi-Agent Debate with Large Language Models :D☆12Updated last year
- SiriuS: Self-improving Multi-agent Systems via Bootstrapped Reasoning☆96Updated 2 months ago
- Resources for our paper: "EvoAgent: Towards Automatic Multi-Agent Generation via Evolutionary Algorithms"☆143Updated last year
- Source code for our paper: "SelfGoal: Your Language Agents Already Know How to Achieve High-level Goals".☆69Updated last year
- 🔔🧠 Easily experiment with popular language agents across diverse reasoning/decision-making benchmarks!☆53Updated 7 months ago
- ☆43Updated last year
- The official repo for the code and data of paper SMART☆38Updated 11 months ago
- MIRIAD is a million-scale Medical Instruction and Retrieval Dataset☆142Updated 2 months ago
- A library for benchmarking the Long Term Memory and Continual learning capabilities of LLM based agents. With all the tests and code you…☆83Updated last year
- ☆39Updated last year
- [ICLR'25] ScienceAgentBench: Toward Rigorous Assessment of Language Agents for Data-Driven Scientific Discovery☆124Updated 5 months ago
- Analysis code for NeurIPS 2025 paper "SciArena: An Open Evaluation Platform for Foundation Models in Scientific Literature Tasks"☆56Updated 6 months ago
- This repository contains expert evaluation interface and data evaluation script for the OpenScholar project.☆32Updated last year
- Official Repo for CRMArena and CRMArena-Pro☆132Updated last week
- II-Thought-RL is our initial attempt at developing a large-scale, multi-domain Reinforcement Learning (RL) dataset☆31Updated 10 months ago
- Specification for creating reliable LLM-based conversational agents☆65Updated 3 months ago
- ☆67Updated 10 months ago
- [ICML 2024 Oral] A framework for society simulation that supports complex simulation, for example: multi-scene.☆84Updated last year
- Official code repository for: DiagrammerGPT: Generating Open-Domain, Open-Platform Diagrams via LLM Planning (COLM 2024)☆155Updated last year
- 🤝 The code for "Can Large Language Model Agents Simulate Human Trust Behaviors?"☆109Updated 10 months ago
- Official homepage for "Self-Harmonized Chain of Thought" (NAACL 2025)☆92Updated last year
- Interactive coding assistant for data scientists and machine learning developers, empowered by large language models.☆99Updated last year
- Official repo of Knowledge or Reasoning? A Close Look at How LLMs Think Across Domains.☆44Updated 8 months ago
- Framework and toolkits for building and evaluating collaborative agents that can work together with humans.☆121Updated 2 months ago
- An attribution library for LLMs☆46Updated last year
- A virtual environment for developing and evaluating automated scientific discovery agents.☆199Updated 11 months ago
- CiteME is a benchmark designed to test the abilities of language models in finding papers that are cited in scientific texts.☆48Updated 3 months ago
- ☆72Updated 3 months ago
- Training Proactive and Personalized LLM Agents☆100Updated 3 weeks ago
- Scalable Meta-Evaluation of LLMs as Evaluators☆43Updated last year