x66ccff / liveideabench
LiveIdeaBench: Evaluating LLMs' Scientific Creativity and Idea Generation with Minimal Context
☆14 · Updated last month
Alternatives and similar repositories for liveideabench
Users who are interested in liveideabench are comparing it to the repositories listed below.
- Resources for our paper: "EvoAgent: Towards Automatic Multi-Agent Generation via Evolutionary Algorithms" ☆105 · Updated 7 months ago
- LLM for Scientific Research Survey ☆93 · Updated 4 months ago
- Middleware for LLMs: Tools Are Instrumental for Language Agents in Complex Environments (EMNLP'2024) ☆36 · Updated 5 months ago
- Official Implementation of the Baby-AIGS system ☆23 · Updated 6 months ago
- Advancing Language Model Reasoning through Reinforcement Learning and Inference Scaling ☆104 · Updated 4 months ago
- Evaluate the Quality of Critique ☆35 · Updated last year
- Process Reward Models That Think ☆38 · Updated last week
- [ACL 2024] <Large Language Models for Automated Open-domain Scientific Hypotheses Discovery>. It has also received the best poster award … ☆41 · Updated 7 months ago
- ☆47 · Updated 3 months ago
- SimpleDeepSearcher: Deep Information Seeking via Web-Powered Reasoning Trajectory Synthesis ☆61 · Updated this week
- Official implementation of ICML 2025 paper "Beyond Bradley-Terry Models: A General Preference Model for Language Model Alignment" (https:… ☆25 · Updated last month
- The official repo for the code and data of paper SMART ☆26 · Updated 3 months ago
- A trainable user simulator ☆34 · Updated 8 months ago
- Official implementation of paper "Process Reward Model with Q-value Rankings" ☆59 · Updated 4 months ago
- [ACL 2024] Code for "MoPS: Modular Story Premise Synthesis for Open-Ended Automatic Story Generation" ☆36 · Updated 10 months ago
- DSBench: How Far are Data Science Agents from Becoming Data Science Experts? ☆54 · Updated 3 months ago
- Interpretable Contrastive Monte Carlo Tree Search Reasoning ☆48 · Updated 6 months ago
- [ICLR'25] ScienceAgentBench: Toward Rigorous Assessment of Language Agents for Data-Driven Scientific Discovery ☆87 · Updated last month
- Watch Every Step! LLM Agent Learning via Iterative Step-level Process Refinement (EMNLP 2024 Main Conference) ☆57 · Updated 7 months ago
- Official implementation of the paper "From Complex to Simple: Enhancing Multi-Constraint Complex Instruction Following Ability of Large L… ☆48 · Updated 11 months ago
- ☆60 · Updated 2 weeks ago
- [ICLR 2025] ChemAgent: Self-updating Library in Large Language Models Improves Chemical Reasoning ☆53 · Updated 2 months ago
- This is the repository for paper "CREATOR: Tool Creation for Disentangling Abstract and Concrete Reasoning of Large Language Models" ☆24 · Updated last year
- [ACL 2025] Are Your LLMs Capable of Stable Reasoning? ☆25 · Updated 2 months ago
- Are LLMs Capable of Data-based Statistical and Causal Reasoning? Benchmarking Advanced Quantitative Reasoning with Data ☆37 · Updated 3 months ago
- A framework for evolving and testing question-answering datasets with various models. ☆16 · Updated last year
- [NeurIPS 2024] The official implementation of paper: Chain of Preference Optimization: Improving Chain-of-Thought Reasoning in LLMs. ☆123 · Updated 2 months ago
- Sotopia-π: Interactive Learning of Socially Intelligent Language Agents (ACL 2024) ☆65 · Updated last year
- This is the implementation of LeCo ☆31 · Updated 4 months ago
- A curated list of papers on LLMs and agents for scientific research and development ☆57 · Updated 5 months ago