x66ccff / liveideabench
🤖💡 LiveIdeaBench: Evaluating LLMs' Scientific Creativity and Idea Generation with Minimal Context
★16 · Updated 3 weeks ago
Alternatives and similar repositories for liveideabench
Users who are interested in liveideabench are comparing it to the repositories listed below.
- Resources for our paper: "EvoAgent: Towards Automatic Multi-Agent Generation via Evolutionary Algorithms" ★136 · Updated last year
- ★61 · Updated 7 months ago
- Code for ICLR 2024 paper "CRAFT: Customizing LLMs by Creating and Retrieving from Specialized Toolsets" ★60 · Updated last year
- A collection of resources and papers on AI Scientist / Robot Scientist ★111 · Updated 2 months ago
- [ACL 2024] <Large Language Models for Automated Open-domain Scientific Hypotheses Discovery>. It has also received the best poster award … ★42 · Updated last year
- ★46 · Updated 2 months ago
- [ICLR'25] ScienceAgentBench: Toward Rigorous Assessment of Language Agents for Data-Driven Scientific Discovery ★112 · Updated 3 months ago
- SSRL: Self-Search Reinforcement Learning ★158 · Updated 3 months ago
- Tree-of-Debate converts scientific papers into LLM personas that debate their respective novelties. To emphasize structured, critical rea… ★17 · Updated 4 months ago
- R1-Searcher++: Incentivizing the Dynamic Knowledge Acquisition of LLMs via Reinforcement Learning ★66 · Updated 6 months ago
- Interpretable Contrastive Monte Carlo Tree Search Reasoning ★48 · Updated last year
- A trainable user simulator ★34 · Updated 5 months ago
- [ICLR 2025] ChemAgent: Self-updating Library in Large Language Models Improves Chemical Reasoning https://arxiv.org/abs/2501.06590 ★77 · Updated 4 months ago
- A curated list of papers on LLMs and agents for scientific research and development ★80 · Updated last year
- Code/data for MARG (multi-agent review generation) ★59 · Updated 2 months ago
- Process Reward Models That Think ★63 · Updated 2 weeks ago
- ★66 · Updated 6 months ago
- Data and Code for EMNLP 2025 Findings Paper "MCTS-RAG: Enhancing Retrieval-Augmented Generation with Monte Carlo Tree Search" ★82 · Updated last month
- This is a survey of research on AI scientists, AI researchers, AI engineers, and a series of AI-driven research studies ★163 · Updated last month
- TreeRL: LLM Reinforcement Learning with On-Policy Tree Search in ACL'25 ★84 · Updated 5 months ago
- Reasoning Agentic Retrieval-Augmented Generation for Industry Challenges ★24 · Updated 7 months ago
- Evaluate the Quality of Critique ★36 · Updated last year
- Middleware for LLMs: Tools Are Instrumental for Language Agents in Complex Environments (EMNLP'2024) ★37 · Updated 11 months ago
- [ICML 2025] Beyond Bradley-Terry Models: A General Preference Model for Language Model Alignment (https://arxiv.org/abs/2410.02197) ★33 · Updated 3 months ago
- [ICLR 2025] This is the official implementation for the paper: "Large Language Models Meet Symbolic Provers for Logical Reasoning Evaluat… ★37 · Updated 6 months ago
- Leveraging Outputs of Large Language Model as Feedback for Dynamic Reranking in Retrieval-Augmented Generation ★50 · Updated 2 months ago
- ★67 · Updated 8 months ago
- IKEA: Reinforced Internal-External Knowledge Synergistic Reasoning for Efficient Adaptive Search Agent ★67 · Updated 7 months ago
- This is the implementation of LeCo ★31 · Updated 10 months ago
- [ACL'25] We propose a novel fine-tuning method, Separate Memory and Reasoning, which combines prompt tuning with LoRA. ★80 · Updated last month