chatsci / Aeiva
A general AI agent framework that can be adapted to various tasks and environments.
☆100 Updated 4 months ago
Alternatives and similar repositories for Aeiva
Users who are interested in Aeiva are comparing it to the libraries listed below.
- SQL-o1: A Self-Reward Heuristic Dynamic Search Method for Text-to-SQL ☆185 Updated last month
- [EMNLP 2024] DA-Code: Agent Data Science Code Generation Benchmark for Large Language Models ☆70 Updated 3 weeks ago
- A library for generating difficulty-scalable, multi-tool, and verifiable agentic tasks with execution trajectories. ☆45 Updated last week
- [ACL 25 main] Deliberate Reasoning in Language Models as Structure-Aware Planning with an Accurate World Model ☆34 Updated last month
- ☆75 Updated this week
- Collecting personality-indicative data for role-playing agents. ☆22 Updated 4 months ago
- Hybrid Latent Reasoning via Reinforcement Learning ☆131 Updated last month
- ☆95 Updated 3 weeks ago
- [ICLR 2025] Improving Data Efficiency via Curating LLM-Driven Rating Systems ☆97 Updated 3 months ago
- ☆45 Updated 2 months ago
- GraphRAG-Bench, the official repo of comprehensive benchmark and dataset for evaluating GraphRAG models. ☆88 Updated this week
- A collection of papers related to knowledge fusion ☆56 Updated 8 months ago
- [EMNLP 2024 Findings] Official PyTorch Implementation of "Adaptive Contrastive Search: Uncertainty-Guided Decoding for Open-Ended Text Ge… ☆39 Updated 4 months ago
- [ACL'25] Code for "Aligning Large Language Models to Follow Instructions and Hallucinate Less via Effective Data Filtering" ☆20 Updated 3 weeks ago
- We leverage 14 datasets as OOD test data and conduct evaluations on 8 NLU tasks over 21 popularly used models. Our findings confirm that … ☆93 Updated last year
- ☆45 Updated last year
- ☆109 Updated last week
- ☆31 Updated last year
- MPLSandbox is an out-of-the-box multi-programming language sandbox designed to provide unified and comprehensive feedback from compiler a… ☆178 Updated 2 months ago
- EvaLearn is a pioneering benchmark designed to evaluate large language models (LLMs) on their learning capability and efficiency in chall… ☆120 Updated last week
- ☆48 Updated 8 months ago
- Code of Journey to the Center of the Knowledge Neurons: Discoveries of Language-Independent Knowledge Neurons and Degenerate Knowledge Ne… ☆26 Updated last year
- [AAAI 2025] Code for paper: Enhancing Multimodal Large Language Models Complex Reasoning via Similarity Computation ☆3 Updated 5 months ago
- StrategyLLM: Large Language Models as Strategy Generators, Executors, Optimizers, and Evaluators for Problem Solving ☆22 Updated 6 months ago
- DanmakuTPPBench: A Multi-modal Benchmark for Temporal Point Process Modeling and Understanding ☆69 Updated last month
- Two Heads are Better Than One: Test-time Scaling of Multi-agent Collaborative Reasoning ☆75 Updated 2 months ago
- ☆140 Updated 3 months ago
- AutoRLAIF is a cutting-edge framework designed to revolutionize the fine-tuning of large language models through Reinforcement Learning … ☆93 Updated 8 months ago
- [COLING Demos 2025] An Easy-to-use Tool for Comprehensive Response Evaluation of LLMs ☆36 Updated 3 months ago
- ACL 2024 ☆32 Updated 9 months ago