Official repository for the paper "Beyond Static Tools: Test-Time Tool Evolution for Scientific Reasoning" and the SciEvo benchmark.
☆53 · Jan 13, 2026 · Updated 2 months ago
Alternatives and similar repositories for Test-Time-Tool-Evol
Users interested in Test-Time-Tool-Evol are comparing it to the repositories listed below.
- Source code for SWIFT, an efficient reward model. ☆19 · Jan 13, 2026 · Updated 2 months ago
- Dynaseal is a dynamic API key management system designed to secure communications and identity verification for large model services. It … ☆12 · Oct 30, 2024 · Updated last year
- We introduce Reasoning via Video, a new paradigm that uses maze-solving video generation to probe multimodal reasoning; our VR-Bench show… ☆55 · Feb 4, 2026 · Updated last month
- Reasoning Activation in LLMs via Small Model Transfer (NeurIPS 2025) ☆22 · Oct 16, 2025 · Updated 5 months ago
- FakePartsBench: 25K+ AI-generated videos with pixel- and frame-level annotations of full and partial deepfakes. ☆24 · Aug 31, 2025 · Updated 6 months ago
- Official implementation of Our NeurIPS 2024 Paper "Boundary Matters: A Bi-Level Active Finetuning Method" ☆14 · Feb 11, 2025 · Updated last year
- Implementation of the paper "In-context Time Series Predictor" (ICLR 2025) ☆15 · Feb 11, 2025 · Updated last year
- Reasoning or Memorization? Unreliable Results of Reinforcement Learning Due to Data Contamination. ☆22 · Jul 18, 2025 · Updated 8 months ago
- Cell-Level RSRP Estimation with the Image-to-Image Wireless Propagation Model Based on Measured data. ☆13 · Oct 10, 2023 · Updated 2 years ago
- The code is for our AAAI2023 paper: Efficient Embeddings of Logical Variables for Query Answering over Incomplete Knowledge Graphs (Ding… ☆10 · Dec 17, 2022 · Updated 3 years ago
- AI Scientist by Chicago Human+AI Lab ☆30 · Mar 12, 2026 · Updated last week
- Wasserstein-Fisher-Rao Embedding: Logical Query Embeddings with Local Comparison and Global Transport (Findings-ACL 2023) ☆13 · May 4, 2023 · Updated 2 years ago
- PiFlow: Principle-aware Scientific Discovery with Multi-Agent Collaboration ☆42 · Jan 7, 2026 · Updated 2 months ago
- Implementation of the paper: "FedTabDiff: Federated Learning of Diffusion Models for Synthetic Mixed-Type Tabular Data Generation" ☆23 · Nov 10, 2024 · Updated last year
- AI-Driven Research Systems (ADRS) ☆128 · Dec 17, 2025 · Updated 3 months ago
- Logical Message Passing Networks with One-hop Inference in Atomic Formulas (ICLR 2023) ☆15 · Jul 21, 2023 · Updated 2 years ago
- This is a framework for evaluating reasoning in foundational Video Models. ☆81 · Mar 7, 2026 · Updated last week
- Official repository of paper "Parameters vs. Context: Fine-Grained Control of Knowledge Reliance in Language Models" ☆23 · May 27, 2025 · Updated 9 months ago
- GraphAlign: Pretraining One Graph Neural Network on Multiple Graphs via Feature Alignment ☆18 · Sep 17, 2024 · Updated last year
- Modern utility library and typescript typings for building JSON Schema documents ☆14 · Nov 28, 2025 · Updated 3 months ago
- AI Agent-powered web browser: Agentic AI inside the browser. ☆22 · Jul 13, 2025 · Updated 8 months ago
- A Modular Framework for Learning on Multimodal Biomedical Knowledge Graphs ☆18 · Jan 12, 2024 · Updated 2 years ago
- PeRL: Parameter-Efficient Reinforcement Learning ☆73 · Mar 10, 2026 · Updated last week
- A2A agent implementing OpenDeepResearch ☆19 · Apr 14, 2025 · Updated 11 months ago
- single-cell RNA-seq Clustering via Deep Cut-informed Graph ☆14 · Nov 23, 2025 · Updated 3 months ago
- [NeurIPS2024] Attractor memory for long-term time series forecasting: A chaos perspective ☆24 · Nov 22, 2024 · Updated last year
- The first differentially-private diffusion model for tabular data ☆33 · Jun 5, 2024 · Updated last year
- Easy Setup, File-based, Offline Capable Federated Learning and Computations ☆22 · Feb 11, 2026 · Updated last month