Official repository for the paper "Beyond Static Tools: Test-Time Tool Evolution for Scientific Reasoning" and the SciEvo benchmark.
☆44 · Jan 13, 2026 · Updated 5 months ago
Alternatives and similar repositories for Test-Time-Tool-Evol
Users who are interested in Test-Time-Tool-Evol are comparing it to the libraries listed below.
- ☆38 · Nov 11, 2025 · Updated 7 months ago
- Source code for SWIFT, an efficient reward model. ☆21 · Jan 13, 2026 · Updated 5 months ago
- Dynaseal is a dynamic API key management system designed to secure communications and identity verification for large model services. It … ☆12 · Oct 30, 2024 · Updated last year
- Reasoning Activation in LLMs via Small Model Transfer (NeurIPS 2025) ☆22 · Oct 16, 2025 · Updated 8 months ago
- We introduce Reasoning via Video, a new paradigm that uses maze-solving video generation to probe multimodal reasoning; our VR-Bench show… ☆65 · Feb 4, 2026 · Updated 4 months ago
- Advantage Alignment Algorithms (ICLR 2025 oral) ☆20 · Apr 7, 2025 · Updated last year
- ☆54 · Mar 8, 2026 · Updated 3 months ago
- ☆18 · Aug 7, 2025 · Updated 10 months ago
- Time-RA: Towards Time Series Reasoning for Anomaly with LLM Feedback ☆23 · Jan 10, 2026 · Updated 5 months ago
- Implementation of the paper "In-context Time Series Predictor" (ICLR 2025) ☆16 · Feb 11, 2025 · Updated last year
- [CVPR 2026] MergeVLA: Cross-Skill Model Merging Toward a Generalist Vision-Language-Action Agent ☆34 · Apr 30, 2026 · Updated 2 months ago
- [RA-L 2026] Official code repository for "CLARE: Continual Learning for Vision-Language-Action Models via Autonomous Adapter Routing and … ☆43 · Apr 11, 2026 · Updated 2 months ago
- The code implementation for TTCS: Test-Time Curriculum Synthesis for Self-Evolving. ☆50 · Apr 22, 2026 · Updated 2 months ago
- ☆38 · Nov 15, 2025 · Updated 7 months ago
- FakePartsBench: 25K+ AI-generated videos with pixel- and frame-level annotations of full and partial deepfakes. ☆25 · May 29, 2026 · Updated last month
- The src for Paper "Frequency-aware Generative Models for Multivariate Time Series Imputation" ☆16 · May 22, 2024 · Updated 2 years ago
- Official implementation of 'All in One and One for All: A Simple yet Effective Method towards Cross-domain Graph Pretraining' published i… ☆47 · Oct 23, 2024 · Updated last year
- ☆17 · Apr 11, 2025 · Updated last year
- Reasoning or Memorization? Unreliable Results of Reinforcement Learning Due to Data Contamination. ☆21 · Jul 18, 2025 · Updated 11 months ago
- ☆18 · Aug 14, 2024 · Updated last year
- ☆21 · Apr 15, 2025 · Updated last year
- 6,080-param transformer achieving 100% accuracy on 10-digit addition. Trained from scratch in 10 minutes. ☆22 · Feb 19, 2026 · Updated 4 months ago
- The code is for our AAAI2023 paper: Efficient Embeddings of Logical Variables for Query Answering over Incomplete Knowledge Graphs (Ding… ☆10 · Dec 17, 2022 · Updated 3 years ago
- ☆20 · Dec 30, 2025 · Updated 6 months ago
- Experimental interface environment for open source LLM, designed to democratize the use of AI. Powered by llama-cpp, llama-cpp-python and… ☆18 · Oct 11, 2025 · Updated 8 months ago
- ☆57 · Mar 3, 2026 · Updated 3 months ago
- Wasserstein-Fisher-Rao Embedding: Logical Query Embeddings with Local Comparison and Global Transport (Findings-ACL 2023) ☆13 · May 4, 2023 · Updated 3 years ago
- the code of MoG ☆22 · Aug 6, 2024 · Updated last year
- AIRS-Bench: an AI Research Science benchmark for quantifying the end-to-end AI research abilities of LLM agents ☆99 · May 5, 2026 · Updated last month
- Implementation of the paper: "FedTabDiff: Federated Learning of Diffusion Models for Synthetic Mixed-Type Tabular Data Generation" ☆23 · Nov 10, 2024 · Updated last year
- Logical Message Passing Networks with One-hop Inference in Atomic Formulas (ICLR 2023) ☆15 · Jul 21, 2023 · Updated 2 years ago
- This repository contains the replication of the iGSM dataset generation process from the Physics of LLM paper by Zeyuan Zhu. ☆17 · Sep 13, 2024 · Updated last year
- Official Implementation of Half-Hop ☆20 · Oct 10, 2023 · Updated 2 years ago
- ☆34 · Jul 15, 2025 · Updated 11 months ago
- ☆14 · Oct 29, 2020 · Updated 5 years ago
- Official repository of paper "Parameters vs. Context: Fine-Grained Control of Knowledge Reliance in Language Models" ☆26 · May 27, 2025 · Updated last year
- GraphAlign: Pretraining One Graph Neural Network on Multiple Graphs via Feature Alignment ☆18 · Sep 17, 2024 · Updated last year
- AI Training Chip ☆13 · Jan 4, 2022 · Updated 4 years ago
- A Modular Framework for Learning on Multimodal Biomedical Knowledge Graphs ☆18 · Jan 12, 2024 · Updated 2 years ago