VisualToolAgent (VisTA): A Reinforcement Learning Framework for Visual Tool Selection
☆25May 31, 2025Updated 9 months ago
Alternatives and similar repositories for VisTA
Users who are interested in VisTA are comparing it to the libraries listed below.
- Noise of Web (NoW) is a challenging noisy correspondence learning (NCL) benchmark containing 100K image-text pairs for robust image-text …☆14Nov 20, 2025Updated 3 months ago
- Less is More: Mitigating Multimodal Hallucination from an EOS Decision Perspective (ACL 2024)☆57Oct 28, 2024Updated last year
- Code for ReFocus: Visual Editing as a Chain of Thought for Structured Image Understanding [ICML 2025]☆45Jul 22, 2025Updated 7 months ago
- Code for the ICLR'24 ME-FoMo workshop paper "How Well Does GPT-4V(ision) Adapt to Distribution Shifts? A Preliminary Investigation"☆38Oct 18, 2024Updated last year
- [DMLR 2024] Benchmarking Robustness of Multimodal Image-Text Models under Distribution Shift☆38Jan 25, 2024Updated 2 years ago
- Building a multi-agent RAG system with advanced RAG methods☆12Jan 12, 2025Updated last year
- Automatic stabilizing and auto-piloting system for RC flying wing☆14Mar 3, 2016Updated 10 years ago
- A large scale inpainting & t2i anime image dataset☆14Oct 18, 2025Updated 4 months ago
- Directed masked autoencoders☆14Feb 20, 2026Updated last week
- Official repository for the paper "On the use of Benford's law to detect GAN-generated images", ICPR2020☆13Apr 7, 2021Updated 4 years ago
- ☆10Apr 7, 2025Updated 10 months ago
- ☆25Aug 19, 2025Updated 6 months ago
- OpenSRH is the first ever publicly available stimulated Raman histology (SRH) dataset and benchmark, which will facilitate the clinical t…☆13Oct 13, 2022Updated 3 years ago
- 2024 Sixth Global Campus Artificial Intelligence Algorithm Elite Competition: AI-generated face image detection☆15May 30, 2025Updated 9 months ago
- ☆21Aug 8, 2025Updated 6 months ago
- 💀 gigasmol: a lightweight wrapper for gigachat api model for seamless use with smolagents.☆15Oct 23, 2025Updated 4 months ago
- A simple exam generator and grader written in Python with OpenCV☆14Jan 14, 2026Updated last month
- Perception Matters: Exploring Imperceptible and Transferable Anti-forensics for GAN-generated Fake Face Imagery Detection☆11Jan 23, 2023Updated 3 years ago
- Code for paper: Reinforced Vision Perception with Tools☆71Oct 3, 2025Updated 5 months ago
- MMM 2021: Crossed-Time Delay Neural Network for Speaker Recognition☆11Dec 4, 2021Updated 4 years ago
- [AAAI 2025 Oral] ODDN: Addressing Unpaired Data Challenges in Open-World Deepfake Detection on Online Social Networks https://arxiv.org/…☆10Jun 25, 2025Updated 8 months ago
- This repo has scripts to compare various powerful RL methods☆33Feb 23, 2026Updated last week
- [ECCV 2024] Official Code for our paper "Reshaping the Online Data Buffering and Organizing Mechanism for Continual Test-Time Adaptation"…☆12Mar 24, 2025Updated 11 months ago
- ☆12Jul 24, 2024Updated last year
- [JMLR] Gradual Domain Adaptation: Theory and Algorithms☆11Jan 14, 2025Updated last year
- The GPT-4 function calls used in everchanging quest for the HF game jam☆10Jul 9, 2023Updated 2 years ago
- Python client to integrate Cleanlab Codex with your AI Agent☆19Nov 19, 2025Updated 3 months ago
- [IJCAI'25 Workshop Oral] The 1st place solution of IJCAI 2025 challenge track 1: Image Detection and Localization☆34Dec 4, 2025Updated 3 months ago
- Official code for the paper "Adversarial Magnification to Deceive Deepfake Detection through Super Resolution"☆12Jun 26, 2023Updated 2 years ago
- [CVPR'25] Official code of paper "Mimic In-Context Learning for Multimodal Tasks"☆24Jun 8, 2025Updated 8 months ago
- Wave - The Software as a Service Starter Kit, designed to help you build the SAAS of your dreams 🚀 💰☆12Jan 30, 2026Updated last month
- Risky Object Localization (ROL) in a Driving Scene Dataset☆15Dec 24, 2023Updated 2 years ago
- Code accompanying the 2022 DLS paper "Misleading Deep-Fake Detection with GAN Fingerprints"☆10May 26, 2022Updated 3 years ago
- mouse pet-ct image segmentation☆12Feb 19, 2023Updated 3 years ago
- ☆11Apr 3, 2023Updated 2 years ago
- Official repository of "Beyond Spatial Frequency: Pixel-wise Temporal Frequency-based Deepfake Video Detection" [ICCV 2025]☆20Jan 17, 2026Updated last month
- ☆28Jan 5, 2026Updated 2 months ago
- Surrogate Modeling of the Aerodynamic Performance for Transonic Regime☆13Feb 12, 2024Updated 2 years ago
- Code for "SCL-RAI: Span-based Contrastive Learning with Retrieval Augmented Inference for Unlabeled Entity Problem in NER" @COLING-2022☆11Aug 20, 2022Updated 3 years ago