fudan-zvg / S-Agents
Official repository of S-Agents: Self-organizing Agents in Open-ended Environment
☆26 Updated last year
Alternatives and similar repositories for S-Agents
Users who are interested in S-Agents are comparing it to the repositories listed below
- Official Code Repository for EnvGen: Generating and Adapting Environments via LLMs for Training Embodied Agents (COLM 2024) ☆34 Updated last year
- ☆33 Updated 2 years ago
- Code for “Pretrained Language Models as Visual Planners for Human Assistance” ☆61 Updated 2 years ago
- ☆38 Updated last year
- Multimodal RewardBench ☆42 Updated 4 months ago
- Official implementation of the paper "MMInA: Benchmarking Multihop Multimodal Internet Agents" ☆46 Updated 4 months ago
- [NeurIPS-24] This is the official implementation of the paper "DeepStack: Deeply Stacking Visual Tokens is Surprisingly Simple and Effect… ☆37 Updated last year
- [IROS'25 Oral & NeurIPSw'24] Official implementation of "MineDreamer: Learning to Follow Instructions via Chain-of-Imagination for Simula… ☆91 Updated last month
- ☆45 Updated 6 months ago
- [NeurIPS2023] Official implementation of the paper "Large Language Models are Visual Reasoning Coordinators" ☆105 Updated last year
- This repo contains the code for "MEGA-Bench Scaling Multimodal Evaluation to over 500 Real-World Tasks" [ICLR2025] ☆71 Updated 2 weeks ago
- PhysGame Benchmark for Physical Commonsense Evaluation in Gameplay Videos ☆45 Updated 2 weeks ago
- Official implementation of "Self-Improving Video Generation" ☆67 Updated 2 months ago
- Official Implementation of ARPO: End-to-End Policy Optimization for GUI Agents with Experience Replay ☆89 Updated last month
- ☆50 Updated last year
- [NeurIPS-2024] The official Implementation of "Instruction-Guided Visual Masking" ☆35 Updated 8 months ago
- ☆36 Updated last year
- ☆48 Updated last month
- Official repository for the General Robust Image Task (GRIT) Benchmark ☆54 Updated 2 years ago
- Official PyTorch Implementation for Vision-Language Models Create Cross-Modal Task Representations, ICML 2025 ☆27 Updated 2 months ago
- [ICLR 2024] Trajectory-as-Exemplar Prompting with Memory for Computer Control ☆59 Updated 6 months ago
- Official implementation of "Gemini in Reasoning: Unveiling Commonsense in Multimodal Large Language Models" ☆36 Updated last year
- ☆29 Updated last year
- Official Code for ACL 2023 Outstanding Paper: World-to-Words: Grounded Open Vocabulary Acquisition through Fast Mapping in Vision-Languag… ☆32 Updated last year
- Generalizing from SIMPLE to HARD Visual Reasoning: Can We Mitigate Modality Imbalance in VLMs? ☆14 Updated last month
- [NeurIPS 2024 D&B] VideoGUI: A Benchmark for GUI Automation from Instructional Videos ☆40 Updated last month
- Recursive Visual Programming (ECCV 2024) ☆17 Updated 7 months ago
- Official Implementation of "JARVIS-VLA: Post-Training Large-Scale Vision Language Models to Play Visual Games with Keyboards and Mouse" ☆84 Updated last month
- ☆64 Updated 3 weeks ago
- [ICLR 2025] Video-STaR: Self-Training Enables Video Instruction Tuning with Any Supervision ☆65 Updated last year