fudan-zvg / S-Agents
Official repository of S-Agents: Self-organizing Agents in Open-ended Environment
☆21 Updated 10 months ago
Alternatives and similar repositories for S-Agents:
Users who are interested in S-Agents are comparing it to the repositories listed below
- ☆32 Updated last month
- Official implementation of "Self-Improving Video Generation" ☆58 Updated last month
- [NeurIPS-2024] The official implementation of "Instruction-Guided Visual Masking" ☆32 Updated 2 months ago
- [NeurIPSw'24] This repo is the official implementation of "MineDreamer: Learning to Follow Instructions via Chain-of-Imagination for Simu… ☆77 Updated this week
- ☆48 Updated 4 months ago
- [ICLR 2025] AuroraCap: Efficient, Performant Video Detailed Captioning and a New Benchmark ☆66 Updated last week
- ElasticTok: Adaptive Tokenization for Image and Video ☆49 Updated 2 months ago
- ☆24 Updated 7 months ago
- This repo contains evaluation code for the paper "AV-Odyssey: Can Your Multimodal LLMs Really Understand Audio-Visual Information?" ☆19 Updated last month
- [NeurIPS 2024] Official code for HourVideo: 1-Hour Video Language Understanding ☆62 Updated 3 weeks ago
- PhysGame Benchmark for Physical Commonsense Evaluation in Gameplay Videos ☆36 Updated last month
- ☆68 Updated 6 months ago
- Official implementation of the paper "MMInA: Benchmarking Multihop Multimodal Internet Agents" ☆41 Updated this week
- This repository is a collection of research papers on World Models. ☆37 Updated last year
- Official PyTorch Implementation of the Longhorn Deep State Space Model ☆46 Updated last month
- The official implementation for "MonoFormer: One Transformer for Both Diffusion and Autoregression" ☆81 Updated 3 months ago
- NeuMeta transforms neural networks by allowing a single model to adapt on the fly to different sizes, generating the right weights when n… ☆39 Updated 2 months ago
- [NeurIPS 2024] Official implementation of the paper "Interfacing Foundation Models' Embeddings" ☆118 Updated 5 months ago
- ☆22 Updated last month
- [ICLR 2025] Video-STaR: Self-Training Enables Video Instruction Tuning with Any Supervision ☆58 Updated 6 months ago
- ☆13 Updated 2 months ago
- [ECCV 2024] STEVE in Minecraft is for See and Think: Embodied Agent in Virtual Environment ☆35 Updated last year
- IMProv: Inpainting-based Multimodal Prompting for Computer Vision Tasks ☆59 Updated 4 months ago
- ☆33 Updated last year
- [ECCV 2024] AdaNAT: Exploring Adaptive Policy for Token-Based Image Generation ☆32 Updated 4 months ago
- ☆34 Updated 3 weeks ago
- A big_vision inspired repo that implements a generic Auto-Encoder class capable of representation learning and generative modeling. ☆34 Updated 7 months ago
- VL-GPT: A Generative Pre-trained Transformer for Vision and Language Understanding and Generation ☆85 Updated 4 months ago