198808xc / Vision-AGI-Survey
A temporary webpage for our survey in AGI for computer vision
☆119 May 4, 2024 Updated last year
Alternatives and similar repositories for Vision-AGI-Survey
Users who are interested in Vision-AGI-Survey are comparing it to the libraries listed below
- [ICME 2023] FlowText: Synthesizing Realistic Scene Text Video with Optical Flow Estimation ☆13 May 13, 2023 Updated 2 years ago
- PyTorch dataset for large-scale data loading ☆13 May 30, 2022 Updated 3 years ago
- Code for ICCV 2023 paper ✨ "StylerDALLE: Language-Guided Style Transfer Using a Vector-Quantized Tokenizer of a Large-Scale Generative Mo… ☆18 Jan 25, 2024 Updated 2 years ago
- ☆20 Jul 1, 2025 Updated 7 months ago
- ☆19 Apr 16, 2025 Updated 10 months ago
- ICCV2023-Diffusion-Papers ☆108 Sep 3, 2023 Updated 2 years ago
- Awesome Lists for Tenure-Track Assistant Professors and PhD students. (Survival guide for assistant professors and PhD students) ☆1,619 Feb 1, 2024 Updated 2 years ago
- Neuroscience Inspired Agent Reasoning Framework ☆28 May 19, 2025 Updated 8 months ago
- Official repo for paper "MiraData: A Large-Scale Video Dataset with Long Durations and Structured Captions" ☆503 Sep 2, 2024 Updated last year
- Video Diffusion State Space Models ☆19 Mar 27, 2024 Updated last year
- ☆21 Jan 17, 2025 Updated last year
- (CVPR2024) MeaCap: Memory-Augmented Zero-shot Image Captioning ☆55 Aug 16, 2024 Updated last year
- (TPAMI 2024) A Survey on Open Vocabulary Learning ☆987 Dec 24, 2025 Updated last month
- Unraveling the Effects of Synthetic Data on End-to-End Autonomous Driving ☆32 Nov 20, 2025 Updated 2 months ago
- LoRAT_pytracking: reproduction of [ECCV2024] LoRAT ☆46 Dec 9, 2024 Updated last year
- ☆547 Nov 7, 2024 Updated last year
- [ECCV-24] This is the official implementation of the paper "SEGIC: Unleashing the Emergent Correspondence for In-Context Segmentation". ☆27 Oct 13, 2024 Updated last year
- LoRA training for the original CogVideoX-5B-I2V model. ☆27 Dec 27, 2024 Updated last year
- ☆23 Jul 5, 2024 Updated last year
- ☆27 Apr 11, 2025 Updated 10 months ago
- Which fellows cited my article? ☆24 Mar 6, 2022 Updated 3 years ago
- ☆27 Jul 20, 2024 Updated last year
- General AI methods for Anything: AnyObject, AnyGeneration, AnyModel, AnyTask, AnyX ☆1,842 Nov 15, 2023 Updated 2 years ago
- [AAAI2025] ChatterBox: Multi-round Multimodal Referring and Grounding, Multimodal, Multi-round dialogues ☆60 May 2, 2025 Updated 9 months ago
- Recent LLM-based CV and related works. Welcome to comment/contribute! ☆873 Mar 8, 2025 Updated 11 months ago
- The official code repository for AAAI 2021 paper CPCGAN: A Controllable 3D Point Cloud Generative Adversarial Network with Semantic Label… ☆26 Jan 9, 2022 Updated 4 years ago
- Consistent Autoregressive Video Generation with Long Context ☆60 Feb 6, 2026 Updated last week
- Painter & SegGPT Series: Vision Foundation Models from BAAI ☆2,592 Dec 6, 2024 Updated last year
- Code for "DreamEdit: Subject-driven Image Editing" (TMLR2023) ☆109 Jan 23, 2024 Updated 2 years ago
- Lion: Kindling Vision Intelligence within Large Language Models ☆51 Jan 25, 2024 Updated 2 years ago
- Reproduction of the first step in the text-to-video model Phenaki. Code and model weights for the Transformer-based autoencoder for video… ☆29 Aug 4, 2023 Updated 2 years ago
- Unofficial implementation of Layer Diffuse in diffusers ☆27 Apr 3, 2024 Updated last year
- (NeurIPS 2022) Self-Supervised Visual Representation Learning with Semantic Grouping ☆97 Mar 10, 2025 Updated 11 months ago
- Official PyTorch implementation of CVPRW 2022 paper "Attention Consistency on Visual Corruptions for Single-Source Domain Generalization" ☆29 Feb 22, 2023 Updated 2 years ago
- [TMLR 2025🔥] A survey for the autoregressive models in vision. ☆786 Nov 8, 2025 Updated 3 months ago
- [Survey] Next Token Prediction Towards Multimodal Intelligence: A Comprehensive Survey ☆477 Jan 17, 2025 Updated last year
- [CVPR 2024] ViT-Lens: Towards Omni-modal Representations ☆190 Feb 3, 2025 Updated last year
- [NeurIPS 2024] Lightweight Frequency Masker for Cross-Domain Few-Shot Semantic Segmentation ☆36 May 22, 2025 Updated 8 months ago
- Official repository of "Interactive Text-to-Image Retrieval with Large Language Models: A Plug-and-Play Approach" (ACL 2024 Oral) ☆34 Mar 24, 2025 Updated 10 months ago