FudanCVL / OmniAVS
[ICCV 2025] Towards Omnimodal Expressions and Reasoning in Referring Audio-Visual Segmentation
☆32 Updated 3 weeks ago
Alternatives and similar repositories for OmniAVS
Users who are interested in OmniAVS are comparing it to the repositories listed below
- [NeurIPS 2024] Stabilize the Latent Space for Image Autoregressive Modeling: A Unified Perspective ☆71 Updated 11 months ago
- ☆43 Updated last year
- [NeurIPS 2025] HermesFlow: Seamlessly Closing the Gap in Multimodal Understanding and Generation ☆70 Updated last month
- ICML2025 ☆58 Updated last month
- ☆28 Updated last year
- Transactions on Multimedia (TMM25) ☆16 Updated 6 months ago
- [ICCV2025] TokenBridge: Bridging Continuous and Discrete Tokens for Autoregressive Visual Generation. https://yuqingwang1029.github.io/To… ☆142 Updated 2 months ago
- Code for "VideoRepair: Improving Text-to-Video Generation via Misalignment Evaluation and Localized Refinement" ☆50 Updated 10 months ago
- Official Implementation of "Open-Vocabulary Audio-Visual Semantic Segmentation" [ACM MM 2024 Oral]. ☆34 Updated 11 months ago
- The official repo for "Stepping Stones: A Progressive Training Strategy for Audio-Visual Semantic Segmentation", ECCV 2024 ☆17 Updated last year
- ☆37 Updated 3 months ago
- Official implementation of "STAR: Scale-wise Text-to-image generation via Auto-Regressive representations" ☆38 Updated 7 months ago
- This is the official implementation for ControlVAR. ☆122 Updated 10 months ago
- [NeurIPS 2024] The official implementation of the research paper "FreeLong: Training-Free Long Video Generation with SpectralBlend Temporal Atten… ☆57 Updated 3 months ago
- [NeurIPS'25 Spotlight] Boosting Generative Image Modeling via Joint Image-Feature Synthesis ☆78 Updated this week
- [CVPR 2024] "Towards Robust Audiovisual Segmentation in Complex Environments with Quantization-based Semantic Decomposition" ☆13 Updated last year
- [EMNLP 2024] Preserving Multi-Modal Capabilities of Pre-trained VLMs for Improving Vision-Linguistic Compositionality ☆19 Updated last year
- [ECCV2024] The official implementation of the DiffPNG paper in PyTorch. ☆13 Updated last year
- Official implementation for "Diffusion Model is Secretly a Training-free Open Vocabulary Semantic Segmenter" ☆47 Updated 3 weeks ago
- The official repo for "Ref-AVS: Refer and Segment Objects in Audio-Visual Scenes", ECCV 2024 ☆47 Updated last week
- WeTok: Powerful Discrete Tokenization for High-Fidelity Visual Reconstruction ☆52 Updated last month
- Code for "How far can we go with ImageNet for Text-to-Image generation?" paper ☆93 Updated 2 months ago
- ☆129 Updated this week
- Code release for Ming-UniVision: Joint Image Understanding and Generation with a Continuous Unified Tokenizer ☆104 Updated this week
- Ref-Diff: Zero-shot Referring Image Segmentation with Generative Models ☆19 Updated 4 months ago
- [NeurIPS 2024] COVE: Unleashing the Diffusion Feature Correspondence for Consistent Video Editing ☆24 Updated 10 months ago
- TokLIP: Marry Visual Tokens to CLIP for Multimodal Comprehension and Generation ☆222 Updated 2 months ago
- The official repository of our paper "Reinforcing Video Reasoning with Focused Thinking" ☆26 Updated 4 months ago
- ☆32 Updated 2 weeks ago
- ☆46 Updated 5 months ago