showlab / Show-Anything-3D
Edit and Generate Anything in 3D world!
☆13 Updated 2 years ago
Alternatives and similar repositories for Show-Anything-3D:
Users who are interested in Show-Anything-3D are comparing it to the repositories listed below:
- An interactive demo based on Segment-Anything for stroke-based painting which enables human-like painting. ☆35 Updated 2 years ago
- A curated list of papers and resources for text-to-image evaluation. ☆29 Updated last year
- The official repository for CVPRW2024 paper "What’s in a Name? Beyond Class Indices for Image Recognition" ☆12 Updated 7 months ago
- DDS: Delta Denoising Score PyTorch implementation ☆18 Updated last year
- [NeurIPS-24] This is the official implementation of the paper "DeepStack: Deeply Stacking Visual Tokens is Surprisingly Simple and Effect… ☆35 Updated 10 months ago
- Democratising RGBA Image Generation With No $$$ (AI4VA@ECCV24) ☆26 Updated 7 months ago
- Training-free Guidance in Text-to-Video Generation via Multimodal Planning and Structured Noise Initialization ☆18 Updated last week
- Official implementation of "VSTAR: Generative Temporal Nursing for Longer Dynamic Video Synthesis" ☆19 Updated 3 months ago
- Code for paper <Explain Me the Painting: Multi-Topic Knowledgeable Art Description Generation> in ICCV 2021. ☆13 Updated 3 years ago
- Official repo for the TMLR paper "Discffusion: Discriminative Diffusion Models as Few-shot Vision and Language Learners" ☆28 Updated 11 months ago
- ☆10 Updated 9 months ago
- My implementation of the model KosmosG from "KOSMOS-G: Generating Images in Context with Multimodal Large Language Models" ☆14 Updated 5 months ago
- ICME2022 Special Session “Beyond Accuracy: Responsible, Responsive, and Robust Multimedia Retrieval” ☆12 Updated 10 months ago
- ☆19 Updated last year
- [PR 2024] A large Cross-Modal Video Retrieval Dataset with Reading Comprehension ☆26 Updated last year
- Code and Data for Paper: SELMA: Learning and Merging Skill-Specific Text-to-Image Experts with Auto-Generated Data ☆34 Updated last year
- [ICCV 2021] Click to Move: Controlling Video Generation with Sparse Motion ☆11 Updated 2 years ago
- INF-LLaVA: Dual-perspective Perception for High-Resolution Multimodal Large Language Model ☆42 Updated 8 months ago
- ☆20 Updated last month
- ScaleNet: Searching for the Model to Scale (ECCV 2022) ☆12 Updated 2 years ago
- [ICCV 2023] Label-Efficient Online Continual Object Detection in Streaming Video ☆19 Updated last year
- A visual LLM for image region description or QA. ☆15 Updated last year
- [ECCV 2024] This is the official implementation of "Stitched ViTs are Flexible Vision Backbones". ☆27 Updated last year
- [CVPR 2025] DoraCycle: Domain-Oriented Adaptation of Unified Generative Model in Multimodal Cycles ☆22 Updated last month
- A curated list of Text-to-Video Generation papers and BibTeX entries ☆18 Updated last year
- Benchmarking Video-LLMs on Video Spatio-Temporal Reasoning ☆20 Updated 3 weeks ago
- Description and applications of OpenAI's paper about DALL-E (2021) and implementation of other (CLIP-guided) zero-shot text-to-image gene… ☆33 Updated 2 years ago
- ☆14 Updated 2 years ago
- ☆21 Updated last year
- ☆22 Updated 10 months ago