Yuan-ManX / ai-multimodal-timeline
Here we will track the latest AI Multimodal Models, including Multimodal Foundation Models, LLM, Agent, Audio, Image, Video, Music and 3D content. 🔥
☆36 Updated 4 months ago
Alternatives and similar repositories for ai-multimodal-timeline
Users who are interested in ai-multimodal-timeline are comparing it to the libraries listed below.
- A streamlined implementation of Grounding DINO and SAM for advanced image segmentation. This lightweight solution simplifies the integrat… ☆64 Updated 8 months ago
- Anim-Director: Controllable Animation Video Generation with Large Models-based Multimodal Agents ☆77 Updated last week
- Official PyTorch implementation of TokenSet. ☆121 Updated 3 months ago
- ☆71 Updated 8 months ago
- ☆25 Updated 6 months ago
- ☆84 Updated 10 months ago
- Community ComfyUI workflows running on fal.ai ☆57 Updated 9 months ago
- DiT for VAE (and Video Generation) ☆33 Updated 9 months ago
- sd3 dreambooth lora training book, adapted from the diffusers doc ☆45 Updated last year
- Official PyTorch Implementation for Dual-Process Image Generation ☆69 Updated 2 weeks ago
- An official implementation of SwapAnyone. ☆62 Updated 3 months ago
- A one-stop library to standardize the inference and evaluation of all the conditional video generation models. ☆48 Updated 4 months ago
- Official implementation of MagicFace: Training-free Universal-Style Human Image Customized Synthesis. ☆63 Updated 6 months ago
- TextCrafter: Accurately Rendering Multiple Texts in Complex Visual Scenes ☆66 Updated 2 months ago
- An open source community implementation of the model from the paper: "Movie Gen: A Cast of Media Foundation Models". Join our community … ☆60 Updated this week
- Inference-time scaling of diffusion-based image and video generation models. ☆151 Updated 3 months ago
- Implementation of "SCEdit: Efficient and Controllable Image Diffusion Generation via Skip Connection Editing" ☆86 Updated last year
- Collection of scripts to build small-scale datasets for fine-tuning video generation models. ☆62 Updated 3 months ago
- ☆32 Updated 5 months ago
- ☆25 Updated last year
- ☆24 Updated last year
- Pytorch implementation of MIMO, Controllable Character Video Synthesis with Spatial Decomposed Modeling, from Alibaba Intelligence Group ☆133 Updated 8 months ago
- Fine-tune of Florence-2 for shot categorization. ☆24 Updated 3 months ago
- ☆9 Updated last year
- My implementation of the model KosmosG from "KOSMOS-G: Generating Images in Context with Multimodal Large Language Models" ☆15 Updated 7 months ago
- A minimalistic, hackable code base to finetune Wan video generation model ☆40 Updated 2 months ago
- ☆17 Updated 2 months ago
- 🔥 Official impl. of "DetailFlow: 1D Coarse-to-Fine Autoregressive Image Generation via Next-Detail Prediction" ☆111 Updated last week
- ☆17 Updated last year
- Official Implementation for paper: Negative Token Merging: Image-based Adversarial Feature Guidance ☆75 Updated 4 months ago