Yuan-ManX / ai-multimodal-timeline
Here we will track the latest AI Multimodal Models, including Multimodal Foundation Models, LLM, Agent, Audio, Image, Video, Music and 3D content. 🔥
☆36 Updated 4 months ago
Alternatives and similar repositories for ai-multimodal-timeline
Users that are interested in ai-multimodal-timeline are comparing it to the libraries listed below
- ☆22 Updated 5 months ago
- ☆13 Updated last year
- The official GitHub Page for MiniMax ☆35 Updated last week
- A minimalistic, hackable code base to finetune Wan video generation model ☆39 Updated last month
- ☆19 Updated 9 months ago
- ☆29 Updated last year
- ☆30 Updated last year
- Official implementation of MagicFace: Training-free Universal-Style Human Image Customized Synthesis. ☆63 Updated 5 months ago
- Community ComfyUI workflows running on fal.ai ☆57 Updated 9 months ago
- Incredibly descriptive audiovisual summaries for videos ☆41 Updated 10 months ago
- ☆84 Updated 9 months ago
- ☆12 Updated 7 months ago
- A one-stop library to standardize the inference and evaluation of all the conditional video generation models. ☆48 Updated 3 months ago
- ☆25 Updated last year
- The code for the SIGGRAPH Asia 2024 paper "Anim-Director: A Large Multimodal Model Powered Agent for Controllable Animation Video Generation" ☆54 Updated last month
- InteractiveVideo: User-Centric Controllable Video Generation with Synergistic Multimodal Instructions ☆128 Updated last year
- Paper: "From Text to Pose to Image: Improving Diffusion Model Control and Quality" ☆51 Updated 6 months ago
- A streamlined implementation of Grounding DINO and SAM for advanced image segmentation. This lightweight solution simplifies the integrat… ☆64 Updated 8 months ago
- Scripts to teach Flux the task of image editing from language with the Flux Control framework. ☆82 Updated 2 months ago
- ☆17 Updated 4 months ago
- ☆25 Updated last year
- ☆9 Updated last year
- My implementation of the model KosmosG from "KOSMOS-G: Generating Images in Context with Multimodal Large Language Models" ☆13 Updated 6 months ago
- Fine-tune of Florence-2 for shot categorization. ☆24 Updated 3 months ago
- DMM: Building a Versatile Image Generation Model via Distillation-Based Model Merging ☆43 Updated last month
- ClickDiffusion: Harnessing LLMs for Interactive Precise Image Editing ☆69 Updated last year
- Multimodal Representation Alignment for Image Generation: Text-Image Interleaved Control Is Easier Than You Think!