Yuan-ManX / ai-multimodal-timeline
Here we will track the latest AI Multimodal Models, including Multimodal Foundation Models, LLM, Agent, Audio, Image, Video, Music and 3D content.
★36 · Updated 5 months ago
Alternatives and similar repositories for ai-multimodal-timeline
Users who are interested in ai-multimodal-timeline are comparing it to the repositories listed below.
- Implementation of the premier Text to Video model from OpenAI ★57 · Updated 8 months ago
- A streamlined implementation of Grounding DINO and SAM for advanced image segmentation. This lightweight solution simplifies the integrat… ★64 · Updated 9 months ago
- ★16 · Updated last year
- Anim-Director: Controllable Animation Video Generation with Large Models-based Multimodal Agents ★81 · Updated last month
- sd3 dreambooth lora training book, adapted from the diffusers doc ★45 · Updated last year
- Community ComfyUI workflows running on fal.ai ★58 · Updated 10 months ago
- ★13 · Updated last year
- ★16 · Updated 9 months ago
- ★19 · Updated 10 months ago
- Small Multimodal Vision Model "Imp-v1-3b" trained using Phi-2 and Siglip. ★17 · Updated last year
- Live2Diff: A Pipeline that processes Live video streams by a uni-directional video Diffusion model. ★186 · Updated 11 months ago
- ★17 · Updated last year
- A one-stop library to standardize the inference and evaluation of all the conditional video generation models. ★48 · Updated 5 months ago
- ★33 · Updated 5 months ago
- Official PyTorch implementation of TokenSet. ★121 · Updated 3 months ago
- ★24 · Updated last year
- Enhancement in Multimodal Representation Learning. ★40 · Updated last year
- ★29 · Updated last year
- Synthetic data generator for image, video and 3D models ★30 · Updated 11 months ago
- ★35 · Updated 2 years ago
- This is the official page of WikiAutoGen, ICCV 2025 ★15 · Updated 2 weeks ago
- ★13 · Updated last year
- ★11 · Updated last year
- Implementation for the paper "ComfyBench: Benchmarking LLM-based Agents in ComfyUI for Autonomously Designing Collaborative AI Systems". ★177 · Updated 4 months ago
- ★17 · Updated 2 months ago
- Sparse Autoencoders (SAE) vs CLIP fine-tuning fun. ★15 · Updated 6 months ago
- Gradio app to track objects in video and add visual effects ★17 · Updated 2 weeks ago
- ★25 · Updated last year
- ClickDiffusion: Harnessing LLMs for Interactive Precise Image Editing ★69 · Updated last year
- ★46 · Updated last year