[NeurIPS'25 Spotlight] Official implementation of "JavisGPT: A Unified Multi-modal LLM for Sounding-Video Comprehension and Generation"
★69 · Feb 26, 2026 · Updated last month
Alternatives and similar repositories for JavisGPT
Users who are interested in JavisGPT are comparing it to the repositories listed below.
- Animate Any Character in Any World · ★97 · Mar 10, 2026 · Updated 2 weeks ago
- [CVPR 2026] Dataset and Benchmark code for EgoEdit · ★125 · Mar 13, 2026 · Updated last week
- [CVPR 2026] SpaceTimePilot: Generative Rendering of Dynamic Scenes Across Space and Time · ★102 · Jan 1, 2026 · Updated 2 months ago
- [AAAI 2026] UltraGen · ★77 · Feb 1, 2026 · Updated last month
- [ICLR 2026] Light-X: Generative 4D Video Rendering with Camera and Illumination Control · ★173 · Dec 11, 2025 · Updated 3 months ago
- ★86 · Feb 4, 2026 · Updated last month
- [CVPR 2026🔥] Enhancing Spatial Understanding in Image Generation via Reward Modeling · ★79 · Mar 2, 2026 · Updated 3 weeks ago
- DreamStyle: A Unified Framework for Video Stylization · ★115 · Jan 7, 2026 · Updated 2 months ago
- Official repo for paper "IC-Effect: Precise and Efficient Video Effects Editing via In-Context Learning" · ★41 · Jan 29, 2026 · Updated last month
- Resilient multi-LLM orchestration with built-in failure handling, rate limits, retries, and circuit breaker. · ★30 · Mar 20, 2026 · Updated last week
- Official repository of paper "ProEdit: Inversion-based Editing From Prompts Done Right" · ★116 · Feb 5, 2026 · Updated last month
- Official code for SongEcho · ★53 · Mar 3, 2026 · Updated 3 weeks ago
- A real-time streaming conversational video system that transforms text interactions into continuous, high-fidelity video responses using … · ★313 · Dec 15, 2025 · Updated 3 months ago
- SpotEdit: Selective Region Editing in Diffusion Transformers · ★176 · Jan 5, 2026 · Updated 2 months ago
- Code for paper "CLiFT: Compressive Light-Field Tokens for Compute Efficient and Adaptive Neural Rendering" [NeurIPS 2025 (spotlight)] · ★75 · Aug 2, 2025 · Updated 7 months ago
- D2E: Scaling Vision-Action Pretraining on Desktop Data for Transfer to Embodied AI [ICLR 2026] · ★76 · Mar 3, 2026 · Updated 3 weeks ago
- Audio-video joint generation · ★56 · Nov 27, 2025 · Updated 4 months ago
- A Unified Visual Generator with Interleaved OmniModal Context · ★203 · Mar 5, 2026 · Updated 3 weeks ago
- Official repository for the paper "MVP4D: Multi-View Portrait Video Diffusion for Animatable 4D Avatars" · ★41 · Nov 20, 2025 · Updated 4 months ago
- ★324 · Jan 24, 2026 · Updated 2 months ago
- [ICLR 2026] Any-to-Bokeh is a novel one-step video bokeh framework that converts arbitrary input videos into temporally coherent, depth-aw… · ★129 · Feb 4, 2026 · Updated last month
- DreamID-V: Bridging the Image-to-Video Gap for High-Fidelity Face Swapping via Diffusion Transformer · ★583 · Mar 13, 2026 · Updated 2 weeks ago
- ★17 · Sep 23, 2022 · Updated 3 years ago
- Scaling Zero-Shot Reference-to-Video Generation · ★64 · Dec 11, 2025 · Updated 3 months ago
- Official PyTorch Implementation of "SVG-T2I: Scaling up Text-to-Image Latent Diffusion Model Without Variational Autoencoder". · ★138 · Dec 18, 2025 · Updated 3 months ago
- [CVPR 2026] OmniTransfer: All-in-one Framework for Spatio-temporal Video Transfer · ★224 · Feb 21, 2026 · Updated last month
- We propose a novel modular framework that learns to dynamically mix low-rank adapters (LoRAs) to improve visual analogy learning, enablin… · ★71 · Feb 18, 2026 · Updated last month
- End2End Virtual Try-on with Visual Reference, CVPR 2026 · ★58 · Mar 17, 2026 · Updated last week
- A guide to grapheme-to-phoneme conversion and phoneme list for ace singing voice synthesis engine · ★42 · Jan 17, 2025 · Updated last year
- This repository is for The Power of Sound (TPoS): Audio Reactive Video Generation with Stable Diffusion (ICCV 2023) · ★25 · Dec 7, 2023 · Updated 2 years ago
- Implementation of <Streaming Autoregressive Video Generation via Diagonal Distillation> in ICLR 2026 · ★107 · Mar 18, 2026 · Updated last week
- ★189 · Mar 11, 2026 · Updated 2 weeks ago
- Official codes for the paper "GARDO: Reinforcing Diffusion Models without Reward Hacking" · ★55 · Feb 2, 2026 · Updated last month
- ★29 · May 7, 2025 · Updated 10 months ago
- Official implementation of Lotus-2: Advancing Geometric Dense Prediction with Powerful Image Generative Model · ★245 · Dec 8, 2025 · Updated 3 months ago
- [CVPR 2024] This repository includes the official implementation of our paper "Revisiting Adversarial Training at Scale" · ★20 · Apr 21, 2024 · Updated last year
- ★104 · Dec 28, 2025 · Updated 2 months ago
- [CVPR 2026] Video-as-Answer: Predict and Generate Next Video Event with Joint-GRPO · ★102 · Feb 28, 2026 · Updated 3 weeks ago
- Multimodal OCR: Parse Anything from Documents · ★80 · Mar 20, 2026 · Updated last week