yangbang18 / ZeroNLGView external linksLinks
(TPAMI'2024) ZeroNLG: Aligning and Autoencoding Domains for Zero-Shot Multimodal and Multilingual Natural Language Generation
☆22Aug 8, 2024Updated last year
Alternatives and similar repositories for ZeroNLG
Users that are interested in ZeroNLG are comparing it to the libraries listed below
Sorting:
- [npj Digital Medicine] A multimodal multidomain multilingual medical foundation model for zero shot clinical diagnosis☆17Feb 6, 2025Updated last year
- Exposing Text-Image Inconsistency Using Diffusion Models (ICLR 2024)☆10Jun 15, 2024Updated last year
- ☆13Jul 10, 2024Updated last year
- Code for PromptNet☆16Jan 29, 2025Updated last year
- [EMNLP2024] Benchmark for "Large Language Models Are Poor Clinical Decision-Makers: A Comprehensive Benchmark"☆35Sep 18, 2025Updated 4 months ago
- ☆15Mar 30, 2025Updated 10 months ago
- GPT-4V(ision) as A Social Media Analysis Engine☆38Dec 20, 2024Updated last year
- ☆27Mar 3, 2025Updated 11 months ago
- Official repository for LLaVA-Reward (ICCV 2025): Multimodal LLMs as Customized Reward Models for Text-to-Image Generation☆23Jul 30, 2025Updated 6 months ago
- [CVPR 2025] GPS as a Control Signal for Image Generation☆25Mar 18, 2025Updated 10 months ago
- Video Diffusion Transformers are In-Context Learners☆36Jan 6, 2025Updated last year
- Code for Paper 'Redefining Temporal Modeling in Video Diffusion: The Vectorized Timestep Approach'☆34Jan 2, 2026Updated last month
- Reward Guided Latent Consistency Distillation☆26Oct 9, 2024Updated last year
- Towards a Unified View on Visual Parameter-Efficient Transfer Learning☆26Oct 13, 2022Updated 3 years ago
- [ACL 2025] The official pytorch implement of "MIND: A Multi-agent Framework for Zero-shot Harmful Meme Detection".☆26May 26, 2025Updated 8 months ago
- Code and Data for Paper: SELMA: Learning and Merging Skill-Specific Text-to-Image Experts with Auto-Generated Data☆35Mar 12, 2024Updated last year
- [NeurIPS 2025] VideoRFT: Incentivizing Video Reasoning Capability in MLLMs via Reinforced Fine-Tuning☆64Jan 6, 2026Updated last month
- (ACL'2023) MultiCapCLIP: Auto-Encoding Prompts for Zero-Shot Multilingual Visual Captioning☆36Aug 8, 2024Updated last year
- [ICML 2024] Official Repository for the paper "Transformers Get Stable: An End-to-End Signal Propagation Theory for Language Models"☆10Jul 19, 2024Updated last year
- PyTorch code for "Contrastive Region Guidance: Improving Grounding in Vision-Language Models without Training"☆39Mar 4, 2024Updated last year
- ☆43May 30, 2025Updated 8 months ago
- Official PyTorch implementation of The Linear Attention Resurrection in Vision Transformer☆15Sep 7, 2024Updated last year
- A Benchmark Platform for Reinforcement Learning Based Dynamic Treatment Regime☆13Dec 7, 2024Updated last year
- A framework for steering MoE models by detecting and controlling behavior-linked experts.☆29Sep 12, 2025Updated 5 months ago
- Official code for CustAny: Customizing Anything from A Single Example. Accepted by CVPR2025 (Oral)☆48Apr 10, 2025Updated 10 months ago
- ☆47Apr 20, 2025Updated 9 months ago
- [NAACL 2024] LaDiC: Are Diffusion Models Really Inferior to Autoregressive Counterparts for Image-to-text Generation?☆43Jun 9, 2024Updated last year
- Simple Telegram bot to annotate and varify automatic speech recognition datasets☆12Mar 30, 2021Updated 4 years ago
- ☆17May 15, 2025Updated 8 months ago
- [ICCV 2023] Code for "Multi-task View Synthesis with Neural Radiance Fields"☆11Oct 2, 2023Updated 2 years ago
- A library of speech gadgets.☆14Oct 15, 2022Updated 3 years ago
- Code for ACL 2023 main conference paper "Back Translation for Speech-to-text Translation Without Transcripts".☆12Oct 25, 2023Updated 2 years ago
- [AAAI 2025] Official pytorch implementation of "Diffusion Model Patching via Mixture-of-Prompts"☆13Dec 12, 2024Updated last year
- A PyTorch implementation of the paper "MMoT: Mixture-of-Modality-Tokens Transformer for Composed Multimodal Conditional Image Synthesis".☆12Jan 16, 2023Updated 3 years ago
- [ICLR 2026] Official repo for "FrameThinker: Learning to Think with Long Videos via Multi-Turn Frame Spotlighting"☆37Oct 9, 2025Updated 4 months ago
- PyTorch Implementation for InMaP☆11Oct 28, 2023Updated 2 years ago
- ☆11Nov 30, 2025Updated 2 months ago
- ☆14May 20, 2025Updated 8 months ago
- [ICLR 2025] This repo is the official implementation of "The Labyrinth of Links: Navigating the Associative Maze of Multi-modal LLMs".☆13Jan 25, 2025Updated last year