This repo contains the official PyTorch implementation of vLMIG: Improving Visual Commonsense in Language Models via Multiple Image Generation
☆17Jul 1, 2024Updated last year
Alternatives and similar repositories for vLMIG
Users that are interested in vLMIG are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.
Sorting:
- The official repo of the paper "StressTest: Can YOUR Speech LM Handle the Stress?"☆20Jul 9, 2025Updated 8 months ago
- Official implementation of "Dataset Size Recovery from LoRA Weights" paper.☆34Jun 30, 2024Updated last year
- This repo contains the official PyTorch implementation of "Analyzing Discrete Self Supervised Speech Representation For Spoken Language M…☆20Jan 3, 2023Updated 3 years ago
- ☆19Jan 8, 2025Updated last year
- This repo contains the official PyTorch implementation of AudioToken: Adaptation of Text-Conditioned Diffusion Models for Audio-to-Image …☆88Jun 18, 2024Updated last year
- [AAAI 2025] Official Implementation for "Click2Mask: Local Editing with Dynamic Mask Generation" Paper.☆20Jan 22, 2026Updated 2 months ago
- This repo contains the official PyTorch implementation of "A Systematic Comparison of Phonetic Aware Techniques for Speech Enhancement" (…☆28Aug 8, 2022Updated 3 years ago
- Code for our paper: "Where's Waldo: Diffusion Features For Personalized Segmentation and Retrieval".☆15Feb 26, 2025Updated last year
- ☆16Sep 6, 2024Updated last year
- Official PyTorch Implementation for the "Recovering the Pre-Fine-Tuning Weights of Generative Models" paper (ICML 2024).☆85Apr 15, 2025Updated 11 months ago
- Official repository for "Speaking Style Conversion With Discrete Self-Supervised Units" (EMNLP 2023). https://arxiv.org/abs/2212.09730☆131Dec 8, 2023Updated 2 years ago
- An official implementation of ProbeGen☆13Oct 20, 2024Updated last year
- TEAL: New Selection Strategy for Small Buffers in Experience Replay Class Incremental Learning☆17Jan 21, 2025Updated last year
- Official This-Is-My Dataset published in CVPR 2023☆16Jul 18, 2024Updated last year
- Official PyTorch Implementation for the "Unsupervised Model Tree Heritage Recovery" paper (ICLR 2025).☆63Jul 1, 2025Updated 8 months ago
- Official implementation of "DGD: Dynamic 3D Gaussians Distillation".☆69Aug 16, 2024Updated last year
- Automatic Measurement of Vowel Duration for Consonant Vowel Consonant (CVC) sound files (JASA 2016)☆14Feb 25, 2017Updated 9 years ago
- ☆16Jun 14, 2024Updated last year
- This repo contains the official PyTorch implementation of "Audio Super Resolution in the Spectral Domain" (ICASSP 2023)☆239May 1, 2025Updated 10 months ago
- Official repository for NAST: Noise Aware Speech Tokenization for Speech Language Models (Interspeech 2024) https://arxiv.org/abs/2406.11…☆46Jul 2, 2024Updated last year
- This repo contains the official PyTorch implementation of: Diverse and Aligned Audio-to-Video Generation via Text-to-Video Model Adaptati…☆127Feb 13, 2025Updated last year
- An official PyTorch implementation for CLIPPR☆30Jul 22, 2023Updated 2 years ago
- Code for our papers : "Generating images of rare concepts using pre-trained diffusion models" (AAAI 24) and "Norm-guided latent space exp…☆87Dec 27, 2023Updated 2 years ago
- Official implemention of "Make It Count: Text-to-Image Generation with an Accurate Number of Objects" (CVPR 2025)☆97Mar 12, 2025Updated last year
- ☆16Jul 23, 2024Updated last year
- ☆12Apr 21, 2025Updated 11 months ago
- rmp data ranking☆13Nov 4, 2025Updated 4 months ago
- DSTC8-AVSD: Sentence generation task for Audio Visual Scene-aware Dialog☆14Jun 10, 2021Updated 4 years ago
- TREE-G: Decision Trees Contesting Graph Neural Networks, specialized for graph data.☆13Feb 28, 2024Updated 2 years ago
- Code for Commonsense-T2I Challenge: Can Text-to-Image Generation Models Understand Commonsense? [COLM 2024]☆24Aug 13, 2024Updated last year
- ☆16Dec 22, 2017Updated 8 years ago
- Official implementation of MARS: Mixture of Auto-Regressive Models for Fine-grained Text-to-image Synthesis☆86Jul 16, 2024Updated last year
- Real-Time Deepfake Detection in the Real-World☆47Nov 30, 2024Updated last year
- [MM 2024 Oral] Refiner for AIGC☆29Jul 29, 2024Updated last year
- ☆18Jun 10, 2024Updated last year
- The code base for the article "Neural Structural Correspondence Learning for Domain Adaptation", CoNLL 2017☆14Jun 20, 2018Updated 7 years ago
- Pytorch implementation of HyperLLaVA: Dynamic Visual and Language Expert Tuning for Multimodal Large Language Models☆28Mar 22, 2024Updated 2 years ago
- The official repo of continuous speculative decoding☆32Mar 28, 2025Updated 11 months ago
- Official codes for the paper "Learning Hierarchical Discrete Linguistic Units from Visually-Grounded Speech"☆28Feb 22, 2022Updated 4 years ago