AstraZeneca / vlm
Official implementation for "Diffusion Instruction Tuning"
☆30Updated 4 months ago
Alternatives and similar repositories for vlm
Users that are interested in vlm are comparing it to the libraries listed below
- The official repo for LIFT: Language-Image Alignment with Fixed Text Encoders☆36Updated 4 months ago
- [NeurIPS'25 Spotlight] Boosting Generative Image Modeling via Joint Image-Feature Synthesis☆78Updated this week
- ☆55Updated 5 months ago
- Official Pytorch implementation of "Vision Transformers Don't Need Trained Registers" (NeurIPS '25 Spotlight)☆115Updated last month
- the official repo for "D-AR: Diffusion via Autoregressive Models"☆117Updated 3 months ago
- ☆70Updated 11 months ago
- [ICML 2024] This repository includes the official implementation of our paper "Rejuvenating image-GPT as Strong Visual Representation Lea…☆98Updated last year
- Official Implementation of LaViDa: A Large Diffusion Language Model for Multimodal Understanding☆158Updated 3 months ago
- Autoregressive Image Generation with Randomized Parallel Decoding☆77Updated 6 months ago
- Implementation for "The Scalability of Simplicity: Empirical Analysis of Vision-Language Learning with a Single Transformer"☆65Updated last month
- Official repository for ReasonGen-R1☆68Updated 3 months ago
- Official implementation of Bifrost-1: Bridging Multimodal LLMs and Diffusion Models with Patch-level CLIP Latents (NeurIPS 2025)☆40Updated 3 weeks ago
- [NeurIPS 2025] HermesFlow: Seamlessly Closing the Gap in Multimodal Understanding and Generation☆70Updated 3 weeks ago
- [CVPR 2025] Exploring the Deep Fusion of Large Language Models and Diffusion Transformers for Text-to-Image Synthesis☆122Updated 5 months ago
- [Preprint] UCGM: Unified Continuous Generative Models☆168Updated 4 months ago
- [NeurIPS 2024] TransAgent: Transfer Vision-Language Foundation Models with Heterogeneous Agent Collaboration☆24Updated last year
- [CVPR2025] Breaking the Low-Rank Dilemma of Linear Attention☆29Updated 7 months ago
- ☆60Updated 3 months ago
- ☆129Updated this week
- Code for the paper DisCo-Diff: Enhancing Continuous Diffusion Models with Discrete Latents, ICML 2024☆89Updated last year
- [NeurIPS 2025 Oral] Exploring Diffusion Transformer Designs via Grafting☆60Updated 4 months ago
- FaceXBench: Evaluating Multimodal LLMs on Face Understanding☆15Updated 8 months ago
- Task Preference Optimization: Improving Multimodal Large Language Models with Vision Task Alignment☆60Updated 2 months ago
- ☆72Updated 2 months ago
- Official PyTorch Implementation of "Scalable Autoregressive Image Generation with Mamba"☆140Updated 9 months ago
- [NeurIPS 2024] Official implementation of the paper "Interfacing Foundation Models' Embeddings"☆125Updated last year
- Code release for "SegLLM: Multi-round Reasoning Segmentation"☆117Updated 7 months ago
- Codebase for the paper "Elucidating the design space of language models for image generation"☆46Updated 11 months ago
- This repository provides an improved LLamaGen Model, fine-tuned on 500,000 high-quality images, each accompanied by over 300 token prompt…☆30Updated 11 months ago
- Official implementation of Next Block Prediction: Video Generation via Semi-Autoregressive Modeling☆39Updated 8 months ago