haoliuhl / language-quantized-autoencoders
Language Quantized AutoEncoders (★107, updated 2 years ago)
Alternatives and similar repositories for language-quantized-autoencoders
Users interested in language-quantized-autoencoders are comparing it to the libraries listed below.
- Implementation of Mirasol, a SOTA multimodal autoregressive model out of Google DeepMind, in PyTorch (★89, updated last year)
- (no description) (★120, updated 2 years ago)
- [arXiv] Aligning Modalities in Vision Large Language Models via Preference Fine-tuning (★86, updated last year)
- https://arxiv.org/abs/2209.15162 (★50, updated 2 years ago)
- Matryoshka Multimodal Models (★110, updated 5 months ago)
- A PyTorch implementation of Multimodal Few-Shot Learning with Frozen Language Models, using OPT (★43, updated 2 years ago)
- Code for the paper "Diffusion Language Models Can Perform Many Tasks with Scaling and Instruction-Finetuning" (★80, updated last year)
- ChatBridge, an approach to learning a unified multimodal model to interpret, correlate, and reason about various modalities without rely… (★51, updated last year)
- (no description) (★50, updated last year)
- (no description) (★129, updated 2 years ago)
- Online Adaptation of Language Models with a Memory of Amortized Contexts (NeurIPS 2024) (★63, updated 10 months ago)
- Official code for "What Makes for Good Visual Tokenizers for Large Language Models?" (★58, updated last year)
- Sparkles: Unlocking Chats Across Multiple Images for Multimodal Instruction-Following Models (★44, updated last year)
- [NeurIPS 2023] Make Your Pre-trained Model Reversible: From Parameter to Memory Efficient Fine-Tuning (★31, updated 2 years ago)
- Toolkit for the ELEVATER benchmark (★72, updated last year)
- Patching open-vocabulary models by interpolating weights (★91, updated last year)
- Official code for the paper "EasyGen: Easing Multimodal Generation with a Bidirectional Conditional Diffusion Model and LLMs" (★74, updated 7 months ago)
- MultiInstruct: Improving Multi-Modal Zero-Shot Learning via Instruction Tuning (★135, updated 2 years ago)
- Compress conventional vision-language pre-training data (★51, updated last year)
- PyTorch code for the paper "An Empirical Study of Multimodal Model Merging" (★37, updated last year)
- Preference learning for LLaVA (★46, updated 7 months ago)
- Evaluation code for the paper "BLINK: Multimodal Large Language Models Can See but Not Perceive". https://arxiv.or… (★127, updated 11 months ago)
- Code for the paper "CiT: Curation in Training for Effective Vision-Language Data" (★78, updated 2 years ago)
- [NeurIPS 2023] A faithful benchmark for vision-language compositionality (★80, updated last year)
- (no description) (★75, updated 11 months ago)
- PyTorch implementation of LIMoE (★53, updated last year)
- Enhancing Large Vision Language Models with Self-Training on Image Comprehension (★68, updated last year)
- Official code release for "Diagnosing and Rectifying Vision Models using Language" (ICLR 2023) (★34, updated 2 years ago)
- (no description) (★54, updated 2 years ago)
- Official repo for Debiasing Large Visual Language Models, including a post-hoc debias method and a visual debias decoding strat… (★78, updated 4 months ago)