kyegomez / qformer
Implementation of Qformer from BLIP2 in Zeta Lego blocks.
☆36 Updated 4 months ago
Alternatives and similar repositories for qformer:
Users who are interested in qformer are comparing it to the libraries listed below.
- LMM solved catastrophic forgetting, AAAI 2025 ☆40 Updated 4 months ago
- PyTorch implementation of the model from "MIRASOL3B: A MULTIMODAL AUTOREGRESSIVE MODEL FOR TIME-ALIGNED AND CONTEXTUAL MODALITIES" ☆26 Updated 2 months ago
- Keras implementation of Finite Scalar Quantization ☆71 Updated last year
- LLaVA combined with the Magvit image tokenizer, training an MLLM without a Vision Encoder. Unifying image understanding and generation. ☆35 Updated 9 months ago
- OpenOmni: Official implementation of Advancing Open-Source Omnimodal Large Language Models with Progressive Multimodal Alignment and Rea… ☆40 Updated 2 weeks ago
- UnifiedMLLM: Enabling Unified Representation for Multi-modal Multi-tasks With Large Language Model ☆21 Updated 7 months ago
- Triton implementation of bi-directional (non-causal) linear attention ☆44 Updated last month
- MIO: A Foundation Model on Multimodal Tokens ☆22 Updated 3 months ago
- Open-Pandora: On-the-fly Control Video Generation ☆32 Updated 4 months ago
- This is the official implementation of "DAPE: Data-Adaptive Positional Encoding for Length Extrapolation" ☆37 Updated 5 months ago
- The official code for paper "EasyGen: Easing Multimodal Generation with a Bidirectional Conditional Diffusion Model and LLMs" ☆73 Updated 4 months ago
- [ICLR 2023] Official implementation of Transnormer in our ICLR 2023 paper - Toeplitz Neural Network for Sequence Modeling ☆79 Updated 11 months ago
- Official PyTorch implementation of EMOVA in CVPR 2025 (https://arxiv.org/abs/2409.18042) ☆23 Updated 2 weeks ago
- Official code implementation for the work Preference Alignment with Flow Matching (NeurIPS 2024) ☆45 Updated 4 months ago
- ☆17 Updated 2 months ago
- [EMNLP 2022] Official implementation of Transnormer in our EMNLP 2022 paper - The Devil in Linear Transformer ☆60 Updated last year
- PyTorch implementation of StableMask (ICML'24) ☆12 Updated 9 months ago
- Official implementation of our paper "Finetuned Multimodal Language Models are High-Quality Image-Text Data Filters". ☆44 Updated 3 months ago
- ☆23 Updated 5 months ago
- [MM2024, oral] "Self-Supervised Visual Preference Alignment" https://arxiv.org/abs/2404.10501 ☆55 Updated 8 months ago
- ☆30 Updated 10 months ago
- ☆37 Updated last week
- [TMLR] Public code repo for paper "A Single Transformer for Scalable Vision-Language Modeling" ☆131 Updated 4 months ago
- The official implementation of MAGVLT: Masked Generative Vision-and-Language Transformer (CVPR'23) ☆26 Updated last year
- ☆49 Updated last year
- ☆11 Updated 3 months ago
- Official code for "BoostStep: Boosting mathematical capability of Large Language Models via improved single-step reasoning" ☆33 Updated 2 months ago
- ☆73 Updated last year
- Enable Next-sentence Prediction for Large Language Models with Faster Speed, Higher Accuracy and Longer Context ☆28 Updated 7 months ago
- PyTorch implementation of the paper: "Learning to (Learn at Test Time): RNNs with Expressive Hidden States" ☆24 Updated last week