kyegomez / qformer
Implementation of Qformer from BLIP2 in Zeta Lego blocks.
☆31Updated last week
Related projects
Alternatives and complementary repositories for qformer
- Pytorch Implementation of the Model from "MIRASOL3B: A MULTIMODAL AUTOREGRESSIVE MODEL FOR TIME-ALIGNED AND CONTEXTUAL MODALITIES"☆24Updated last week
- The official implementation of "DAPE: Data-Adaptive Positional Encoding for Length Extrapolation"☆32Updated last month
- Official implementation for the paper "A Cheaper and Better Diffusion Language Model with Soft-Masked Noise"☆52Updated last year
- UnifiedMLLM: Enabling Unified Representation for Multi-modal Multi-tasks With Large Language Model☆17Updated 3 months ago
- Pytorch Implementation of the paper: "Learning to (Learn at Test Time): RNNs with Expressive Hidden States"☆23Updated last week
- Keras implementation of Finite Scalar Quantization☆63Updated last year
- The official implementation of MAGVLT: Masked Generative Vision-and-Language Transformer (CVPR'23)☆26Updated 10 months ago
- ☆35Updated 5 months ago
- ☆25Updated last week
- Source code for the EMNLP 2022 long paper: Parameter-Efficient Tuning Makes a Good Classification Head☆13Updated 2 years ago
- LAVIS - A One-stop Library for Language-Vision Intelligence☆47Updated 3 months ago
- The open source implementation of the cross attention mechanism from the paper: "JOINTLY TRAINING LARGE AUTOREGRESSIVE MULTIMODAL MODELS"☆22Updated 8 months ago
- The official code for the paper "EasyGen: Easing Multimodal Generation with a Bidirectional Conditional Diffusion Model and LLMs"☆73Updated 9 months ago
- Code for the ICML 2024 paper: "Video-of-Thought: Step-by-Step Video Reasoning from Perception to Cognition"☆40Updated 4 months ago
- PyTorch implementation of StableMask (ICML'24)☆12Updated 4 months ago
- Official implementation of our paper "Finetuned Multimodal Language Models are High-Quality Image-Text Data Filters".☆42Updated 3 weeks ago
- [ICLR 2023] Official implementation of Transnormer in our ICLR 2023 paper - Toeplitz Neural Network for Sequence Modeling☆74Updated 6 months ago
- A project for tri-modal LLM benchmarking and instruction tuning.☆14Updated 2 weeks ago
- Open source community's implementation of the model from "LANGUAGE MODEL BEATS DIFFUSION — TOKENIZER IS KEY TO VISUAL GENERATION"☆15Updated last week
- An LMM that is a strict superset of its embedded LLM☆30Updated 2 weeks ago
- Data-Efficient Multimodal Fusion on a Single GPU☆47Updated 6 months ago
- [NeurIPS 2023] Make Your Pre-trained Model Reversible: From Parameter to Memory Efficient Fine-Tuning☆29Updated last year
- DiffuGPT and DiffuLLaMA: Scaling Diffusion Language Models via Adaptation from Autoregressive Models☆58Updated 3 weeks ago
- Implementation of "PaLM2-VAdapter:" from the multi-modal model paper: "PaLM2-VAdapter: Progressively Aligned Language Model Makes a Stron…☆16Updated last week
- MLLM-Bench: Evaluating Multimodal LLMs with Per-sample Criteria☆55Updated last month
- ☆22Updated 3 months ago
- Source code for paper "A Spark of Vision-Language Intelligence: 2-Dimensional Autoregressive Transformer for Efficient Finegrained Image …☆55Updated last month
- ☆45Updated last year
- [IJCAI'23] The official Github page of the paper "Diffusion Models for Non-autoregressive Text Generation: A Survey".☆48Updated 5 months ago
- [ACL 2024] TextBind: Multi-turn Interleaved Multimodal Instruction-following in the Wild☆48Updated last year