kyegomez / qformer
Implementation of Qformer from BLIP2 in Zeta Lego blocks.
☆42Updated 9 months ago
Alternatives and similar repositories for qformer
Users that are interested in qformer are comparing it to the libraries listed below
- Keras implementation of Finite Scalar Quantization☆79Updated last year
- PyTorch implementation of the model from "MIRASOL3B: A MULTIMODAL AUTOREGRESSIVE MODEL FOR TIME-ALIGNED AND CONTEXTUAL MODALITIES"☆26Updated 6 months ago
- UnifiedMLLM: Enabling Unified Representation for Multi-modal Multi-tasks With Large Language Model☆22Updated last year
- An LMM that solves catastrophic forgetting (AAAI 2025)☆44Updated 3 months ago
- A Framework for Decoupling and Assessing the Capabilities of VLMs☆44Updated last year
- PyTorch implementation of StableMask (ICML'24)☆13Updated last year
- A repository for DenseSSMs☆88Updated last year
- Official implementation for the paper "A Cheaper and Better Diffusion Language Model with Soft-Masked Noise"☆58Updated last year
- This is the official implementation of "DAPE: Data-Adaptive Positional Encoding for Length Extrapolation"☆39Updated 10 months ago
- A project for tri-modal LLM benchmarking and instruction tuning.☆42Updated 4 months ago
- Triton implementation of bi-directional (non-causal) linear attention☆51Updated 6 months ago
- ☆50Updated last year
- PyTorch implementation of the paper "Learning to (Learn at Test Time): RNNs with Expressive Hidden States"☆25Updated 3 weeks ago
- [ICLR 2023] Official implementation of Transnormer in our ICLR 2023 paper - Toeplitz Neural Network for Sequence Modeling☆79Updated last year
- LLaVA combined with the Magvit image tokenizer, training an MLLM without a Vision Encoder. Unifying image understanding and generation.☆37Updated last year
- Official implementation of "DoRA: Weight-Decomposed Low-Rank Adaptation"☆124Updated last year
- ☆44Updated 2 months ago
- The official code for the paper "EasyGen: Easing Multimodal Generation with a Bidirectional Conditional Diffusion Model and LLMs"☆74Updated 8 months ago
- Code for the paper "Patch-Level Training for Large Language Models"☆86Updated 8 months ago
- Implementation of a multimodal diffusion transformer in PyTorch☆102Updated last year
- MLLM-Bench: Evaluating Multimodal LLMs with Per-sample Criteria☆70Updated 9 months ago
- ☆132Updated last year
- [EMNLP 2022] Official implementation of Transnormer in our EMNLP 2022 paper - The Devil in Linear Transformer☆61Updated 2 years ago
- [ICCV'25] Explore the Limits of Omni-modal Pretraining at Scale☆114Updated 11 months ago
- imagetokenizer is a Python package that helps you encode visuals and generate visual token IDs from a codebook; supports both image and video…☆35Updated last year
- The official implementation of the paper "Reducing Fine-Tuning Memory Overhead by Approximate and Memory-Sharing Backpropagation"☆20Updated 8 months ago
- ☆33Updated 2 months ago
- LongLLaVA: Scaling Multi-modal LLMs to 1000 Images Efficiently via Hybrid Architecture☆209Updated 7 months ago
- [ICLR 2025] CREMA: Generalizable and Efficient Video-Language Reasoning via Multimodal Modular Fusion☆48Updated last month
- Official PyTorch implementation of EMOVA in CVPR 2025 (https://arxiv.org/abs/2409.18042)☆61Updated 4 months ago