haoliuhl / language-quantized-autoencoders
Language Quantized AutoEncoders
☆94, updated last year
Related projects
Alternatives and complementary repositories for language-quantized-autoencoders
- Implementation of 🌻 Mirasol, SOTA Multimodal Autoregressive model out of Google Deepmind, in Pytorch (☆88, updated 10 months ago)
- Code for paper "Diffusion Language Models Can Perform Many Tasks with Scaling and Instruction-Finetuning" (☆63, updated 9 months ago)
- Implementation of Discrete Key / Value Bottleneck, in Pytorch (☆87, updated last year)
- (☆75, updated last year)
- Implementation of Soft MoE, proposed by Brain's Vision team, in Pytorch (☆243, updated 6 months ago)
- (☆64, updated 4 months ago)
- [Arxiv] Aligning Modalities in Vision Large Language Models via Preference Fine-tuning (☆72, updated 6 months ago)
- Model Stock: All we need is just a few fine-tuned models (☆89, updated last month)
- Some personal experiments around routing tokens to different autoregressive attention, akin to mixture-of-experts (☆108, updated 3 weeks ago)
- M4 experiment logbook (☆56, updated last year)
- Implementation of Infini-Transformer in Pytorch (☆104, updated last month)
- Matryoshka Multimodal Models (☆81, updated last month)
- Implementation of Zorro, Masked Multimodal Transformer, in Pytorch (☆95, updated last year)
- DiffuGPT and DiffuLLaMA: Scaling Diffusion Language Models via Adaptation from Autoregressive Models (☆56, updated 2 weeks ago)
- PyTorch codes for the paper "An Empirical Study of Multimodal Model Merging" (☆36, updated last year)
- Patching open-vocabulary models by interpolating weights (☆90, updated last year)
- Official code for "pi-Tuning: Transferring Multimodal Foundation Models with Optimal Multi-task Interpolation", ICML 2023. (☆32, updated last year)
- VLM Evaluation: Benchmark for VLMs, spanning text generation tasks from VQA to Captioning (☆86, updated last month)
- Holistic evaluation of multimodal foundation models (☆41, updated 3 months ago)
- Mind the Gap: Understanding the Modality Gap in Multi-modal Contrastive Representation Learning (☆124, updated 2 years ago)
- Code and data for the benchmark "Multimodal Needle in a Haystack (MMNeedle): Benchmarking Long-Context Capability of Multimodal Large Lan… (☆34, updated 4 months ago)
- Official implementation of our paper "Finetuned Multimodal Language Models are High-Quality Image-Text Data Filters". (☆42, updated 2 weeks ago)
- Sparkles: Unlocking Chats Across Multiple Images for Multimodal Instruction-Following Models (☆41, updated 4 months ago)
- (☆77, updated 3 months ago)
- Enhancing Large Vision Language Models with Self-Training on Image Comprehension. (☆57, updated 5 months ago)
- Implementation of MaMMUT, a simple vision-encoder text-decoder architecture for multimodal tasks from Google, in Pytorch (☆97, updated last year)
- This repo contains evaluation code for the paper "BLINK: Multimodal Large Language Models Can See but Not Perceive". https://arxiv.or… (☆107, updated 4 months ago)
- (☆65, updated 8 months ago)
- Online Adaptation of Language Models with a Memory of Amortized Contexts (NeurIPS 2024) (☆53, updated 3 months ago)
- (☆45, updated last year)