ariG23498 / mmdp
☆29Updated 2 months ago
Alternatives and similar repositories for mmdp
Users that are interested in mmdp are comparing it to the libraries listed below
- A minimal implementation of LLaVA-style VLM with interleaved image & text & video processing ability.☆96Updated 9 months ago
- Pixel Parsing. A reproduction of OCR-free end-to-end document understanding models with open data☆21Updated last year
- ☆59Updated last year
- Minimal sharded dataset loaders, decoders, and utils for multi-modal document, image, and text datasets.☆159Updated last year
- Timm model explorer☆42Updated last year
- (WACV 2025 - Oral) Vision-language conversation in 10 languages including English, Chinese, French, Spanish, Russian, Japanese, Arabic, H…☆83Updated 2 months ago
- A dashboard for exploring timm learning rate schedulers☆19Updated 10 months ago
- MatFormer repo☆62Updated 9 months ago
- Code and pretrained models for the paper: "MatMamba: A Matryoshka State Space Model"☆61Updated 10 months ago
- Visualize multi-model embedding spaces. The first goal is to quickly get a lay of the land of any embedding space. Then be able to scroll…☆28Updated last year
- Notebook and Scripts that showcase running quantized diffusion models on consumer GPUs☆38Updated 11 months ago
- Memory-Efficient CUDA kernels for training ConvNets with PyTorch.☆42Updated 7 months ago
- [NeurIPS 2025] Elevating Visual Perception in Multimodal LLMs with Auxiliary Embedding Distillation, arXiv 2024☆62Updated 2 weeks ago
- Easily run PyTorch on multiple GPUs & machines☆47Updated 3 months ago
- Mixture-of-Transformers: A Sparse and Scalable Architecture for Multi-Modal Foundation Models. TMLR 2025.☆107Updated 3 weeks ago
- Implementation of 🌻 Mirasol, SOTA Multimodal Autoregressive model out of Google Deepmind, in Pytorch☆90Updated last year
- ICLR 2025 - official implementation for "I-Con: A Unifying Framework for Representation Learning"☆111Updated 3 months ago
- Implementation of the general framework for AMIE, from the paper "Towards Conversational Diagnostic AI", out of Google Deepmind☆69Updated last year
- Notebooks to demonstrate TimmWrapper☆16Updated 8 months ago
- Load any clip model with a standardized interface☆22Updated 2 weeks ago
- Recaption large (Web)Datasets with vllm and save the artifacts.☆52Updated 10 months ago
- Collection of autoregressive model implementations☆86Updated 5 months ago
- Implementation of Infini-Transformer in Pytorch☆113Updated 9 months ago
- ☆77Updated last month
- Official PyTorch Implementation for Paper "No More Adam: Learning Rate Scaling at Initialization is All You Need"☆53Updated 8 months ago
- Video-LLaVA fine-tune for CinePile evaluation☆51Updated last year
- Official repository for the paper "SwitchHead: Accelerating Transformers with Mixture-of-Experts Attention"☆99Updated last year
- Notebooks for fine-tuning PaliGemma☆117Updated 5 months ago
- ☆87Updated last year
- ☆79Updated 11 months ago