ariG23498 / mmdpLinks
☆33Updated 5 months ago
Alternatives and similar repositories for mmdp
Users that are interested in mmdp are comparing it to the libraries listed below
Sorting:
- Minimal sharded dataset loaders, decoders, and utils for multi-modal document, image, and text datasets.☆160Updated last year
- Pixel Parsing. A reproduction of OCR-free end-to-end document understanding models with open data☆23Updated last year
- (WACV 2025 - Oral) Vision-language conversation in 10 languages including English, Chinese, French, Spanish, Russian, Japanese, Arabic, H…☆84Updated 4 months ago
- A minimal implementation of LLaVA-style VLM with interleaved image & text & video processing ability.☆97Updated last year
- ☆48Updated last year
- Official repository for the paper "SwitchHead: Accelerating Transformers with Mixture-of-Experts Attention"☆102Updated last year
- DPO, but faster 🚀☆46Updated last year
- Load any clip model with a standardized interface☆22Updated 2 months ago
- Implementation of MaMMUT, a simple vision-encoder text-decoder architecture for multimodal tasks from Google, in Pytorch☆103Updated 2 years ago
- ☆87Updated last year
- ☆59Updated last year
- ☆52Updated last year
- A dashboard for exploring timm learning rate schedulers☆19Updated last year
- Official PyTorch Implementation for Paper "No More Adam: Learning Rate Scaling at Initialization is All You Need"☆54Updated 11 months ago
- Timm model explorer☆42Updated last year
- Implementation of the general framework for AMIE, from the paper "Towards Conversational Diagnostic AI", out of Google Deepmind☆72Updated last year
- Code and pretrained models for the paper: "MatMamba: A Matryoshka State Space Model"☆62Updated last year
- ☆65Updated 2 years ago
- Official implementation of the ICML 2024 paper RoSA (Robust Adaptation)☆44Updated last year
- Implementation of 🌻 Mirasol, SOTA Multimodal Autoregressive model out of Google Deepmind, in Pytorch☆91Updated 2 years ago
- Official repository of "LiNeS: Post-training Layer Scaling Prevents Forgetting and Enhances Model Merging"☆31Updated last year
- Easily run PyTorch on multiple GPUs & machines☆54Updated 3 weeks ago
- MatFormer repo☆66Updated last year
- LL3M: Large Language and Multi-Modal Model in Jax☆74Updated last year
- Explorations into adversarial losses on top of autoregressive loss for language modeling☆40Updated this week
- This is the repo for the paper "PANGEA: A FULLY OPEN MULTILINGUAL MULTIMODAL LLM FOR 39 LANGUAGES"☆117Updated 6 months ago
- Recaption large (Web)Datasets with vllm and save the artifacts.☆52Updated last year
- Anchored Preference Optimization and Contrastive Revisions: Addressing Underspecification in Alignment☆61Updated last year
- ☆81Updated last month
- Maya: An Instruction Finetuned Multilingual Multimodal Model using Aya☆124Updated 4 months ago