apple / ml-4m
4M: Massively Multimodal Masked Modeling
☆1,666Updated 3 months ago
Alternatives and similar repositories for ml-4m:
Users who are interested in ml-4m are comparing it to the libraries listed below.
- This repository provides the code and model checkpoints for AIMv1 and AIMv2 research projects.☆1,143Updated last month
- Repository for Meta Chameleon, a mixed-modal early-fusion foundation model from FAIR.☆1,908Updated 5 months ago
- This repository contains the official implementation of the research paper, "MobileCLIP: Fast Image-Text Models through Multi-Modal Reinf…☆792Updated last month
- Schedule-Free Optimization in PyTorch☆2,061Updated last month
- Recipes for shrinking, optimizing, customizing cutting edge vision models. 💜☆1,093Updated 3 weeks ago
- Official codebase used to develop Vision Transformer, SigLIP, MLP-Mixer, LiT and more.☆2,516Updated 3 weeks ago
- PyTorch code and models for V-JEPA self-supervised learning from video.☆2,745Updated 5 months ago
- Official implementation of "Samba: Simple Hybrid State Space Models for Efficient Unlimited Context Language Modeling"☆831Updated last month
- ICLR2024 Spotlight: curation/training code, metadata, distribution and pre-trained models for MetaCLIP; CVPR 2024: MoDE: CLIP Data Expert…☆1,337Updated last month
- Streamline the fine-tuning process for multimodal models: PaliGemma, Florence-2, and Qwen2-VL☆1,427Updated this week
- VILA is a family of state-of-the-art vision language models (VLMs) for diverse multimodal AI tasks across the edge, data center, and clou…☆2,752Updated last week
- Cambrian-1 is a family of multimodal LLMs with a vision-centric design.☆1,823Updated 2 months ago
- ☆3,272Updated 3 months ago
- A suite of image and video neural tokenizers☆1,478Updated this week
- Janus-Series: Unified Multimodal Understanding and Generation Models☆1,327Updated 2 months ago
- LLaVA-CoT, a visual language model capable of spontaneous, systematic reasoning☆1,739Updated last week
- 👁️ + 💬 + 🎧 = 🤖 Curated list of top foundation and multimodal models! [Paper + Code + Examples + Tutorials]☆596Updated 10 months ago
- MobileLLM: Optimizing Sub-billion Parameter Language Models for On-Device Use Cases. In ICML 2024.☆1,217Updated last month
- Official code for "FeatUp: A Model-Agnostic Framework for Features at Any Resolution" ICLR 2024☆1,431Updated 6 months ago
- Code for BLT research paper☆1,314Updated this week
- PyTorch native quantization and sparsity for training and inference☆1,753Updated this week
- A PyTorch native library for large model training☆3,091Updated this week
- Official repository for "AM-RADIO: Reduce All Domains Into One"☆892Updated this week
- Grounding DINO 1.5: IDEA Research's Most Capable Open-World Object Detection Model Series☆853Updated 5 months ago
- TinyGPT-V: Efficient Multimodal Large Language Model via Small Backbones☆1,256Updated 9 months ago
- Next-Token Prediction is All You Need☆1,965Updated 2 months ago
- Large Concept Models: Language modeling in a sentence representation space☆1,713Updated this week
- MINT-1T: A one trillion token multimodal interleaved dataset.☆788Updated 5 months ago
- Hiera: A fast, powerful, and simple hierarchical vision transformer.☆940Updated 10 months ago