facebookresearch / chameleon
Repository for Meta Chameleon, a mixed-modal early-fusion foundation model from FAIR.
☆2,068 · Updated last year
Alternatives and similar repositories for chameleon
Users who are interested in chameleon are comparing it to the repositories listed below.
- Cambrian-1 is a family of multimodal LLMs with a vision-centric design. ☆1,975 · Updated last month
- Next-Token Prediction is All You Need ☆2,261 · Updated 3 weeks ago
- Autoregressive Model Beats Diffusion: 🦙 Llama for Scalable Image Generation ☆1,907 · Updated last year
- This repository provides the code and model checkpoints for AIMv1 and AIMv2 research projects. ☆1,387 · Updated 4 months ago
- Anole: An Open, Autoregressive and Native Multimodal Models for Interleaved Image-Text Generation ☆814 · Updated 5 months ago
- 4M: Massively Multimodal Masked Modeling ☆1,777 · Updated 6 months ago
- [ICCV 2025] LLaVA-CoT, a visual language model capable of spontaneous, systematic reasoning ☆2,106 · Updated last month
- 【TMM 2025🔥】 Mixture-of-Experts for Large Vision-Language Models ☆2,283 · Updated 4 months ago
- Codebase for Aria - an Open Multimodal Native MoE ☆1,085 · Updated 10 months ago
- Emu Series: Generative Multimodal Models from BAAI ☆1,760 · Updated last year
- A family of lightweight multimodal models. ☆1,048 · Updated last year
- [ICLR 2025] Samba: Simple Hybrid State Space Models for Efficient Unlimited Context Language Modeling ☆931 · Updated 3 weeks ago
- NeurIPS 2025 Spotlight; ICLR2024 Spotlight; CVPR 2024; EMNLP 2024 ☆1,770 · Updated 2 weeks ago
- Training Large Language Model to Reason in a Continuous Latent Space ☆1,382 · Updated 4 months ago
- ☆4,444 · Updated 2 months ago
- VideoSys: An easy and efficient system for video generation ☆2,010 · Updated 3 months ago
- Witness the aha moment of VLM with less than $3. ☆3,999 · Updated 6 months ago
- GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection ☆1,632 · Updated last year
- [ICLR & NeurIPS 2025] Repository for Show-o series, One Single Transformer to Unify Multimodal Understanding and Generation. ☆1,815 · Updated last month
- Pytorch implementation of Transfusion, "Predict the Next Token and Diffuse Images with One Multi-Modal Model", from MetaAI ☆1,279 · Updated last week
- A suite of image and video neural tokenizers ☆1,691 · Updated 10 months ago
- MobileLLM Optimizing Sub-billion Parameter Language Models for On-Device Use Cases. In ICML 2024. ☆1,397 · Updated 7 months ago
- Implementation of the training framework proposed in Self-Rewarding Language Model, from MetaAI ☆1,405 · Updated last year
- ☆632 · Updated last year
- Muon is Scalable for LLM Training ☆1,378 · Updated 4 months ago
- VILA is a family of state-of-the-art vision language models (VLMs) for diverse multimodal AI tasks across the edge, data center, and clou… ☆3,684 · Updated 2 weeks ago
- Code for "AnyGPT: Unified Multimodal LLM with Discrete Sequence Modeling" ☆862 · Updated last year
- 🔥🔥 LLaVA++: Extending LLaVA with Phi-3 and LLaMA-3 (LLaVA LLaMA-3, LLaVA Phi-3) ☆848 · Updated 4 months ago
- DataComp for Language Models ☆1,398 · Updated 3 months ago
- [ICML2024 (Oral)] Official PyTorch implementation of DoRA: Weight-Decomposed Low-Rank Adaptation ☆890 · Updated last year