DLYuanGod / TinyGPT-V
TinyGPT-V: Efficient Multimodal Large Language Model via Small Backbones
☆1,246 Updated 6 months ago
Related projects
Alternatives and complementary repositories for TinyGPT-V
- Implementation of the training framework proposed in Self-Rewarding Language Model, from MetaAI ☆1,333 Updated 6 months ago
- MobiLlama: Small Language Model tailored for edge devices ☆593 Updated 8 months ago
- Mixture-of-Experts for Large Vision-Language Models ☆1,971 Updated 5 months ago
- Reaching LLaMA2 Performance with 0.1M Dollars ☆960 Updated 3 months ago
- 【EMNLP 2024🔥】Video-LLaVA: Learning United Visual Representation by Alignment Before Projection ☆2,966 Updated last month
- An Open-source Toolkit for LLM Development ☆2,717 Updated 5 months ago
- MINT-1T: A one trillion token multimodal interleaved dataset. ☆770 Updated 3 months ago
- A family of open-sourced Mixture-of-Experts (MoE) Large Language Models ☆1,385 Updated 8 months ago
- From-scratch implementation of a sparse mixture of experts language model inspired by Andrej Karpathy's makemore :) ☆592 Updated last week
- Emu Series: Generative Multimodal Models from BAAI ☆1,658 Updated last month
- 🔥🔥 LLaVA++: Extending LLaVA with Phi-3 and LLaMA-3 (LLaVA LLaMA-3, LLaVA Phi-3) ☆807 Updated 3 months ago
- ICLR2024 Spotlight: curation/training code, metadata, distribution and pre-trained models for MetaCLIP; CVPR 2024: MoDE: CLIP Data Expert… ☆1,245 Updated 2 weeks ago
- YaRN: Efficient Context Window Extension of Large Language Models ☆1,337 Updated 6 months ago
- ☆699 Updated 8 months ago
- Official implementation of paper "MiniGPT-5: Interleaved Vision-and-Language Generation via Generative Vokens" ☆852 Updated 7 months ago
- Repository for Meta Chameleon, a mixed-modal early-fusion foundation model from FAIR. ☆1,823 Updated 3 months ago
- MobileLLM: Optimizing Sub-billion Parameter Language Models for On-Device Use Cases. In ICML 2024. ☆1,117 Updated this week
- VILA - a multi-image visual language model with training, inference and evaluation recipe, deployable from cloud to edge (Jetson Orin and… ☆1,968 Updated last week
- Code and documents of LongLoRA and LongAlpaca (ICLR 2024 Oral) ☆2,625 Updated 2 months ago
- Janus: Decoupling Visual Encoding for Unified Multimodal Understanding and Generation ☆913 Updated last week
- ☆2,815 Updated 3 weeks ago
- DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models ☆824 Updated 6 months ago
- Strong and Open Vision Language Assistant for Mobile Devices ☆1,032 Updated 6 months ago
- Next-Token Prediction is All You Need ☆1,786 Updated 2 weeks ago
- This repository provides the code and model checkpoints of the research paper: Scalable Pre-training of Large Autoregressive Image Model… ☆696 Updated 6 months ago
- 4M: Massively Multimodal Masked Modeling ☆1,600 Updated last month
- DeepSeekMoE: Towards Ultimate Expert Specialization in Mixture-of-Experts Language Models ☆1,000 Updated 9 months ago
- PyTorch Implementation of "V*: Guided Visual Search as a Core Mechanism in Multimodal LLMs" ☆523 Updated 10 months ago
- The official implementation of Self-Play Fine-Tuning (SPIN) ☆1,034 Updated 6 months ago