mlfoundations / MINT-1T
MINT-1T: A one trillion token multimodal interleaved dataset.
☆805 · Updated 7 months ago
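MINT-1T's defining property is that images and text are interleaved in their original document order rather than stored as isolated image-caption pairs. A minimal sketch of how such an interleaved record can be flattened into one ordered sequence; the field names and parallel-list layout used here are illustrative assumptions (a convention common to interleaved datasets like MultimodalC4/OBELICS), not MINT-1T's actual schema:

```python
def interleave(texts, images):
    """Merge parallel text/image lists into one document-ordered sequence.

    Each position holds either a text span or an image reference; the
    other list holds None at that position. This parallel-list layout is
    an assumed convention for illustration, not MINT-1T's real format.
    """
    sequence = []
    for text, image in zip(texts, images):
        if text is not None:
            sequence.append(("text", text))
        if image is not None:
            sequence.append(("image", image))
    return sequence

# Hypothetical interleaved record: one image sits between two text spans.
doc = {
    "texts": ["A photo of a cat.", None, "And a dog."],
    "images": [None, "cat.jpg", None],
}
print(interleave(doc["texts"], doc["images"]))
# → [('text', 'A photo of a cat.'), ('image', 'cat.jpg'), ('text', 'And a dog.')]
```

Preserving this ordering is what lets models trained on interleaved data learn cross-modal context (e.g., text referring back to an earlier image) rather than only paired alignments.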
Alternatives and similar repositories for MINT-1T:
Users interested in MINT-1T are comparing it to the repositories listed below.
- Understanding R1-Zero-Like Training: A Critical Perspective ☆568 · Updated this week
- Yi-1.5 is an upgraded version of Yi, delivering stronger performance in coding, math, reasoning, and instruction-following capability. ☆550 · Updated 4 months ago
- TinyGPT-V: Efficient Multimodal Large Language Model via Small Backbones ☆1,269 · Updated 11 months ago
- LLaVA-Plus: Large Language and Vision Assistants that Plug and Learn to Use Skills ☆733 · Updated last year
- MobileLLM: Optimizing Sub-billion Parameter Language Models for On-Device Use Cases (ICML 2024). ☆1,269 · Updated last month
- Muon is Scalable for LLM Training ☆974 · Updated 3 weeks ago
- From-scratch implementation of a sparse mixture-of-experts language model, inspired by Andrej Karpathy's makemore :) ☆682 · Updated 4 months ago
- Anole: An Open, Autoregressive, and Native Multimodal Model for Interleaved Image-Text Generation ☆733 · Updated 7 months ago
- Open-weights language model from Google DeepMind, based on Griffin. ☆628 · Updated last month
- Implementation of the training framework proposed in Self-Rewarding Language Model, from Meta AI ☆1,374 · Updated 11 months ago
- HPT: Open Multimodal LLMs from HyperGAI ☆314 · Updated 9 months ago
- LIMO: Less Is More for Reasoning ☆864 · Updated last month
- [ICML 2024] CLLMs: Consistency Large Language Models ☆388 · Updated 4 months ago
- ICLR 2024 Spotlight: curation/training code, metadata, distribution, and pre-trained models for MetaCLIP; CVPR 2024: MoDE: CLIP Data Expert… ☆1,375 · Updated 2 weeks ago
- LLaVA-UHD v2: an MLLM integrating a High-Resolution Semantic Pyramid via a Hierarchical Window Transformer ☆370 · Updated last week
- A family of open-source Mixture-of-Experts (MoE) Large Language Models ☆1,484 · Updated last year
- MultimodalC4 is a multimodal extension of C4 that interleaves millions of images with text. ☆921 · Updated last week
- nanoGPT-style version of Llama 3.1 ☆1,346 · Updated 7 months ago
- OLMoE: Open Mixture-of-Experts Language Models ☆693 · Updated last week
- Official implementation of "Samba: Simple Hybrid State Space Models for Efficient Unlimited Context Language Modeling" ☆857 · Updated last month
- Reference implementation of the Megalodon 7B model ☆516 · Updated 11 months ago
- Minimalistic large language model 3D-parallelism training ☆1,715 · Updated this week
- Memory optimizations and training recipes to extrapolate language models' context length to 1 million tokens, with minimal hardware. ☆706 · Updated 6 months ago
- Visualize the intermediate output of Mistral 7B ☆345 · Updated 2 months ago
- Serving multiple LoRA-finetuned LLMs as one ☆1,042 · Updated 10 months ago
- [NeurIPS'24 Spotlight, ICLR'25] To speed up long-context LLMs' inference, approximate and dynamically sparse attention computation, which r… ☆945 · Updated last month
- PyTorch implementation of "V*: Guided Visual Search as a Core Mechanism in Multimodal LLMs" ☆575 · Updated last year
- DataComp for Language Models ☆1,267 · Updated last week
- [ICML'24 Spotlight] LLM Maybe LongLM: Self-Extend LLM Context Window Without Tuning ☆646 · Updated 9 months ago
- Recipes to scale inference-time compute of open models ☆1,044 · Updated last month