mlfoundations / MINT-1T
MINT-1T: A one trillion token multimodal interleaved dataset.
☆788 · Updated 5 months ago
Alternatives and similar repositories for MINT-1T:
Users interested in MINT-1T are comparing it to the repositories listed below.
- MobileLLM: Optimizing Sub-billion Parameter Language Models for On-Device Use Cases. In ICML 2024. ☆1,217 · Updated last month
- Reference implementation of the Megalodon 7B model ☆512 · Updated 8 months ago
- Reaching LLaMA2 Performance with 0.1M Dollars ☆965 · Updated 5 months ago
- Anole: An Open, Autoregressive, and Native Multimodal Model for Interleaved Image-Text Generation ☆708 · Updated 5 months ago
- A family of open-sourced Mixture-of-Experts (MoE) Large Language Models ☆1,425 · Updated 10 months ago
- [ICML 2024] CLLMs: Consistency Large Language Models ☆368 · Updated 2 months ago
- TinyGPT-V: Efficient Multimodal Large Language Model via Small Backbones ☆1,256 · Updated 9 months ago
- Yi-1.5 is an upgraded version of Yi, delivering stronger performance in coding, math, reasoning, and instruction following ☆541 · Updated 2 months ago
- Official implementation of "Samba: Simple Hybrid State Space Models for Efficient Unlimited Context Language Modeling" ☆831 · Updated last month
- Implementation of the training framework proposed in Self-Rewarding Language Models, from Meta AI ☆1,358 · Updated 9 months ago
- Recipes to scale inference-time compute of open models ☆932 · Updated this week
- Open-weights language model from Google DeepMind, based on Griffin ☆614 · Updated 6 months ago
- DeepSeek-VL2: Mixture-of-Experts Vision-Language Models for Advanced Multimodal Understanding ☆797 · Updated this week
- Janus-Series: Unified Multimodal Understanding and Generation Models ☆1,327 · Updated 2 months ago
- nanoGPT-style version of Llama 3.1 ☆1,290 · Updated 5 months ago
- From-scratch implementation of a sparse mixture-of-experts language model, inspired by Andrej Karpathy's makemore :) ☆613 · Updated 2 months ago
- YaRN: Efficient Context Window Extension of Large Language Models ☆1,398 · Updated 9 months ago
- ICLR 2024 Spotlight: curation/training code, metadata, distribution, and pre-trained models for MetaCLIP; CVPR 2024: MoDE: CLIP Data Expert… ☆1,337 · Updated last month
- Official implementation of TokenFormer: Rethinking Transformer Scaling with Tokenized Model Parameters ☆477 · Updated this week
- LLaVA-Plus: Large Language and Vision Assistants that Plug and Learn to Use Skills ☆720 · Updated 11 months ago
- OLMoE: Open Mixture-of-Experts Language Models ☆531 · Updated last month
- Repository for Meta Chameleon, a mixed-modal early-fusion foundation model from FAIR ☆1,908 · Updated 5 months ago
- DataDreamer: Prompt. Generate Synthetic Data. Train & Align Models. 🤖💤 ☆895 · Updated this week
- Code and model checkpoints for the AIMv1 and AIMv2 research projects ☆1,143 · Updated last month
- Visualize the intermediate output of Mistral 7B ☆333 · Updated 11 months ago
- DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models ☆954 · Updated 9 months ago
- Memory optimization and training recipes to extrapolate language models' context length to 1 million tokens with minimal hardware ☆687 · Updated 3 months ago
- Inference code for Persimmon-8B ☆416 · Updated last year
- Large Reasoning Models ☆787 · Updated last month