Sleepychord / cogdata
A light-weight data management system for large-scale pretraining
☆20Updated 2 weeks ago
Alternatives and similar repositories for cogdata
Users who are interested in cogdata are comparing it to the repositories listed below
- ☆11Updated 10 months ago
- Touchstone: Evaluating Vision-Language Models by Language Models☆83Updated last year
- Official github repo of G-LLaVA☆141Updated 3 months ago
- Converting Mixtral-8x7B to Mixtral-[1~7]x7B☆22Updated last year
- A Framework for Decoupling and Assessing the Capabilities of VLMs☆43Updated 11 months ago
- The official code for paper "EasyGen: Easing Multimodal Generation with a Bidirectional Conditional Diffusion Model and LLMs"☆73Updated 6 months ago
- ☆147Updated 7 months ago
- ☆148Updated 4 months ago
- [CVPR 2024] CapsFusion: Rethinking Image-Text Data at Scale☆208Updated last year
- ☆133Updated last year
- MM-Interleaved: Interleaved Image-Text Generative Modeling via Multi-modal Feature Synchronizer☆225Updated last year
- ☆99Updated last year
- [ArXiv] V2PE: Improving Multimodal Long-Context Capability of Vision-Language Models with Variable Visual Position Encoding☆47Updated 5 months ago
- [2024-ACL]: TextBind: Multi-turn Interleaved Multimodal Instruction-following in the Wild☆46Updated last year
- OFA-Compress is a unified framework which provides OFA model finetuning, distillation and inference capabilities in Huggingface version, …☆28Updated 2 years ago
- VideoNIAH: A Flexible Synthetic Method for Benchmarking Video MLLMs☆47Updated 2 months ago
- Doodling our way to AGI ✏️ 🖼️ 🧠☆50Updated last week
- TaiSu (太素): a large-scale Chinese multimodal dataset (a hundred-million-scale Chinese vision-language pre-training dataset)☆183Updated last year
- Source code for EMNLP2022 long paper: Parameter-Efficient Tuning Makes a Good Classification Head☆14Updated 2 years ago
- Official repository of MMDU dataset☆91Updated 8 months ago
- LVBench: An Extreme Long Video Understanding Benchmark☆90Updated 9 months ago
- ☆87Updated 11 months ago
- ☆111Updated last week
- Official Implementation of ICLR'24: Kosmos-G: Generating Images in Context with Multimodal Large Language Models☆71Updated last year
- Lion: Kindling Vision Intelligence within Large Language Models☆52Updated last year
- Official repo for StableLLAVA☆95Updated last year
- [NeurIPS 2023 Datasets and Benchmarks] "FETV: A Benchmark for Fine-Grained Evaluation of Open-Domain Text-to-Video Generation", Yuanxin L…☆53Updated last year
- ☆51Updated last year
- Official code for "What Makes for Good Visual Tokenizers for Large Language Models?".☆58Updated last year
- VL-GPT: A Generative Pre-trained Transformer for Vision and Language Understanding and Generation☆86Updated 8 months ago