kyegomez / Kosmos2.5
My implementation of Kosmos2.5 from the paper: "KOSMOS-2.5: A Multimodal Literate Model"
☆74 · Feb 6, 2026 · Updated last week
Alternatives and similar repositories for Kosmos2.5
Users that are interested in Kosmos2.5 are comparing it to the libraries listed below
- My implementation of the model KosmosG from "KOSMOS-G: Generating Images in Context with Multimodal Large Language Models" ☆14 · Nov 11, 2024 · Updated last year
- My personal implementation of the model from "Qwen-VL: A Frontier Large Vision-Language Model with Versatile Abilities", they haven't rel… ☆12 · Jan 29, 2024 · Updated 2 years ago
- Implementation of the model from "Faster sorting algorithms discovered using deep reinforcement learning" that discovered an all-new ult… ☆11 · Aug 29, 2023 · Updated 2 years ago
- a suite of finetuned LLMs for atomically precise function calling 🧪 ☆17 · Feb 6, 2026 · Updated last week
- ☆12 · Jun 20, 2023 · Updated 2 years ago
- TAT-DQA: Towards Complex Document Understanding By Discrete Reasoning ☆23 · Sep 17, 2024 · Updated last year
- 32 times longer context window than vanilla Transformers and up to 4 times longer than memory efficient Transformers. ☆50 · Jun 16, 2023 · Updated 2 years ago
- Dataset for EMNLP'23 Paper "DocTrack: A Visually-Rich Document Dataset Really Aligned with Human Eye Movement for Machine Reading" ☆11 · Oct 25, 2023 · Updated 2 years ago
- recipe for training fully-featured self supervised image jepa models ☆12 · Jun 4, 2025 · Updated 8 months ago
- Tool to parse wiki tables from the HTML dump of Wikipedia ☆11 · Jun 12, 2022 · Updated 3 years ago
- Algorithms, papers, datasets, performance comparisons for Document AI. Continuously updating. ☆203 · Mar 1, 2025 · Updated 11 months ago
- ☆51 · May 28, 2024 · Updated last year
- LaTeXDataHub is an open-source platform dedicated to the sharing and contribution of real-world LaTeX image datasets and their annotation… ☆12 · Aug 13, 2024 · Updated last year
- Open source community's implementation of the model from "LANGUAGE MODEL BEATS DIFFUSION — TOKENIZER IS KEY TO VISUAL GENERATION" ☆15 · Nov 11, 2024 · Updated last year
- PegasusX: The Future of Multimodal Embeddings 🦄 🦄 ☆14 · Oct 16, 2024 · Updated last year
- This repository includes the code to download the curated HuggingFace papers into a single markdown formatted file ☆16 · Jul 26, 2024 · Updated last year
- Simple Implementation of TinyGPTV in super simple Zeta lego blocks ☆16 · Nov 11, 2024 · Updated last year
- ☆14 · Mar 28, 2024 · Updated last year
- The open source implementation of "NeVA: NeMo Vision and Language Assistant" ☆17 · Aug 26, 2023 · Updated 2 years ago
- Code/Data for the paper: "LLaVAR: Enhanced Visual Instruction Tuning for Text-Rich Image Understanding" ☆269 · Jun 12, 2024 · Updated last year
- ☆17 · Jun 12, 2024 · Updated last year
- ☆14 · Jan 9, 2026 · Updated last month
- Multi-Modal Multi-Embodied Hivemind-like Iteration of RTX-2 ☆15 · Jun 27, 2025 · Updated 7 months ago
- ☆57 · Jan 23, 2024 · Updated 2 years ago
- Multi-Modal Tree of thoughts for DALLE-3 like auto self improvement ☆17 · Nov 11, 2024 · Updated last year
- The open source implementation of "Connecting Large Language Models with Evolutionary Algorithms Yields Powerful Prompt Optimizers" ☆19 · Mar 11, 2024 · Updated last year
- Democratization of "PaLI: A Jointly-Scaled Multilingual Language-Image Model" ☆93 · Mar 20, 2024 · Updated last year
- MM-Interleaved: Interleaved Image-Text Generative Modeling via Multi-modal Feature Synchronizer ☆248 · Apr 3, 2024 · Updated last year
- An implementation of the base GPT-3 Model architecture from the paper by OPENAI "Language Models are Few-Shot Learners" ☆20 · Jun 29, 2024 · Updated last year
- 💥 Make peer-2-peer global works ☆46 · Jan 29, 2026 · Updated 2 weeks ago
- ☆156 · May 8, 2025 · Updated 9 months ago
- An EXA-Scale repository of Multi-Modality AI resources from papers and models, to foundational libraries! ☆40 · Feb 1, 2024 · Updated 2 years ago
- Simple Autogpt with tree of thoughts ☆14 · May 25, 2023 · Updated 2 years ago
- CDLA: A Chinese document layout analysis (CDLA) dataset ☆288 · Sep 13, 2021 · Updated 4 years ago
- A system for unsupervised knowledge-free interpretable word sense disambiguation based on distributional semantics ☆19 · Mar 25, 2018 · Updated 7 years ago
- An open-source Notion-style WYSIWYG editor with AI-powered autocompletions. ☆24 · Jul 13, 2023 · Updated 2 years ago
- Fast and memory-efficient exact attention ☆29 · Dec 2, 2024 · Updated last year
- Implementation of "LM-Infinite: Simple On-the-Fly Length Generalization for Large Language Models" ☆40 · Nov 11, 2024 · Updated last year
- The open source implementation of "AnyMAL: An Efficient and Scalable Any-Modality Augmented Language Model" ☆22 · Jan 27, 2025 · Updated last year