kyegomez/MultiModalCrossAttn

The open source implementation of the cross attention mechanism from the paper: "JOINTLY TRAINING LARGE AUTOREGRESSIVE MULTIMODAL MODELS"

☆37

Alternatives and similar repositories for MultiModalCrossAttn

Users interested in MultiModalCrossAttn are comparing it to the repositories listed below.

kyegomez / KosmosG
My implementation of the model KosmosG from "KOSMOS-G: Generating Images in Context with Multimodal Large Language Models"
☆13 · Nov 11, 2024 · Updated last year
kyegomez / AlphaDev
Implementation of the model from "Faster sorting algorithms discovered using deep reinforcement learning" that discovered an all-new ult…
☆11 · Aug 29, 2023 · Updated 2 years ago
kyegomez / HSSS
Implementation of a Hierarchical Mamba as described in the paper: "Hierarchical State Space Models for Continuous Sequence-to-Sequence Mo…
☆16 · Nov 11, 2024 · Updated last year
zkashef / ECE535-FederatedLearning
Multimodal Federated Learning on IoT Data
☆11 · Dec 17, 2023 · Updated 2 years ago
kyegomez / Pegasus
PegasusX: The Future of Multimodal Embeddings 🦄 🦄
☆14 · Oct 16, 2024 · Updated last year
kyegomez / Falcon
A simple package for leveraging Falcon 180B and the HF ecosystem's tools, including training/inference scripts, safetensors, integrations…
☆12 · Mar 11, 2024 · Updated 2 years ago
kyegomez / Hedgehog
Implementation of the model "Hedgehog" from the paper: "The Hedgehog & the Porcupine: Expressive Linear Attentions with Softmax Mimicry"
☆16 · Mar 11, 2024 · Updated 2 years ago
kyegomez / SimplifiedTransformers
SimplifiedTransformer simplifies transformer block without affecting training. Skip connections, projection parameters, sequential sub-bl…
☆15 · Updated this week
kyegomez / Qwen-VL
My personal implementation of the model from "Qwen-VL: A Frontier Large Vision-Language Model with Versatile Abilities", they haven't rel…
☆13 · Jan 29, 2024 · Updated 2 years ago
ORI-Muchim / AudioSR-Upsampling
AudioSR-Upsampling (any -> 48kHz)
☆42 · Feb 13, 2024 · Updated 2 years ago
kyegomez / COT-SC
Plug in and Play Prompt Technique to Boost Model reasoning by 40%
☆12 · May 30, 2023 · Updated 3 years ago
The-Swarm-Corporation / swarms-core
Multi-threading, Concurrency, Asynchrony, and various Execution Methods implemented in a Rust backend for bleeding edge performance.
☆20 · Nov 11, 2024 · Updated last year
kyegomez / MGQA
The open source implementation of the multi grouped query attention from the paper "GQA: Training Generalized Multi-Query Transformer Model…
☆17 · Dec 11, 2023 · Updated 2 years ago
kyegomez / MAGVIT2
Open source community's implementation of the model from "LANGUAGE MODEL BEATS DIFFUSION — TOKENIZER IS KEY TO VISUAL GENERATION"
☆15 · Nov 11, 2024 · Updated last year
kyegomez / NeVA
The open source implementation of "NeVA: NeMo Vision and Language Assistant"
☆17 · Aug 26, 2023 · Updated 2 years ago
kyegomez / Tiktokx
Tiktok is an advanced multimedia recommender system that fuses the generative modality-aware collaborative self-augmentation and contrast…
☆14 · Aug 18, 2023 · Updated 2 years ago
kyegomez / EAOT
The open source implementation of "Connecting Large Language Models with Evolutionary Algorithms Yields Powerful Prompt Optimizers"
☆19 · Mar 11, 2024 · Updated 2 years ago
kyegomez / HRTX
Multi-Modal Multi-Embodied Hivemind-like Iteration of RTX-2
☆15 · Jun 27, 2025 · Updated last year
kyegomez / VisualNexus
A plug-and-play pipeline that uses Segment Anything to segment datasets with rich detail for downstream fine-tuning on vision mod…
☆20 · Feb 22, 2024 · Updated 2 years ago
kyegomez / Ocean
Ultra Fast Multi-Modality Vector Database
☆18 · Feb 21, 2024 · Updated 2 years ago
kyegomez / Exa
Unleash the full potential of exascale LLMs on consumer-class GPUs, proven by extensive benchmarks, with no long-term adjustments and min…
☆27 · Nov 11, 2024 · Updated last year
kyegomez / ProfitPilot
ProfitPilot closes deals for you effortlessly 24/7, just provide a list of customers and ProfitPilot will reach out on your behalf and clo…
☆21 · Sep 7, 2023 · Updated 2 years ago
asteroid-team / pytorch_stoi
STOI loss functions in PyTorch (mirror of https://github.com/mpariente/pytorch_stoi)
☆15 · Aug 6, 2020 · Updated 5 years ago
kyegomez / autogpt-tot
Simple Autogpt with tree of thoughts
☆14 · May 25, 2023 · Updated 3 years ago
kyegomez / GPT3
An implementation of the base GPT-3 Model architecture from the paper by OPENAI "Language Models are Few-Shot Learners"
☆22 · Jun 29, 2024 · Updated 2 years ago
The-Swarm-Corporation / swarm-ecosystem
The Swarm Ecosystem
☆29 · Aug 1, 2024 · Updated last year
kyegomez / LOGICGUIDE
Plug in and Play implementation of "Certified Reasoning with Language Models" that elevates model reasoning by 40%
☆16 · Jun 20, 2023 · Updated 3 years ago
lzhangbj / ASVA
[ECCV 2024 Oral] Audio-Synchronized Visual Animation
☆60 · Mar 15, 2026 · Updated 4 months ago
kyegomez / FlashMHA
A simple PyTorch implementation of Flash MultiHead Attention
☆22 · Feb 5, 2024 · Updated 2 years ago
The-Swarm-Corporation / swarms-cloud
Deploy your autonomous agents to production grade environments with 99% Uptime Guarantee, Infinite Scalability, and self-healing.
☆54 · Jul 13, 2026 · Updated last week
LAION-AI / Text-to-speech
☆61 · Nov 4, 2023 · Updated 2 years ago
kyegomez / SelfExtend
Implementation of SelfExtend from the paper "LLM Maybe LongLM: Self-Extend LLM Context Window Without Tuning" using PyTorch and Zeta
☆13 · Nov 11, 2024 · Updated last year
kyegomez / qformer
Implementation of Qformer from BLIP2 in Zeta Lego blocks.
☆51 · Nov 11, 2024 · Updated last year
kyegomez / Gen1
My Implementation of "Structure and Content-Guided Video Synthesis with Diffusion Models" by RunwayML
☆26 · Jan 16, 2024 · Updated 2 years ago
kyegomez / MegaVIT
The open source implementation of the model from "Scaling Vision Transformers to 22 Billion Parameters"
☆32 · Jun 22, 2026 · Updated last month
Wangt-CN / Code_CASC
☆14 · Oct 14, 2019 · Updated 6 years ago
kyegomez / Paper-Implementation-Template
A simple reproducible template to implement AI research papers
☆24 · Sep 9, 2024 · Updated last year
MODAL-UNINA / Federated-Learning-based-Fall-Detection-with-Multimodal-Data-Fusion
☆12 · Jan 23, 2024 · Updated 2 years ago
philgzl / brever
Speech enhancement in noisy and reverberant environments using deep neural networks
☆23 · Oct 10, 2025 · Updated 9 months ago