Aleph-Alpha / magma
MAGMA - a GPT-style multimodal model that can understand any combination of images and language. NOTE: The freely available model from this repo is only a demo. For the latest multimodal and multilingual models from Aleph Alpha check out our website https://app.aleph-alpha.com
☆478 Updated last year
Related projects
Alternatives and complementary repositories for magma
- ☆350 Updated 2 years ago
- Implementation of Parti, Google's pure attention-based text-to-image neural network, in PyTorch ☆524 Updated 11 months ago
- Used for adaptive human in the loop evaluation of language and embedding models. ☆304 Updated last year
- Implementation of the DeepMind Flamingo vision-language model, based on Hugging Face language models and ready for training ☆165 Updated last year
- Implementation of the specific Transformer architecture from PaLM - Scaling Language Modeling with Pathways ☆821 Updated 2 years ago
- Aim for the moon. If you miss, you may hit a star. ☆160 Updated last year
- PyTorch implementation of Make-A-Scene: Scene-Based Text-to-Image Generation with Human Priors ☆334 Updated 2 years ago
- Python Client for the Aleph Alpha API ☆90 Updated this week
- ☆508 Updated 9 months ago
- DeepSpeed is a deep learning optimization library that makes distributed training easy, efficient, and effective. ☆164 Updated 7 months ago
- Language Modeling with the H3 State Space Model ☆514 Updated last year
- Implementation of 🦩 Flamingo, state-of-the-art few-shot visual question answering attention net out of DeepMind, in PyTorch ☆1,217 Updated 2 years ago
- Easily convert Common Crawl to a dataset of captions and documents. Image/text, audio/text, video/text, ... ☆310 Updated 11 months ago
- Ask Me Anything language model prompting ☆539 Updated last year
- A crude RLHF layer on top of nanoGPT with Gumbel-Softmax trick ☆287 Updated 11 months ago
- Repository for "Generating images from caption and vice versa via CLIP-Guided Generative Latent Space Search" ☆180 Updated 3 years ago
- O-GIA is an umbrella for research, infrastructure and projects ecosystem that should provide open source, reproducible datasets, models, … ☆91 Updated last year
- Internet Explorer explores the web in a self-supervised manner to progressively find relevant examples that improve performance on a desi… ☆163 Updated last year
- MultimodalC4 is a multimodal extension of C4 that interleaves millions of images with text. ☆907 Updated 5 months ago
- OpenAI's DALL-E for large scale training in mesh-tensorflow. ☆434 Updated 2 years ago
- Implementation of RETRO, DeepMind's retrieval-based attention net, in PyTorch ☆852 Updated last year
- ☆138 Updated last year
- A suite of tools for managing crowdsourcing tasks from the inception through to data packaging for research use. ☆305 Updated this week
- Pretrained DALL-E 2 from LAION ☆500 Updated last year
- MinImagen: A minimal implementation of the Imagen text-to-image model ☆296 Updated last year
- 🧀 Code and models for the ICML 2023 paper "Grounding Language Models to Images for Multimodal Inputs and Outputs". ☆478 Updated last year
- Implementation of NÜWA, state-of-the-art attention network for text-to-video synthesis, in PyTorch ☆546 Updated last year
- ☆128 Updated 2 years ago
- Large-scale pretrained models for goal-directed dialog ☆857 Updated 11 months ago