langgptai / Awesome-Multimodal-Prompts
Prompts for GPT-4V & DALL-E3 that fully utilize their multi-modal abilities. GPT4V Prompts, DALL-E3 Prompts.
☆249Updated last year
Alternatives and similar repositories for Awesome-Multimodal-Prompts:
Users who are interested in Awesome-Multimodal-Prompts are comparing it to the repositories listed below
- ☆66Updated last year
- BuboGPT: Enabling Visual Grounding in Multi-Modal LLMs☆508Updated last year
- [ICLR 2025] The First Multimodal Search Engine Pipeline and Benchmark for LMMs☆429Updated 3 months ago
- [CVPR 2025] Video Narration as Vocabulary & Video as Long Document☆568Updated last month
- ☆425Updated 7 months ago
- A curated list of awesome projects and resources related to autonomous AI agents.☆279Updated last year
- Codes for VPGTrans: Transfer Visual Prompt Generator across LLMs. VL-LLaMA, VL-Vicuna.☆271Updated last year
- An open-source ChatBot built with ExpertPrompting which achieves 96% of ChatGPT's capability.☆300Updated last year
- This repository contains the paper list for the paper: Igniting Language Intelligence: The Hitchhiker's Guide From Chain-of-Thought Reaso…☆356Updated last year
- [TLLM'23] PandaGPT: One Model To Instruction-Follow Them All☆783Updated last year
- MMICL, a state-of-the-art VLM with in-context learning ability, from ICL, PKU☆347Updated last year
- ☆126Updated last year
- [TMLR23] Official implementation of UnIVAL: Unified Model for Image, Video, Audio and Language Tasks.☆228Updated last year
- GPT4Tools is an intelligent system that can automatically decide, control, and utilize different visual foundation models, allowing the u…☆772Updated last year
- 🤖 Awesome list of AGI Agents. Agents 精选资源合集.☆387Updated last year
- Sora AI Awesome List – Your go-to resource hub for all things Sora AI, OpenAI's groundbreaking model for crafting realistic scenes from t…☆235Updated 2 months ago
- Fine-tuning LLaMA to follow Instructions within 1 Hour and 1.2M Parameters☆88Updated last year
- ☆249Updated last year
- LLaVA-Plus: Large Language and Vision Assistants that Plug and Learn to Use Skills☆739Updated last year
- Official repo for MM-REACT☆947Updated last year
- HPT - Open Multimodal LLMs from HyperGAI☆315Updated 10 months ago
- Official Code for GPT4Video: A Unified Multimodal Large Language Model for Instruction-Followed Understanding and Safety-Aware Generation☆138Updated 5 months ago
- Official implementation of paper "MiniGPT-5: Interleaved Vision-and-Language Generation via Generative Vokens"☆864Updated 4 months ago
- 🚀🚀🚀A collection of some awesome public projects about Large Language Model(LLM), Vision Language Model(VLM), Vision Language Action(VL…☆669Updated last week
- ☆181Updated last year
- Chatbot Arena meets multi-modality! Multi-Modality Arena allows you to benchmark vision-language models side-by-side while providing imag…☆517Updated last year
- The official code for "Aurora: Activating Chinese chat capability for Mixtral-8x7B sparse Mixture-of-Experts through Instruction-Tuning"☆261Updated 11 months ago
- ControlLLM: Augment Language Models with Tools by Searching on Graphs☆192Updated 9 months ago
- Official implementation of SEED-LLaMA (ICLR 2024).☆610Updated 7 months ago
- Valley is a cutting-edge multimodal large model designed to handle a variety of tasks involving text, images, and video data.☆230Updated last month