langgptai / Awesome-Multimodal-Prompts
Prompts for GPT-4V & DALL-E3 to fully utilize their multimodal abilities. GPT4V Prompts, DALL-E3 Prompts.
☆240Updated last year
Alternatives and similar repositories for Awesome-Multimodal-Prompts:
Users who are interested in Awesome-Multimodal-Prompts are comparing it to the repositories listed below:
- A curated list of awesome projects and resources related to autonomous AI agents.☆276Updated last year
- BuboGPT: Enabling Visual Grounding in Multi-Modal LLMs☆505Updated last year
- Transform Video as a Document with ChatGPT, CLIP, BLIP2, GRIT, Whisper, LangChain.☆548Updated last year
- [ICLR 2025] The First Multimodal Search Engine Pipeline and Benchmark for LMMs☆414Updated 3 weeks ago
- GPT4Tools is an intelligent system that can automatically decide, control, and utilize different visual foundation models, allowing the u…☆766Updated last year
- ☆400Updated 4 months ago
- Codes for VPGTrans: Transfer Visual Prompt Generator across LLMs. VL-LLaMA, VL-Vicuna.☆271Updated last year
- ☆173Updated 7 months ago
- ☆65Updated last year
- LLaVA-Plus: Large Language and Vision Assistants that Plug and Learn to Use Skills☆722Updated last year
- Official Code for GPT4Video: A Unified Multimodal Large Language Model for Instruction-Followed Understanding and Safety-Aware Generation☆137Updated 3 months ago
- Implementation for the paper "ComfyBench: Benchmarking LLM-based Agents in ComfyUI for Autonomously Designing Collaborative AI Systems".☆135Updated 3 weeks ago
- 🤖 Awesome list of AGI Agents. A curated collection of agent resources.☆348Updated last year
- Multimodal Models in Real World☆437Updated 3 months ago
- Sora AI Awesome List – Your go-to resource hub for all things Sora AI, OpenAI's groundbreaking model for crafting realistic scenes from t…☆230Updated 2 weeks ago
- [TLLM'23] PandaGPT: One Model To Instruction-Follow Them All☆781Updated last year
- ☆121Updated last year
- (ECCV 2024) Code for V-IRL: Grounding Virtual Intelligence in Real Life☆334Updated 2 months ago
- This repository contains the paper list for the paper: Igniting Language Intelligence: The Hitchhiker's Guide From Chain-of-Thought Reaso…☆348Updated last year
- Official Repo for the Paper: CHATANYTHING: FACETIME CHAT WITH LLM-ENHANCED PERSONAS☆381Updated last year
- MMICL, a state-of-the-art VLM with in-context learning ability, from ICL, PKU☆344Updated last year
- CVPR'24, Official Codebase of our Paper: "Let's Think Outside the Box: Exploring Leap-of-Thought in Large Language Models with Creative H…☆307Updated 10 months ago
- An open-source chatbot built with ExpertPrompting which achieves 96% of ChatGPT's capability.☆298Updated last year
- [CVPR 2024] VCoder: Versatile Vision Encoders for Multimodal Large Language Models☆272Updated 10 months ago
- Vision Search Assistant: Empower Vision-Language Models as Multimodal Search Engines☆114Updated 3 months ago
- This repo includes all customized GPTs on openai gpt store☆118Updated last year
- ☆197Updated last year
- ControlLLM: Augment Language Models with Tools by Searching on Graphs☆189Updated 7 months ago
- Let AI design AI, let large models help small models evolve, and use magic to create magic! Empower Artificial Intelligence to sculpt its own kind, where colossal models gracefully usher the petit…☆96Updated last year
- [TMLR23] Official implementation of UnIVAL: Unified Model for Image, Video, Audio and Language Tasks.☆224Updated last year