langgptai / Awesome-Multimodal-Prompts
Prompts for GPT-4V and DALL-E3 that make full use of their multimodal abilities. GPT4V Prompts, DALL-E3 Prompts.
☆211Updated 10 months ago
Related projects:
- ☆65Updated last year
- Transform Video as a Document with ChatGPT, CLIP, BLIP2, GRIT, Whisper, LangChain.☆528Updated last year
- BuboGPT: Enabling Visual Grounding in Multi-Modal LLMs☆497Updated last year
- 🤖 Awesome list of AGI Agents. A curated collection of agent resources.☆302Updated 10 months ago
- A curated list of awesome projects and resources related to autonomous AI agents.☆269Updated 8 months ago
- ☆161Updated 2 months ago
- The next generation of Multi-Modal Multi-Agent platform.☆64Updated last month
- ☆235Updated 8 months ago
- GPT4Tools is an intelligent system that can automatically decide, control, and utilize different visual foundation models, allowing the u…☆754Updated 9 months ago
- Official Repo for the Paper: CHATANYTHING: FACETIME CHAT WITH LLM-ENHANCED PERSONAS☆376Updated 9 months ago
- Official Code for GPT4Video: A Unified Multimodal Large Language Model for Instruction-Followed Understanding and Safety-Aware Generation☆128Updated 9 months ago
- ☆118Updated 9 months ago
- A simulation of the world using GPTs. (deprecated)☆154Updated 7 months ago
- VCoder: Versatile Vision Encoders for Multimodal Large Language Models, arXiv 2023 / CVPR 2024☆255Updated 5 months ago
- GPT-4V in Wonderland: LMMs as Smartphone Agents☆122Updated 2 months ago
- Codes for VPGTrans: Transfer Visual Prompt Generator across LLMs. VL-LLaMA, VL-Vicuna.☆267Updated 11 months ago
- CVPR'24, Official Codebase of our Paper: "Let's Think Outside the Box: Exploring Leap-of-Thought in Large Language Models with Creative H…☆293Updated 5 months ago
- Multimodal Models in Real World☆372Updated 2 months ago
- HPT - Open Multimodal LLMs from HyperGAI☆309Updated 3 months ago
- WebDesignAgent: Towards Effortless Website Creation☆229Updated last month
- ☆246Updated 9 months ago
- ☆145Updated 2 months ago
- An LLM-based Web Navigating Agent (KDD'24)☆581Updated 4 months ago
- ☆164Updated 2 weeks ago
- (ECCV 2024) Code for V-IRL: Grounding Virtual Intelligence in Real Life☆301Updated 2 months ago
- This repository contains the paper list for the paper: Igniting Language Intelligence: The Hitchhiker's Guide From Chain-of-Thought Reaso…☆329Updated 9 months ago
- OpenGPTs - Powerful GPTs Copilot | Powerful GPTs browser extension | multi-window | batch conversations | chatgpt3.5 | chatgpt4.0☆177Updated last month
- SlowFast-LLaVA: A Strong Training-Free Baseline for Video Large Language Models☆113Updated this week
- The implementation of "Suspicion-Agent: Playing Imperfect Information Games with Theory of Mind Aware GPT-4"☆136Updated 10 months ago
- [TMLR23] Official implementation of UnIVAL: Unified Model for Image, Video, Audio and Language Tasks.☆224Updated 8 months ago