AILab-CVC / GPT4Tools

GPT4Tools is an intelligent system that can automatically decide, control, and utilize different visual foundation models, allowing the user to interact with images during a conversation.

☆775

Alternatives and similar repositories for GPT4Tools

Users who are interested in GPT4Tools are comparing it to the repositories listed below.

microsoft / MM-REACT
Official repo for MM-REACT
☆953 Updated last year
yxuansu / PandaGPT
[TLLM'23] PandaGPT: One Model To Instruction-Follow Them All
☆808 Updated 2 years ago
lupantech / chameleon-llm
Codes for "Chameleon: Plug-and-Play Compositional Reasoning with Large Language Models".
☆1,134 Updated last year
eric-ai-lab / MiniGPT-5
Official implementation of paper "MiniGPT-5: Interleaved Vision-and-Language Generation via Generative Vokens"
☆861 Updated 2 months ago
open-mmlab / Multimodal-GPT
Multimodal-GPT
☆1,506 Updated 2 years ago
ctlllll / LLM-ToolMaker
☆1,033 Updated 2 years ago
InternLM / InternLM-techreport
☆905 Updated 2 years ago
LLaVA-VL / LLaVA-Plus-Codebase
LLaVA-Plus: Large Language and Vision Assistants that Plug and Learn to Use Skills
☆751 Updated last year
OptimalScale / DetGPT
☆780 Updated 11 months ago
Victorwz / LongMem
Official implementation of our NeurIPS 2023 paper "Augmenting Language Models with Long-Term Memory".
☆802 Updated last year
showlab / VLog
[CVPR 2025] Video Narration as Vocabulary & Video as Long Document
☆574 Updated 4 months ago
luogen1996 / LaVIN
[NeurIPS 2023] Official implementations of "Cheap and Quick: Efficient Vision-Language Instruction Tuning for Large Language Models"
☆522 Updated last year
VPGTrans / VPGTrans
Codes for VPGTrans: Transfer Visual Prompt Generator across LLMs. VL-LLaMA, VL-Vicuna.
☆273 Updated last year
OpenLemur / Lemur
[ICLR 2024] Lemur: Open Foundation Models for Language Agents
☆552 Updated last year
CStanKonrad / long_llama
LongLLaMA is a large language model capable of handling long contexts. It is based on OpenLLaMA and fine-tuned with the Focused Transform…
☆1,460 Updated last year
magic-research / bubogpt
BuboGPT: Enabling Visual Grounding in Multi-Modal LLMs
☆511 Updated 2 years ago
OpenLMLab / LOMO
LOMO: LOw-Memory Optimization
☆988 Updated last year
IBM / Dromedary
Dromedary: towards helpful, ethical and reliable LLMs.
☆1,149 Updated 2 months ago
THUDM / AgentTuning
AgentTuning: Enabling Generalized Agent Abilities for LLMs
☆1,450 Updated last year
arielnlee / Platypus
Code for fine-tuning Platypus fam LLMs using LoRA
☆628 Updated last year
mbzuai-nlp / LaMini-LM
LaMini-LM: A Diverse Herd of Distilled Models from Large-Scale Instructions
☆821 Updated 2 years ago
lupantech / ScienceQA
Data and code for NeurIPS 2022 Paper "Learn to Explain: Multimodal Reasoning via Thought Chains for Science Question Answering".
☆682 Updated 10 months ago
kyegomez / CM3Leon
An open source implementation of "Scaling Autoregressive Multi-Modal Models: Pretraining and Instruction Tuning", an all-new multi modal …
☆362 Updated last year
showlab / Image2Paragraph
[Image 2 Text Para] Transform Image into Unique Paragraph with ChatGPT, BLIP2, OFA, GRIT, Segment Anything, ControlNet.
☆814 Updated 2 years ago
allenai / mmc4
MultimodalC4 is a multimodal extension of c4 that interleaves millions of images with text.
☆935 Updated 4 months ago
VITA-MLLM / Woodpecker
✨✨Woodpecker: Hallucination Correction for Multimodal Large Language Models
☆639 Updated 7 months ago
OpenGVLab / Multi-Modality-Arena
Chatbot Arena meets multi-modality! Multi-Modality Arena allows you to benchmark vision-language models side-by-side while providing imag…
☆531 Updated last year
ruixiangcui / AGIEval
☆758 Updated last year
GAIR-NLP / factool
FacTool: Factuality Detection in Generative AI
☆881 Updated 11 months ago
Xwin-LM / Xwin-LM
Xwin-LM: Powerful, Stable, and Reproducible LLM Alignment
☆1,040 Updated last year