showlab / assistgpt

☆66

Alternatives and similar repositories for assistgpt

Users who are interested in assistgpt are comparing it to the repositories listed below.

OpenGVLab / ControlLLM
ControlLLM: Augment Language Models with Tools by Searching on Graphs
☆193 Updated last year
ZrrSkywalker / LLaMA-Adapter
Fine-tuning LLaMA to follow Instructions within 1 Hour and 1.2M Parameters
☆90 Updated 2 years ago
patrick-tssn / Awesome-Colorful-LLM
Recent advancements propelled by large language models (LLMs), encompassing an array of domains including Vision, Audio, Agent, Robotics,…
☆123 Updated 5 months ago
YujieLu10 / TIP
Multimodal-Procedural-Planning
☆92 Updated 2 years ago
OpenGVLab / Awesome-LLM4Tool
A curated list of papers, repositories, tutorials, and anything else related to large language models for tools
☆68 Updated 2 years ago
zzxslp / MM-Navigator
GPT-4V in Wonderland: LMMs as Smartphone Agents
☆135 Updated last year
OFA-Sys / TouchStone
Touchstone: Evaluating Vision-Language Models by Language Models
☆83 Updated last year
icoz69 / StableLLAVA
Official repo for StableLLAVA
☆94 Updated last year
mshukor / UnIVAL
[TMLR23] Official implementation of UnIVAL: Unified Model for Image, Video, Audio and Language Tasks.
☆232 Updated last year
Hxyou / IdealGPT
Official Code of IdealGPT
☆35 Updated 2 years ago
sanjayss34 / codevqa
☆84 Updated 2 years ago
VPGTrans / VPGTrans
Codes for VPGTrans: Transfer Visual Prompt Generator across LLMs. VL-LLaMA, VL-Vicuna.
☆271 Updated 2 years ago
gpt4video / GPT4Video
Official Code for GPT4Video: A Unified Multimodal Large Language Model for Instruction-Followed Understanding and Safety-Aware Generation
☆143 Updated last year
pkunlp-icler / PCA-EVAL
[ACL 2024] PCA-Bench: Evaluating Multimodal Large Language Models in Perception-Cognition-Action Chain
☆103 Updated last year
feizc / Visual-LLaMA
Open LLaMA Eyes to See the World
☆174 Updated 2 years ago
FudanNLPLAB / MouSi
☆75 Updated last year
HaozheZhao / MIC
MMICL, a state-of-the-art VLM with in-context learning ability, from ICL, PKU
☆357 Updated last year
neulab / MultiUI
Code for the paper: Harnessing Webpage UIs for Text-Rich Visual Understanding
☆53 Updated 11 months ago
thunlp / Muffin
☆66 Updated last year
mlfoundations / VisIT-Bench
☆50 Updated 2 years ago
cg1177 / VideoLLM
VideoLLM: Modeling Video Sequence with Large Language Models
☆158 Updated 2 years ago
camille-vanhoffelen / langchain-huggingGPT
Langchain implementation of HuggingGPT
☆133 Updated 2 years ago
showlab / Awesome-Long-Context
A curated list of resources about long-context in large-language models and video understanding.
☆31 Updated 2 years ago
cliangyu / Cola
[NeurIPS2023] Official implementation of the paper "Large Language Models are Visual Reasoning Coordinators"
☆103 Updated 2 years ago
jihaonew / MM-Instruct
MM-Instruct: Generated Visual Instructions for Large Multimodal Model Alignment
☆35 Updated last year
MBZUAI-LLM / web2code
Web2Code: A Large-scale Webpage-to-Code Dataset and Evaluation Framework for Multimodal LLMs
☆94 Updated last year
Dongping-Chen / GUI-World
(ICLR 2025) The Official Code Repository for GUI-World.
☆67 Updated 11 months ago
YujieLu10 / LLMScore
LLMScore: Unveiling the Power of Large Language Models in Text-to-Image Synthesis Evaluation
☆133 Updated 2 years ago
HyperGAI / HPT
HPT - Open Multimodal LLMs from HyperGAI
☆315 Updated last year
zzxslp / SoM-LLaVA
[COLM-2024] List Items One by One: A New Data Source and Learning Paradigm for Multimodal LLMs
☆145 Updated last year