XinyuSun / mlc-chatbot

Python interface for the MLC Chat CLI

☆15

Related projects

Alternatives and complementary repositories for mlc-chatbot

ZrrSkywalker / LLaMA-Adapter
Fine-tuning LLaMA to follow Instructions within 1 Hour and 1.2M Parameters
☆85 Updated last year
OFA-Sys / TouchStone
Touchstone: Evaluating Vision-Language Models by Language Models
☆77 Updated 9 months ago
sanjayss34 / codevqa
☆83 Updated last year
X2FD / LVIS-INSTRUCT4V
☆131 Updated 10 months ago
Hxyou / IdealGPT
Official Code of IdealGPT
☆32 Updated last year
OpenGVLab / GUI-Odyssey
GUI Odyssey is a comprehensive dataset for training and evaluating cross-app navigation agents. GUI Odyssey consists of 7,735 episodes fr…
☆65 Updated 4 months ago
RupertLuo / Valley
The official repository of "Video assistant towards large language model makes everything easy"
☆210 Updated 8 months ago
mlfoundations / VisIT-Bench
☆45 Updated last year
RLHF-V / RLAIF-V
RLAIF-V: Aligning MLLMs through Open-Source AI Feedback for Super GPT-4V Trustworthiness
☆233 Updated last week
OpenGVLab / Awesome-LLM4Tool
A curated list of papers, repositories, tutorials, and anything else related to large language models for tools
☆64 Updated last year
cliangyu / Cola
[NeurIPS2023] Official implementation of the paper "Large Language Models are Visual Reasoning Coordinators"
☆102 Updated last year
feizc / Visual-LLaMA
Open LLaMA Eyes to See the World
☆175 Updated last year
zzxslp / MM-Navigator
GPT-4V in Wonderland: LMMs as Smartphone Agents
☆128 Updated 3 months ago
yiye3 / GUICourse
GUICourse: From General Vision Language Models to Versatile GUI Agents
☆80 Updated 3 months ago
showlab / assistgpt
☆65 Updated last year
bytedance / lynx-llm
paper: https://arxiv.org/abs/2307.02469 page: https://lynx-llm.github.io/
☆229 Updated last year
THUDM / VisualAgentBench
Towards Large Multimodal Models as Visual Foundation Agents
☆114 Updated 2 weeks ago
yuweihao / MM-Vet
MM-Vet: Evaluating Large Multimodal Models for Integrated Capabilities (ICML 2024)
☆264 Updated last week
FuxiaoLiu / LRV-Instruction
[ICLR'24] Mitigating Hallucination in Large Multi-Modal Models via Robust Instruction Tuning
☆255 Updated 8 months ago
YujieLu10 / LLMScore
LLMScore: Unveiling the Power of Large Language Models in Text-to-Image Synthesis Evaluation
☆125 Updated last year
OpenGVLab / MM-NIAH
[NeurIPS 2024] Needle In A Multimodal Haystack (MM-NIAH): A comprehensive benchmark designed to systematically evaluate the capability of…
☆98 Updated 3 weeks ago
huggingface / OBELICS
Code used for the creation of OBELICS, an open, massive and curated collection of interleaved image-text web documents, containing 141M d…
☆188 Updated 2 months ago
pipilurj / G-LLaVA
Official GitHub repo of G-LLaVA
☆121 Updated 5 months ago
cooelf / Auto-GUI
Official implementation for "You Only Look at Screens: Multimodal Chain-of-Action Agents" (Findings of ACL 2024)
☆196 Updated 3 months ago
open-compass / MMBench
Official Repo of "MMBench: Is Your Multi-modal Model an All-around Player?"
☆164 Updated 2 months ago
lucidrains / soft-moe-pytorch
Implementation of Soft MoE, proposed by Brain's Vision team, in PyTorch
☆243 Updated 6 months ago
njucckevin / SeeClick
The model, data and code for the visual GUI Agent SeeClick
☆219 Updated 2 months ago
yfzhang114 / SliME
Beyond LLaVA-HD: Diving into High-Resolution Large Multimodal Models
☆137 Updated this week
dhansmair / flamingo-mini
Implementation of the DeepMind Flamingo vision-language model, based on Hugging Face language models and ready for training
☆164 Updated last year
BAAI-DCAI / Visual-Instruction-Tuning
SVIT: Scaling up Visual Instruction Tuning
☆163 Updated 4 months ago