clinicalml / onboarding_human_ai
Onboarding Humans to Work with AI: algorithms that find regions and describe them in natural language to show how humans should collaborate with AI (NeurIPS 2023)
☆12Updated 10 months ago
Alternatives and similar repositories for onboarding_human_ai:
Users who are interested in onboarding_human_ai are comparing it to the libraries listed below:
- PyTorch Implementation of Attention Prompt Tuning: Parameter-Efficient Adaptation of Pre-Trained Models for Action Recognition☆13Updated 10 months ago
- [NeurIPS 2022] code for "K-LITE: Learning Transferable Visual Models with External Knowledge" https://arxiv.org/abs/2204.09222☆51Updated last year
- Official implementation of "Gemini in Reasoning: Unveiling Commonsense in Multimodal Large Language Models"☆35Updated last year
- ☆23Updated 2 months ago
- Official repository for the General Robust Image Task (GRIT) Benchmark☆51Updated last year
- NExT-GPT: Any-to-Any Multimodal Large Language Model☆19Updated 2 months ago
- ☆66Updated last year
- Edit and Generate Anything in 3D world!☆13Updated last year
- ☆62Updated this week
- ☆12Updated 9 months ago
- Hierarchical multi-agent workflow for prompt optimization☆12Updated 7 months ago
- Unsolvable Problem Detection: Evaluating Trustworthiness of Vision Language Models☆72Updated 4 months ago
- ☆72Updated 8 months ago
- Multimodal-Procedural-Planning☆91Updated last year
- Code for paper "Point and Ask: Incorporating Pointing into Visual Question Answering"☆18Updated 2 years ago
- Code for the paper "Harnessing Webpage UIs for Text-Rich Visual Understanding"☆45Updated last month
- Official Code of IdealGPT☆34Updated last year
- TallyQA: Answering Complex Counting Questions dataset☆20Updated 11 months ago
- ☆39Updated 6 months ago
- PyTorch implementation of HyperLLaVA: Dynamic Visual and Language Expert Tuning for Multimodal Large Language Models☆28Updated 10 months ago
- ☆48Updated last year
- [CVPR 2023] Official code for "Learning Procedure-aware Video Representation from Instructional Videos and Their Narrations"☆52Updated last year
- Implementation of "the first large-scale multimodal mixture of experts models" from the paper: "Multimodal Contrastive Learning with…☆25Updated this week
- A huge dataset for Document Visual Question Answering☆15Updated 6 months ago
- Self-hosted GPT-4V api☆29Updated last year
- Code repo for "Read Anywhere Pointed: Layout-aware GUI Screen Reading with Tree-of-Lens Grounding"☆24Updated 6 months ago
- How Good is Google Bard's Visual Understanding? An Empirical Study on Open Challenges☆30Updated last year
- ☆13Updated last year
- Official implementation of the paper "MMInA: Benchmarking Multihop Multimodal Internet Agents"☆41Updated this week
- Awesome Multimodal Assistant is a curated list of multimodal chatbots/conversational assistants that utilize various modes of interaction…☆74Updated last year