clinicalml / onboarding_human_ai
Onboarding Humans to work with AI: Algorithms to find regions and describe them in natural language that show how humans should collaborate with AI (NeurIPS23)
☆12Updated last year
Alternatives and similar repositories for onboarding_human_ai:
Users that are interested in onboarding_human_ai are comparing it to the libraries listed below
- OmniByteFormer is a generalized Transformer model that can process any type of data by converting it into byte sequences, bypassing tradi…☆10Updated this week
- PyTorch Implementation of Attention Prompt Tuning: Parameter-Efficient Adaptation of Pre-Trained Models for Action Recognition☆14Updated last year
- Official implementation of "Gemini in Reasoning: Unveiling Commonsense in Multimodal Large Language Models"☆36Updated last year
- This repo contains code and data for ICLR 2025 paper MIA-Bench: Towards Better Instruction Following Evaluation of Multimodal LLMs☆30Updated last month
- Implementation of the model: "(MC-ViT)" from the paper: "Memory Consolidation Enables Long-Context Video Understanding"☆21Updated 3 weeks ago
- ☆13Updated 7 months ago
- An interactive demo based on Segment-Anything for stroke-based painting which enables human-like painting.☆35Updated 2 years ago
- Implementation of the "the first large-scale multimodal mixture of experts models." from the paper: "Multimodal Contrastive Learning with…☆29Updated 2 weeks ago
- ☆19Updated 8 months ago
- [NeurIPS 2022] code for "K-LITE: Learning Transferable Visual Models with External Knowledge" https://arxiv.org/abs/2204.09222☆51Updated last year
- An EXA-Scale repository of Multi-Modality AI resources from papers and models, to foundational libraries!☆42Updated last year
- ☆14Updated last year
- ☆13Updated last year
- Official implementation of the paper "MMInA: Benchmarking Multihop Multimodal Internet Agents"☆42Updated last month
- Official Repo for MageBench: Bridging Large Multimodal Models to Agents☆21Updated 3 months ago
- A forest of autonomous agents.☆19Updated 2 months ago
- The official repo for "VisualWebInstruct: Scaling up Multimodal Instruction Data through Web Search"☆24Updated last month
- Edit and Generate Anything in 3D world!☆13Updated 2 years ago
- Official Pytorch Implementation of Self-emerging Token Labeling☆33Updated last year
- ☆14Updated last month
- ☆24Updated last year
- Luann allows you to create a LLM agent,which has complete memory module (long-term memory, short-term memory) and knowledge module(Variou…☆21Updated last month
- A swarm of LLM agents that will help you test, document, and productionize your code!☆15Updated last week
- INF-LLaVA: Dual-perspective Perception for High-Resolution Multimodal Large Language Model☆42Updated 8 months ago
- ☆13Updated 4 months ago
- NExT-GPT: Any-to-Any Multimodal Large Language Model☆19Updated 5 months ago
- [NeurIPS-24] This is the official implementation of the paper "DeepStack: Deeply Stacking Visual Tokens is Surprisingly Simple and Effect…☆35Updated 10 months ago
- Multi-model video-to-text by combining embeddings from Flan-T5 + CLIP + Whisper + SceneGraph. The 'backbone LLM' is pre-trained from scra…☆53Updated 2 years ago
- ☆66Updated last year
- ZoomEye: Enhancing Multimodal LLMs with Human-Like Zooming Capabilities through Tree-Based Image Exploration☆30Updated 3 months ago