nengelmann / Fuyu-8B---Exploration
Exploration of Adept's multimodal Fuyu-8B model. 🤓 🔍
☆27Nov 7, 2023Updated 2 years ago
Alternatives and similar repositories for Fuyu-8B---Exploration
Users that are interested in Fuyu-8B---Exploration are comparing it to the libraries listed below
- Repository for an initial proof of concept of an NLP-based SQL adapter using an LLM.☆10May 6, 2025Updated 9 months ago
- A multimodal large-scale model, which performs close to the closed-source Qwen-VL-PLUS on many datasets and significantly surpasses the p…☆14Feb 5, 2024Updated 2 years ago
- [EMNLP 2023] TESTA: Temporal-Spatial Token Aggregation for Long-form Video-Language Understanding☆49Jan 9, 2024Updated 2 years ago
- imagetokenizer is a Python package that helps you encode visuals and generate visual token IDs from a codebook; it supports both image and video…☆40Jun 22, 2024Updated last year
- ☆18May 14, 2024Updated last year
- ☆15Apr 28, 2023Updated 2 years ago
- The official GitHub page for ''What Makes for Good Visual Instructions? Synthesizing Complex Visual Reasoning Instructions for Visual Ins…☆19Nov 10, 2023Updated 2 years ago
- ☆86Feb 5, 2024Updated 2 years ago
- SUPERVAIZER is a toolkit built for the age of AI interoperability. At its core, it implements Google's Agent-to-Agent (A2A) protocol, ena…☆14Feb 4, 2026Updated last week
- Real-time video understanding and interaction through text, audio, image, and video with a large multimodal model. A framework for real-time video understanding and interaction built on large multimodal models, via text…☆26Jan 26, 2024Updated 2 years ago
- Call the SadTalker API with one line of code using modelscope☆24Nov 18, 2023Updated 2 years ago
- Code for 'Why is Winoground Hard? Investigating Failures in Visuolinguistic Compositionality', EMNLP 2022☆31May 29, 2023Updated 2 years ago
- A RAG system dedicated to processing Markdown files converted from visually rich documents☆10Mar 16, 2025Updated 10 months ago
- A collection of pre-built wrappers over common RAG systems like ChromaDB, Weaviate, Pinecone, and others!☆42Oct 27, 2025Updated 3 months ago
- Self-hosted AI workflow for scraping Instagram Reels (audio and description). Extracting, summarising and categorising, then storing all …☆27Jan 10, 2026Updated last month
- Shift scheduling management system☆11Jul 15, 2015Updated 10 years ago
- Official code for "Visual Attention Emerges from Recurrent Sparse Reconstruction" (ICML 2022)☆36Jul 5, 2022Updated 3 years ago
- This is a carpooling (ride sharing) app built with Flutter 💙 for Android and iOS☆13Updated this week
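- AI developer platform. The goal is to build a platform that captures video images, calls APIs for intelligent data annotation, and runs automated tests once training is complete.☆34Mar 16, 2018Updated 7 years ago
- Detection of prohibited mobile phone use in industrial settings, based on PaddleX object detection.☆11Jun 11, 2022Updated 3 years ago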
- Repository of paper: Position-Enhanced Visual Instruction Tuning for Multimodal Large Language Models☆37Sep 19, 2023Updated 2 years ago
- ☆111Jan 8, 2025Updated last year
- GRiT: A Generative Region-to-text Transformer for Object Understanding (ECCV 2024)☆340Jan 8, 2024Updated 2 years ago
- Official implementation of our EMNLP 2022 paper "CPL: Counterfactual Prompt Learning for Vision and Language Models"☆35Dec 5, 2022Updated 3 years ago
- [CVPR 2024] LION: Empowering Multimodal Large Language Model with Dual-Level Visual Knowledge☆153Sep 3, 2025Updated 5 months ago
- [ACM Multimedia 2025] This is the official repo for Debiasing Large Visual Language Models, including a Post-Hoc debias method and Visual…☆82Feb 22, 2025Updated 11 months ago
- GAIIC2024 dual-spectrum object detection from a drone perspective - Rank 6 solution☆11Jun 17, 2024Updated last year
- A football video analysis task that uses YOLO to detect people and balls in the video.☆12Sep 5, 2020Updated 5 years ago
- BanterBot: An OpenAI ChatGPT-powered chatbot with Azure Neural Voices. Supports multilingual speech-to-text and text-to-speech interactio…☆11Jan 23, 2026Updated 3 weeks ago
- Reinforcement Learning Strategy for FreqAI with 91.78% win rate☆26Jul 16, 2025Updated 6 months ago
- Low-Code platform for developers☆12Sep 27, 2024Updated last year
- YT2Brief: Transcribe and summarize YouTube videos using Langchain with the power of LLMs.☆11Dec 21, 2023Updated 2 years ago
- [WACV 2025] Exploiting VLM Localizability and Semantics for Open Vocabulary Action Detection☆16Mar 23, 2025Updated 10 months ago
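- This project provides a Java backend chat server for xiaozhi-esp32, built on the RuoYi-Vue framework. It helps individuals and businesses quickly deploy a xiaozhi-esp32 backend service.☆21Jun 19, 2025Updated 7 months ago
- I-CHING package (I Ching divination in Python)☆10Feb 22, 2021Updated 4 years ago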
- ☆10Jul 2, 2021Updated 4 years ago
- Semi-automated annotation tool based on yolomark☆13May 5, 2019Updated 6 years ago
- An AI-powered tool that translates plain English commands into multi-step API workflows, automating the entire testing process.☆17Jul 27, 2025Updated 6 months ago
- A Hierarchical Approach for Generating Descriptive Image Paragraphs☆10Mar 27, 2020Updated 5 years ago