A self-contained multimodal AI agents lab built using MongoDB, Gemini and LangGraph.
☆67Sep 23, 2025Updated 5 months ago
Alternatives and similar repositories for multimodal-agents-lab
Users that are interested in multimodal-agents-lab are comparing it to the libraries listed below
- GitHub Action for building an ARM Template from Bicep☆13Jun 18, 2022Updated 3 years ago
- Creates an Azure AI Studio hub, project and required dependent resources including Azure Open AI Service, Cognitive Search and more.☆32Oct 2, 2024Updated last year
- A PyTorch implementation of a Deep Learning approach to Template Matching. Uses Hypernet + VGG to match the templates.☆12Dec 18, 2021Updated 4 years ago
- Basic ML code☆13Dec 2, 2019Updated 6 years ago
- [ICLR 2024] Towards Robust Multi-Modal Reasoning via Model Selection☆15Mar 7, 2024Updated 2 years ago
- Resources for Machine Learning and AI☆16Dec 23, 2018Updated 7 years ago
- Multimodal Genuine Emotion and Expression Detection database☆12Jul 15, 2024Updated last year
- ☆13May 5, 2022Updated 3 years ago
- ☆59Apr 2, 2025Updated 11 months ago
- Samples, quickstarts, and developer resources for Azure Durable Task Scheduler — build reliable, fault-tolerant workflows with Durable Fu…☆48Mar 15, 2026Updated last week
- Vehicle number plate detection using YOLO and OCR text extraction☆14Feb 8, 2020Updated 6 years ago
- Video Summarization - Summarized a video lecture and converted it to a slideshow using Speech-to-text, Keyword extraction and OpenCV Shot…☆19Apr 13, 2018Updated 7 years ago
- List of publicly available Deep Learning courses☆15Aug 30, 2015Updated 10 years ago
- A simple exam generator and grader written in Python with OpenCV☆14Jan 14, 2026Updated 2 months ago
- GeneticFS is a library for feature selection in Machine Learning using a Genetic Algorithm as an optimisation method.☆20Oct 8, 2019Updated 6 years ago
- VisualToolAgent (VisTA): A Reinforcement Learning Framework for Visual Tool Selection☆26May 31, 2025Updated 9 months ago
- Building a multi-agent RAG system with advanced RAG methods☆12Jan 12, 2025Updated last year
- ☆13Feb 5, 2025Updated last year
- ☆66Feb 5, 2024Updated 2 years ago
- ☆27Jan 5, 2026Updated 2 months ago
- An agent built on the InternLM Chat 7B large-model base that can call MMYOLO tools to perform visual tasks on images☆11Oct 30, 2024Updated last year
- Line and Word Segmentation for Bangla Handwritten Text Recognition☆17Sep 18, 2023Updated 2 years ago
- Our internal Svelte UI library.☆12Mar 28, 2024Updated last year
- Karras et al. (2022) diffusion models for PyTorch☆17Oct 5, 2023Updated 2 years ago
- Bengali transformer using transformers☆22Apr 29, 2025Updated 10 months ago
- It is based on Image Processing. It has been implemented with "Python3" and "OpenCv3.3.0"☆19Oct 1, 2019Updated 6 years ago
- [ICCV 2025] A Benchmark for Multi-Step Reasoning in Long Narrative Videos☆25Aug 8, 2025Updated 7 months ago
- A browser based CadQuery server☆12Feb 18, 2025Updated last year
- CZU-MHAD: A multimodal dataset for human action recognition utilizing a depth camera and 10 wearable inertial sensors☆26Jun 2, 2022Updated 3 years ago
- Code for BMVC2020 paper "Text and Style Conditioned GAN for Generation of Offline Handwriting Lines"☆75Feb 13, 2023Updated 3 years ago
- Using multiple LLMs for ensemble Forecasting☆16Jan 17, 2024Updated 2 years ago
- Solutions to various UVa (ACM) problems in python 3.