☆31Nov 17, 2024Updated last year
Alternatives and similar repositories for MMRel
Users that are interested in MMRel are comparing it to the libraries listed below
- [NeurIPS 2024] Mitigating Object Hallucination via Concentric Causal Attention☆66Aug 30, 2025Updated 6 months ago
- 🚀 😂 spring cloud alibaba project☆22Dec 19, 2023Updated 2 years ago
- A-share market historical review☆24Jun 29, 2023Updated 2 years ago
- ☆18Apr 20, 2025Updated 10 months ago
- Visualize attention maps in Diffusion Models☆22Mar 10, 2025Updated 11 months ago
- We introduce a new approach, Token Reduction using CLIP Metric (TRIM), aimed at improving the efficiency of MLLMs without sacrificing their…☆20Jan 11, 2026Updated last month
- VideoHallucer, the first comprehensive benchmark for hallucination detection in large video-language models (LVLMs)☆42Dec 16, 2025Updated 2 months ago
- A dashboard for Curio☆20Jan 13, 2026Updated last month
- NeuSyRE: A Neuro-Symbolic Visual Understanding and Reasoning Framework based on Scene Graph Enrichment☆22Mar 10, 2024Updated last year
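- A lightweight business middle-platform development framework: a complete implementation of the middle-platform design that empowers the business.☆14Feb 25, 2023Updated 3 years ago
- A simple RPC framework implemented with netty + zookeeper ✨☆61Jun 17, 2024Updated last year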
- Vision Relation Transformer for Unbiased Scene Graph Generation (ICCV 2023)☆22Sep 27, 2023Updated 2 years ago
- An official repo for WACV 2025 paper "LLaVA-SpaceSGG: Visual Instruct Tuning for Open-vocabulary Scene Graph Generation with Enhanced Spa…☆26Jan 27, 2025Updated last year
- Using different CNN models to train on GTZAN Dataset☆40Nov 14, 2023Updated 2 years ago
- Project for Polkadot Hackathon☆37Apr 2, 2022Updated 3 years ago
- Learning from Noisy Anchors for One-stage Object Detection☆27Apr 14, 2021Updated 4 years ago
- [EMNLP 2023] TESTA: Temporal-Spatial Token Aggregation for Long-form Video-Language Understanding☆49Jan 9, 2024Updated 2 years ago
- [MM'24 Oral] Prior Knowledge Integration via LLM Encoding and Pseudo Event Regulation for Video Moment Retrieval☆130Aug 23, 2024Updated last year
- [CVPR2025] Code Release of F-LMM: Grounding Frozen Large Multimodal Models☆108May 29, 2025Updated 9 months ago
- FreeVA: Offline MLLM as Training-Free Video Assistant☆69Jun 9, 2024Updated last year
- 👾 E.T. Bench: Towards Open-Ended Event-Level Video-Language Understanding (NeurIPS 2024)☆74Jan 20, 2025Updated last year
- [CVPR-2023] The official dataset of Advancing Visual Grounding with Scene Knowledge: Benchmark and Method.☆33Jul 12, 2023Updated 2 years ago
- Robust estimations from distribution structures: III. Non-asymptotic☆25Feb 10, 2024Updated 2 years ago
- A curated list of Egocentric Action Understanding resources☆46Nov 26, 2025Updated 3 months ago
- [NeurIPS 2024] Official Repository of Multi-Object Hallucination in Vision-Language Models☆34Nov 13, 2024Updated last year
- ☆32Jul 29, 2024Updated last year
- a scalable short link generation service to improve marketing efforts☆21Apr 11, 2024Updated last year
- Using reference images to control style in text-to-image diffusion models. Based on CSD and IP Adapter☆54Mar 24, 2025Updated 11 months ago
- A zk-SNARK implementation☆50Dec 18, 2022Updated 3 years ago
- ☆18Oct 19, 2024Updated last year
- php tool functions☆49Feb 6, 2022Updated 4 years ago
- This repository contains the core methods and models described in the paper “Represent Code as Action Sequence for Predicting Next Method…☆55Sep 15, 2024Updated last year
- Official implementation of TagAlign☆35Dec 11, 2024Updated last year
- 待遇 task executor: a simple task executor☆26Mar 26, 2025Updated 11 months ago
- [WACV 2025] Enhancing Scene Graph Generation with Hierarchical Relationships and Commonsense Knowledge☆38Oct 29, 2024Updated last year
- Helps you build web pages quickly! Let there be no hard-to-build web page in the world!☆110Dec 5, 2025Updated 3 months ago
- Weakly Supervised Posture Mining with Reverse Cross-entropy for Fine-grained Classification☆27May 31, 2023Updated 2 years ago
- Grsql is a tool that lets you set up your remote SQLite database as a service and perform CRUD (create/read/update/delete) operations on it using gRPC.☆28Jul 8, 2022Updated 3 years ago
- NegCLIP.☆39Feb 6, 2023Updated 3 years ago