AI-Application-and-Integration-Lab / PDSeg
[ICASSP 2025] PDSeg: Patch-Wise Distillation and Controllable Image Generation for Weakly-Supervised Histopathology Tissue Segmentation
☆16Updated 2 weeks ago
Alternatives and similar repositories for PDSeg
Users who are interested in PDSeg are comparing it to the repositories listed below.
- [ICIP 2024] Open-Vocabulary Panoptic Segmentation Using BERT Pre-Training of Vision-Language Multiway Transformer Model☆16Updated 2 weeks ago
- ☆13Updated 2 years ago
- Scene-Text-Detection-And-Recognition-Model_M504☆25Updated 9 months ago
- ☆13Updated last year
- A Survey on Multimodal Retrieval-Augmented Generation☆211Updated this week
- 📖 A curated list of resources dedicated to hallucination of multimodal large language models (MLLM).☆709Updated last month
- [NAACL 2024] MMC: Advancing Multimodal Chart Understanding with LLM Instruction Tuning☆96Updated 5 months ago
- Codes and Datasets for the Paper: Text-Tuple-Table: Towards Information Integration in Text-to-Table Generation via Global Tuple Extracti…☆12Updated last year
- ☆65Updated last year
- Generative Fusion Decoding (GFD) is a novel framework for integrating Large Language Models (LLMs) into multi-modal text recognition syst…☆82Updated last week
- A Traditional-Chinese instruction-following model with datasets based on Alpaca.☆137Updated 2 years ago
- A distributed training framework for large language models powered by Lightning.☆22Updated 2 months ago
- [CVPR 2024 Highlight] OPERA: Alleviating Hallucination in Multi-Modal Large Language Models via Over-Trust Penalty and Retrospection-Allo…☆341Updated 9 months ago
- 🔥🔥MLVU: Multi-task Long Video Understanding Benchmark☆201Updated last week
- Document Artificial Intelligence☆170Updated last month
- ☆201Updated last month
- EMNLP 2023 Papers: Explore cutting-edge research from EMNLP 2023, the premier conference for advancing empirical methods in natural langu…☆107Updated last year
- An open-source implementation for fine-tuning Qwen2-VL and Qwen2.5-VL series by Alibaba Cloud.☆784Updated this week
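- MediaTek Research (聯發創新基地) is dedicated to research on foundation models. We apply our research to models suited to Traditional Chinese users and, where licensing permits, make the models available for academic research or industry use.☆234Updated 3 months ago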
- An open-source implementation for fine-tuning Pixtral by MistralAI.☆16Updated 4 months ago
- Official code for paper "UniIR: Training and Benchmarking Universal Multimodal Information Retrievers" (ECCV 2024)☆151Updated 8 months ago
- Deep Learning for Computer Vision by Frank Wang (王鈺強)☆21Updated 11 months ago
- A curated list of recent and past chart understanding work based on our IEEE TKDE survey paper: From Pixels to Insights: A Survey on Auto…☆205Updated last month
- [ICML2024 (Oral)] Official PyTorch implementation of DoRA: Weight-Decomposed Low-Rank Adaptation☆791Updated 8 months ago
- A minimal codebase for finetuning large multimodal models, supporting llava-1.5/1.6, llava-interleave, llava-next-video, llava-onevision,…☆297Updated 3 months ago
- 😎 A curated list of awesome LMM hallucinations papers, methods & resources.☆149Updated last year
- A Simple but Powerful SOTA NER Model | Official Code For Label Supervised LLaMA Finetuning☆155Updated last year
- ☆362Updated 3 months ago
- Official implementation of our LREC-COLING 2024 paper "Generative Multimodal Entity Linking".☆34Updated 3 months ago
- Visualizing the attention of vision-language models☆181Updated 3 months ago