yuxiaochen1103/FDT


yuxiaochen1103 / FDT

☆60

Alternatives and similar repositories for FDT

Users who are interested in FDT are comparing it to the libraries listed below.

LzVv123456 / I2CL
☆41 · May 24, 2024 · Updated 2 years ago
LzVv123456 / Contrastive-Prototypical-Prompt
☆20 · Mar 12, 2025 · Updated last year
statho / animals3d
Animals3D: Learning Articulated Shape with Keypoint Pseudo-labels from Web Images (CVPR 2023)
☆15 · May 20, 2024 · Updated 2 years ago
YCaigogogo / CODER
☆22 · Apr 27, 2024 · Updated 2 years ago
adobe-research / llava-score
☆11 · Oct 2, 2024 · Updated last year
cv516Buaa / OV-VG
☆31 · Mar 25, 2024 · Updated 2 years ago
dfki-av / MiKASA-3DVG
[CVPR'24] MiKASA: Multi-Key-Anchor & Scene-Aware Transformer for 3D Visual Grounding
☆18 · Dec 13, 2024 · Updated last year
StanfordMIMI / villa
[ICCV 2023] ViLLA: Fine-grained vision-language representation learning from real-world data
☆45 · Oct 15, 2023 · Updated 2 years ago
MCR-PEFT / Ex-MCR
☆44 · Updated this week
jaeseokbyun / GRIT-VLP
This is an official implementation of GRIT-VLP
☆20 · Aug 8, 2022 · Updated 3 years ago
Ruxie189 / WSS_POLE
This is the official implementation of our PrOmpt cLass lEarning (POLE).
☆12 · Jan 21, 2024 · Updated 2 years ago
taolinzhang / 3DVLP
[AAAI2024] An official pytorch implement of the paper: Vision-Language Pre-training with Object Contrastive Learning for 3D Scene Underst…
☆13 · Dec 8, 2024 · Updated last year
SalesforceAIResearch / strefer
Strefer: Empowering Video LLMs with Space-Time Referring and Reasoning via Synthetic Instruction Data
☆19 · Jun 2, 2026 · Updated last month
SivanDoveh / DAC
Repository for the paper: dense and aligned captions (dac) promote compositional reasoning in vl models
☆28 · Nov 29, 2023 · Updated 2 years ago
LaVi-Lab / Visual-Table
[EMNLP 2024] Official code for "Beyond Embeddings: The Promise of Visual Table in Multi-Modal Models"
☆20 · Oct 17, 2024 · Updated last year
vinid / neg_clip
NegCLIP.
☆41 · Feb 6, 2023 · Updated 3 years ago
facebookresearch / clip-rocket
Code release for "Improved baselines for vision-language pre-training"
☆63 · May 6, 2024 · Updated 2 years ago
jeykigung / HiCLIP
☆31 · Mar 2, 2023 · Updated 3 years ago
deepglint / ALIP
[ICCV 2023] ALIP: Adaptive Language-Image Pre-training with Synthetic Caption
☆106 · Sep 18, 2023 · Updated 2 years ago
PytorchConnectomics / em_util
toolbox utility functions
☆10 · Jun 8, 2026 · Updated last month
MCR-PEFT / C-MCR
☆44 · Updated this week
wuw2019 / LoTLIP
[NeurIPS 2024] Official PyTorch implementation of LoTLIP: Improving Language-Image Pre-training for Long Text Understanding
☆49 · Jan 14, 2025 · Updated last year
eslambakr / CoT3D_VG
Chain_of_Thoughts_3D_Visual_Grounding
☆21 · Apr 20, 2024 · Updated 2 years ago
linzhiqiu / cross_modal_adaptation
Cross-modal few-shot adaptation with CLIP
☆352 · Apr 29, 2025 · Updated last year
ArrowLuo / VideoFeatureExtractor
Video Feature Extractor for S3D-HowTo100M
☆29 · Apr 30, 2021 · Updated 5 years ago
htyao89 / Textual-based_Class-aware_prompt_tuning
☆33 · Mar 7, 2024 · Updated 2 years ago
DavidMChan / clair
CLAIR: A (surprisingly) simple semantic text metric with large language models.
☆22 · Jan 28, 2024 · Updated 2 years ago
wengzejia1 / Open-VCLIP
☆119 · Feb 19, 2024 · Updated 2 years ago
jusiro / CLAP
[CVPR'24] Validation-free few-shot adaptation of CLIP, using a well-initialized Linear Probe (ZSLP) and class-adaptive constraints (CLAP)…
☆84 · Jun 7, 2025 · Updated last year
mightyzau / RegionBLIP
☆59 · Aug 7, 2023 · Updated 2 years ago
LHL3341 / ContextBLIP
ContextBLIP: Doubly Contextual Alignment for Contrastive Image Retrieval from Linguistically Complex Descriptions [ACL 2024]
☆11 · May 17, 2024 · Updated 2 years ago
naver-ai / augsub
[CVPR 2025] Official PyTorch implementation of MaskSub "Masking meets Supervision: A Strong Learning Alliance"
☆46 · Mar 25, 2025 · Updated last year
lancopku / clip-openness
[ACL 2023] Delving into the Openness of CLIP
☆24 · Jan 11, 2023 · Updated 3 years ago
SHTUPLUS / Pix2Grp_CVPR2024
☆71 · Nov 7, 2024 · Updated last year
haojinw0027 / MedFrameQA
MedFrameQA: A Multi-Image Medical VQA Benchmark for Clinical Reasoning
☆18 · Jun 6, 2025 · Updated last year
tingyu215 / TS-LLaVA
TS-LLaVA: Constructing Visual Tokens through Thumbnail-and-Sampling for Training-Free Video Large Language Models
☆17 · Jan 2, 2025 · Updated last year
liunian-harold-li / DesCo
☆30 · Mar 13, 2024 · Updated 2 years ago
soCzech / MultiTaskObjectStates
Code for the paper "Multi-Task Learning of Object States and State-Modifying Actions from Web Videos" published in TPAMI
☆11 · Mar 3, 2024 · Updated 2 years ago
mrwu-mac / R-Bench
[ICML2024] Repo for the paper "Evaluating and Analyzing Relationship Hallucinations in Large Vision-Language Models"
☆24 · Jan 1, 2025 · Updated last year