wuw2019/LoTLIP

Readme badge preview -

If you own this repo, copy the snippet below and add it to your README.md

[![RelatedRepos](https://img.shields.io/badge/related-repos-yellow)](https://relatedrepos.com/gh/wuw2019/LoTLIP)

wuw2019 / LoTLIP

[NeurIPS 2024] Official PyTorch implementation of LoTLIP: Improving Language-Image Pre-training for Long Text Understanding

☆49

Alternatives and similar repositories for LoTLIP

Users that are interested in LoTLIP are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.

Sorting:

ant-research / DreamLIP
View on GitHub
[ECCV 2024] Official PyTorch implementation of DreamLIP: Language-Image Pre-training with Long Captions
☆138May 8, 2025Updated last year
LuFan31 / CompreCap
View on GitHub
CVPR2025: Benchmarking Large Vision-Language Models via Directed Scene Graph for Comprehensive Image Captioning
☆39Mar 21, 2025Updated last year
ant-research / lumos
View on GitHub
[CVPR'25 - Rating 555] Official PyTorch implementation of Lumos: Learning Visual Generative Priors without Text
☆52Mar 16, 2025Updated last year
MIV-XJTU / FLAME
View on GitHub
[CVPR 2025] PyTorch implementation of paper "FLAME: Frozen Large Language Models Enable Data-Efficient Language-Image Pre-training"
☆33Jul 8, 2025Updated last year
lezhang7 / SAIL
View on GitHub
[CVPR 2025 Highlight] Official Pytorch codebase for paper: "Assessing and Learning Alignment of Unimodal Vision and Language Models"
☆60Aug 15, 2025Updated 11 months ago
Deploy on Railway without the complexity - Free Credits Offer • Ad
Connect your repo and Railway handles the rest with instant previews. Quickly provision container image services, databases, and storage volumes.
wuw2019 / R-AMT
View on GitHub
☆20Oct 19, 2023Updated 2 years ago
LuFan31 / ET-OOD
View on GitHub
CVPR2023:Uncertainty-Aware Optimal Transport for Semantically Coherent Out-of-Distribution Detection
☆26Mar 27, 2023Updated 3 years ago
ivonajdenkoska / tulip
View on GitHub
[ICLR 2025] Official code repository for "TULIP: Token-length Upgraded CLIP"
☆32Jan 26, 2026Updated 5 months ago
m1k2zoo / negbench
View on GitHub
Evaluation and dataset construction code for the CVPR 2025 paper "Vision-Language Models Do Not Understand Negation"
☆47Feb 26, 2026Updated 4 months ago
ExplainableML / flair
View on GitHub
[CVPR 2025] FLAIR: VLM with Fine-grained Language-informed Image Representations
☆147Mar 12, 2026Updated 4 months ago
callsys / DynRefer
View on GitHub
[CVPR 2025] DynRefer: Delving into Region-level Multimodal Tasks via Dynamic Resolution
☆59Mar 4, 2025Updated last year
iancovert / locality-alignment
View on GitHub
☆55Jan 17, 2025Updated last year
ytaek-oh / fsc-clip
View on GitHub
[EMNLP 2024] Preserving Multi-Modal Capabilities of Pre-trained VLMs for Improving Vision-Linguistic Compositionality
☆22Oct 8, 2024Updated last year
chs20 / fuselip
View on GitHub
FuseLIP: Multimodal Embeddings via Early Fusion of Discrete Tokens
☆17Sep 8, 2025Updated 10 months ago
Virtual machines for every use case on DigitalOcean • Ad
Get dependable uptime with 99.99% SLA, simple security tools, and predictable monthly pricing with DigitalOcean's virtual machines, called Droplets.
callsys / ControlCap
View on GitHub
[ECCV 2024] ControlCap: Controllable Region-level Captioning
☆81Oct 25, 2024Updated last year
deepglint / ALIP
View on GitHub
[ICCV 2023] ALIP: Adaptive Language-Image Pre-training with Synthetic Caption
☆106Sep 18, 2023Updated 2 years ago
OpenGVLab / VKnowU
View on GitHub
[ECCV 2026] VKnowU: Evaluating Visual Knowledge Understanding in Multimodal LLMs
☆15Feb 3, 2026Updated 5 months ago
deepglint / Victor
View on GitHub
ViCToR: Improving Visual Comprehension via Token Reconstruction for Pretraining LMMs
☆29Aug 15, 2025Updated 11 months ago
HanSolo9682 / CounterCurate
View on GitHub
This is the implementation of CounterCurate, the data curation pipeline of both physical and semantic counterfactual image-caption pairs.
☆19Jun 27, 2024Updated 2 years ago
wjpoom / SPEC
View on GitHub
[CVPR 2024] The official implementation of paper "synthesize, diagnose, and optimize: towards fine-grained vision-language understanding"
☆52Jun 16, 2025Updated last year
beichenzbc / Long-CLIP
View on GitHub
[ECCV 2024] official code for "Long-CLIP: Unlocking the Long-Text Capability of CLIP"
☆900Aug 13, 2024Updated last year
tkdguraa / WSOL_binarize
View on GitHub
official PyTorch implementation for "Discovering an inference recipe for weakly-supervised object localization"
☆17Aug 3, 2024Updated last year
hammoudhasan / SynthCLIP
View on GitHub
Code base of SynthCLIP: CLIP training with purely synthetic text-image pairs from LLMs and TTIs.
☆104Mar 23, 2025Updated last year
GPUs on demand by Runpod - Special Offer Available • Ad
Run AI, ML, and HPC workloads on powerful cloud GPUs—without limits or wasted spend. Deploy GPUs in under a minute and pay by the second.
mc-lan / ClearCLIP
View on GitHub
[ECCV2024] ClearCLIP: Decomposing CLIP Representations for Dense Vision-Language Inference
☆99Mar 26, 2025Updated last year
baaivision / DenseFusion
View on GitHub
DenseFusion-1M: Merging Vision Experts for Comprehensive Multimodal Perception
☆159Dec 6, 2024Updated last year
tiiuae / FineLIP
View on GitHub
code for FineLIP
☆43Nov 25, 2025Updated 7 months ago
Qinying-Liu / TagAlign
View on GitHub
Official implementation of TagAlign
☆37Dec 11, 2024Updated last year
rabiulcste / vismin
View on GitHub
[NeurIPS24] VisMin: Visual Minimal-Change Understanding
☆19Mar 3, 2025Updated last year
ExplainableML / cosmos
View on GitHub
[CVPR 2025] COSMOS: Cross-Modality Self-Distillation for Vision Language Pre-training
☆42Mar 27, 2025Updated last year
TIGER-AI-Lab / ABC
View on GitHub
ABC: Achieving Better Control of Multimodal Embeddings using VLMs [TMLR2025]
☆19Aug 21, 2025Updated 11 months ago
OpenCausaLab / CELLO
View on GitHub
☆22Nov 5, 2024Updated last year
brendel-group / clip-ood
View on GitHub
Official code for the paper "Does CLIP's Generalization Performance Mainly Stem from High Train-Test Similarity?" (ICLR 2024)
☆11Aug 26, 2024Updated last year
Wordpress hosting with auto-scaling - Free Trial Offer • Ad
Fully Managed hosting for WordPress and WooCommerce businesses that need reliable, auto-scalable performance. Cloudways SafeUpdates now available.
lezhang7 / Enhance-FineGrained
View on GitHub
[CVPR 2024] Contrasting Intra-Modal and Ranking Cross-Modal Hard Negatives to Enhance Visio-Linguistic Fine-grained Understanding
☆56Apr 7, 2025Updated last year
tripletclip / TripletCLIP
View on GitHub
[NeurIPS 2024] Official PyTorch implementation of "Improving Compositional Reasoning of CLIP via Synthetic Vision-Language Negatives"
☆48Dec 1, 2024Updated last year
lzw-lzw / UnifiedMLLM
View on GitHub
UnifiedMLLM: Enabling Unified Representation for Multi-modal Multi-tasks With Large Language Model
☆22Aug 5, 2024Updated last year
NMS05 / Patch-Aligned-Contrastive-Learning
View on GitHub
☆24Jul 8, 2023Updated 3 years ago
chunmeifeng / SPRC
View on GitHub
【ICLR 2024, Spotlight】Sentence-level Prompts Benefit Composed Image Retrieval
☆94Apr 16, 2024Updated 2 years ago
xuliu-cyber / RSUniVLM
View on GitHub
☆46Apr 16, 2026Updated 3 months ago
RAIVNLab / CREPE
View on GitHub
[CVPR23 Highlight] CREPE: Can Vision-Language Foundation Models Reason Compositionally?
☆35Apr 27, 2023Updated 3 years ago