OliverRensu/D-iGPT

[ICML 2024] This repository includes the official implementation of our paper "Rejuvenating image-GPT as Strong Visual Representation Learners"

☆99

Alternatives and similar repositories for D-iGPT

Users who are interested in D-iGPT are comparing it to the libraries listed below.

lambert-x / ProLab
Official Pytorch Implementation of Paper "A Semantic Space is Worth 256 Language Descriptions: Make Stronger Segmentation Models with Des…
☆55 · Aug 27, 2025 · Updated 10 months ago
OliverRensu / ARM
[ICLR 2025] This repository is the official implementation of our paper "Autoregressive Pretraining with Mamba in Vision"
☆91 · May 30, 2025 · Updated last year
TACJu / Axial-VS
This repo contains the code for our TMLR paper: A Simple Video Segmenter by Tracking Objects Along Axial Trajectories
☆27 · Mar 20, 2025 · Updated last year
UCSC-VLAA / MixCon3D
[CVPR 2024] The official implementation of paper "Sculpting Holistic 3D Representation in Contrastive Language-Image-3D Pre-training"
☆35 · Apr 21, 2024 · Updated 2 years ago
ml-jku / MIM-Refiner
A Contrastive Learning Boost from Intermediate Pre-Trained Representations
☆44 · Sep 19, 2024 · Updated last year
wufeim / LychSim
A controllable and interactive simulation framework for vision research.
☆16 · May 25, 2026 · Updated last month
UCSC-VLAA / MedVLThinker
[ML4H'25] MedVLThinker: Simple Baselines for Multimodal Medical Reasoning
☆59 · Dec 21, 2025 · Updated 7 months ago
lambert-x / CateNorm
The official implementation of "CateNorm: Categorical Normalization for Robust Medical Image Segmentation"
☆32 · Sep 30, 2022 · Updated 3 years ago
haojinw0027 / MedFrameQA
MedFrameQA: A Multi-Image Medical VQA Benchmark for Clinical Reasoning
☆18 · Jun 6, 2025 · Updated last year
OliverRensu / GRAT
This repository includes the official implementation of our paper "Grouping First, Attending Smartly: Training-Free Acceleration for Diff…
☆56 · May 21, 2025 · Updated last year
amirbar / StoP
☆12 · Jun 26, 2024 · Updated 2 years ago
Beckschen / ViTamin
[CVPR 2024] Official implementation of "ViTamin: Designing Scalable Vision Models in the Vision-language Era"
☆211 · Jun 9, 2024 · Updated 2 years ago
MrGiovanni / RT-Super
[MICCAI 2026] A longitudinal, multimodal algorithm for multi-tumor segmentation (learning from reports).
☆15 · Jun 29, 2026 · Updated 3 weeks ago
UCSC-VLAA / MedVLSynther
[ICLR'26] MedVLSynther: Synthesizing High-Quality Visual Question Answering from Medical Documents with Generator-Verifier LMMs
☆19 · Nov 1, 2025 · Updated 8 months ago
aijinrjinr / MLB-Seg
☆14 · Jul 2, 2024 · Updated 2 years ago
UCSC-VLAA / EVP
[TMLR'24] This repository includes the official implementation of our paper "Unleashing the Power of Visual Prompting At the Pixel Level"
☆42 · Apr 30, 2024 · Updated 2 years ago
elad-amrani / xtra
PyTorch implementation of "Sample- and Parameter-Efficient Auto-Regressive Image Models" from CVPR 2025
☆14 · Nov 21, 2025 · Updated 8 months ago
Tiezheng11 / Vision-Language-Vision
☆65 · Jul 11, 2025 · Updated last year
Shark0-0 / VG4D
Implementation of the paper: VG4D: Vision-Language Model Goes 4D Video Recognition (ICRA 2024)
☆15 · Apr 23, 2024 · Updated 2 years ago
apple / ml-aim
This repository provides the code and model checkpoints for AIMv1 and AIMv2 research projects.
☆1,425 · Aug 4, 2025 · Updated 11 months ago
UCSC-VLAA / o1_medical
☆48 · Feb 26, 2025 · Updated last year
MrGiovanni / LabelAssemble
[ISBI 2023] Official Implementation for Label-Assemble
☆20 · Jul 30, 2024 · Updated last year
wangf3014 / Adventurer
☆29 · Feb 27, 2025 · Updated last year
facebookresearch / long_seq_mae
Code release for the research paper "Exploring Long-Sequence Masked Autoencoders"
☆100 · Oct 14, 2022 · Updated 3 years ago
Meituan-AutoML / VisionLLaMA
VisionLLaMA: A Unified LLaMA Backbone for Vision Tasks
☆392 · Jul 9, 2024 · Updated 2 years ago
OliverRensu / FlowAR
"FlowAR: Scale-wise Autoregressive Image Generation Meets Flow Matching" FlowAR employs a simplest scale design and is compatible with an…
☆171 · May 1, 2025 · Updated last year
bytedance / ibot
iBOT: Image BERT Pre-Training with Online Tokenizer (ICLR 2022)
☆776 · Apr 14, 2022 · Updated 4 years ago
UCSC-VLAA / ClinSeekAgent
☆28 · Jun 1, 2026 · Updated last month
UCSC-VLAA / Image-Pretraining-for-Video
[ECCV 2022] This repository includes the official implementation of our paper "In Defense of Image Pre-Training for Spatiotemporal Recogniti…
☆19 · Dec 22, 2022 · Updated 3 years ago
Huage001 / StyDeSty
PyTorch implementation of the ICML 2024 paper "StyDeSty: Min-Max Stylization and Destylization for Single Domain Generalization".
☆16 · Jun 4, 2024 · Updated 2 years ago
showlab / cosmo
☆75 · May 10, 2024 · Updated 2 years ago
AutoMedBench / AutoMedBench
MedAutoBench — Medical AutoResearch Benchmark for Autonomous AI Agents
☆55 · Jul 9, 2026 · Updated last week
zhuangshaobin / WeTok
[ICLR 2026] WeTok: Powerful Discrete Tokenization for High-Fidelity Visual Reconstruction
☆69 · Sep 3, 2025 · Updated 10 months ago
wilson1yan / VideoGPT-Paper
☆18 · Apr 15, 2021 · Updated 5 years ago
microsoft / x-reasoner
X-Reasoner: Towards Generalizable Reasoning Across Modalities and Domains
☆49 · Feb 4, 2026 · Updated 5 months ago
TencentARC / SEED-Voken
SEED-Voken: A Series of Powerful Visual Tokenizers
☆1,017 · Nov 25, 2025 · Updated 7 months ago
MrGiovanni / CARE
[NeurIPS 2025] Completeness-Aware Reconstruction Enhancement
☆37 · Oct 18, 2025 · Updated 9 months ago
UCSC-VLAA / CRATE-alpha
This repository includes the official implementation of our paper "Scaling White-Box Transformers for Vision"
☆47 · Jun 3, 2024 · Updated 2 years ago
sail-sg / ScaleLong
The official repository of paper "ScaleLong: Towards More Stable Training of Diffusion Model via Scaling Network Long Skip Connection" (N…
☆50 · Oct 23, 2023 · Updated 2 years ago