yayafengzi/ALToLLM

Readme badge preview -

If you own this repo, copy the snippet below and add it to your README.md

[![RelatedRepos](https://img.shields.io/badge/related-repos-yellow)](https://relatedrepos.com/gh/yayafengzi/ALToLLM)

yayafengzi / ALToLLM

ALTo: Adaptive-Length Tokenizer for Autoregressive Mask Generation

☆30

Alternatives and similar repositories for ALToLLM

Users that are interested in ALToLLM are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.

Sorting:

yayafengzi / LMM-HiMTok
View on GitHub
HiMTok: Learning Hierarchical Mask Tokens for Image Segmentation with Large Multimodal Model
☆97Jul 17, 2025Updated last year
songw-zju / PixelThink
View on GitHub
The official implementation of "PixelThink: Towards Efficient Chain-of-Pixel Reasoning" (ICML 2026)
☆43Jul 4, 2026Updated 2 weeks ago
zhouyiks / CoLVA
View on GitHub
☆44Jul 9, 2025Updated last year
SilentView / EMCID
View on GitHub
Official Implementation for "Editing Massive Concepts in Text-to-Image Diffusion Models"
☆19Mar 21, 2024Updated 2 years ago
HKUST-LongGroup / STAMP
View on GitHub
[CVPR 2026] STAMP: Better, Stronger, Faster: Tackling the Trilemma in MLLM-based Segmentation with Simultaneous Textual Mask Prediction
☆39Feb 21, 2026Updated 5 months ago
Bare Metal GPUs on DigitalOcean Gradient AI • Ad
Purpose-built for serious AI teams training foundational models, running large-scale inference, and pushing the boundaries of what's possible.
rui-qian / READ
View on GitHub
Rui Qian, Xin Yin, Dejing Dou†: Reasoning to Attend: Try to Understand How <SEG> Token Works (CVPR 2025)
☆54Feb 4, 2026Updated 5 months ago
JIA-Lab-research / VisionReasoner
View on GitHub
[ICLR 2026] VisionReasoner: Unified Reasoning-Integrated Visual Perception via Reinforcement Learning
☆348Feb 9, 2026Updated 5 months ago
eVI-group-SCU / Dr-Seg
View on GitHub
[CVPR'26] Dr. Seg: Revisiting GRPO Training for Visual Large Language Models through Perception-Oriented Design
☆31Mar 7, 2026Updated 4 months ago
hustvl / GroundingSuite
View on GitHub
[ICCV 2025] GroundingSuite: Measuring Complex Multi-Granular Pixel Grounding
☆77Jun 26, 2025Updated last year
jdg900 / MMR
View on GitHub
[ICLR 2025] Official Pytorch Implementation of MMR: A Large-scale Benchmark Dataset for Multi-target and Multi-granularity Reasoning Segm…
☆28Apr 3, 2025Updated last year
KaiyueSun98 / T2I-Personalization-with-AR
View on GitHub
☆47Apr 20, 2025Updated last year
JIA-Lab-research / Seg-Zero
View on GitHub
Project Page For "Seg-Zero: Reasoning-Chain Guided Segmentation via Cognitive Reinforcement"
☆634Jan 17, 2026Updated 6 months ago
congvvc / HyperSeg
View on GitHub
[CVPR2025] Project for "HyperSeg: Towards Universal Visual Segmentation with Large Language Model".
☆182Dec 13, 2024Updated last year
MengLcool / SEGIC
View on GitHub
[ECCV-24] This is the official implementation of the paper "SEGIC: Unleashing the Emergent Correspondence for In-Context Segmentation".
☆27Oct 13, 2024Updated last year
Managed Kubernetes at scale on DigitalOcean • Ad
DigitalOcean Kubernetes includes the control plane, bandwidth allowance, container registry, automatic updates, and more for free.
lxtGH / DenseWorld-1M
View on GitHub
Code and dataset link for "DenseWorld-1M: Towards Detailed Dense Grounded Caption in the Real World"
☆129Oct 2, 2025Updated 9 months ago
rui-qian / UGround
View on GitHub
Rui Qian, Xin Yin, Chuanhang Deng, et al.: UGround: Towards Unified Visual Grounding with Unrolled Transformers (ICML 2026)
☆29Jun 18, 2026Updated last month
congvvc / InstructSeg
View on GitHub
[ICCV 2025] Official implementation of "InstructSeg: Unifying Instructed Visual Segmentation with Multi-modal Large Language Models"
☆56Feb 10, 2025Updated last year
Hanzy1996 / OpenSeg-R
View on GitHub
OpenSeg-R: Improving Open-Vocabulary Segmentation via Step-by-Step Visual Reasoning
☆29May 24, 2025Updated last year
hustvl / LENS
View on GitHub
[AAAI 2026 Oral] LENS: Learning to Segment Anything with Unified Reinforced Reasoning
☆136Dec 3, 2025Updated 7 months ago
nnnth / UFO
View on GitHub
[NeurIPS2025 Spotlight 🔥 ] Official implementation of 🛸 "UFO: A Unified Approach to Fine-grained Visual Perception via Open-ended Langu…
☆280Nov 5, 2025Updated 8 months ago
HKUST-LongGroup / DyME
View on GitHub
[ICLR 2026] Empowering Small VLMs to Think with Dynamic Memorization and Exploration
☆18Mar 18, 2026Updated 4 months ago
HuiGuanLab / RaTSG
View on GitHub
This is a repository contains the implementation of our NeurIPS'24 paper "Temporal Sentence Grounding with Relevance Feedback in Videos"
☆13Aug 22, 2025Updated 11 months ago
SkyworkAI / DAQ-VS
View on GitHub
Code For Our Work: DVIS-DAQ: Improving Video Segmentation via Dynamic Anchor Queries [ECCV-2024]
☆15Jul 11, 2024Updated 2 years ago
Deploy to Railway using AI coding agents - Free Credits Offer • Ad
Use Claude Code, Codex, OpenCode, and more. Autonomous software development now has the infrastructure to match with Railway.
geshang777 / Seg-R1
View on GitHub
[NeurIPS-W 2025] Official Implementation of "Seg-R1: Segmentation Can Be Surprisingly Simple with Reinforcement Learning"
☆72Jul 1, 2025Updated last year
xuliu-cyber / RSUniVLM
View on GitHub
☆46Apr 16, 2026Updated 3 months ago
ml-research / deictic-segment-anything
View on GitHub
Segment Anything with Deictic Prompting
☆27May 13, 2025Updated last year
zjucsq / PLA
View on GitHub
[ICLR2023] Video Scene Graph Generation from Single-Frame Weak Supervision
☆12Sep 17, 2023Updated 2 years ago
KaiyueSun98 / T2I-ReasonBench
View on GitHub
T2I-ReasonBench: Benchmarking Reasoning-Informed Text-to-Image Generation
☆37Sep 16, 2025Updated 10 months ago
cvlab-kaist / Seg4Diff
View on GitHub
Official implementation of "Seg4Diff: Unveiling Open-Vocabulary Segmentation in Text-to-Image Diffusion Transformers" (NeurIPS 2025)
☆78Sep 23, 2025Updated 10 months ago
MaverickRen / PixelLM
View on GitHub
[CVPR 2024] PixelLM is an effective and efficient LMM for pixel-level reasoning and understanding.
☆273Feb 11, 2025Updated last year
lisat-bair / LISAt_code
View on GitHub
☆30Sep 2, 2025Updated 10 months ago
godx-7 / DeH4R
View on GitHub
The official repo of the paper titled DeH4R: A Decoupled and Hybrid Method for Road Network Graph Extraction.
☆23May 25, 2026Updated last month
Deploy to Railway using AI coding agents - Free Credits Offer • Ad
Use Claude Code, Codex, OpenCode, and more. Autonomous software development now has the infrastructure to match with Railway.
berkeley-hipie / segllm
View on GitHub
Code release for "SegLLM: Multi-round Reasoning Segmentation"
☆129Feb 20, 2025Updated last year
zhang-tao-whu / P2PFormer
View on GitHub
☆32Dec 3, 2024Updated last year
letitiabanana / PnP-OVSS
View on GitHub
[CVPR'24] Code for Emergent Open-Vocabulary Semantic Segmentation from Off-the-shelf Vision-Language Models
☆18Jul 22, 2024Updated 2 years ago
mc-lan / Text4Seg
View on GitHub
[ICLR2025] Text4Seg: Reimagining Image Segmentation as Text Generation
☆176Nov 8, 2025Updated 8 months ago
xushilin1 / dst-det
View on GitHub
[TCSVT] state-of-the-art open vocabulary detector on COCO/LVIS/V3Det
☆35Jun 3, 2025Updated last year
ydhongHIT / PlainSeg
View on GitHub
simple and efficient baselines for practical semantic segmentation with plain ViTs
☆20Mar 9, 2024Updated 2 years ago
MediaBrain-SJTU / MoLA
View on GitHub
☆21Jul 19, 2024Updated 2 years ago