Ali2500/ViCaS

ViCaS: A Dataset for Combining Holistic and Pixel-level Video Understanding using Captions with Grounded Segmentation (CVPR'25)

☆21

Alternatives and similar repositories for ViCaS

Users who are interested in ViCaS are comparing it to the repositories listed below.

dlsrbgg33 / Video-3DGS
☆28 • Apr 4, 2025 • Updated last year
ByteDance-Seed / DeepFlow
[ICCV 2025] Deeply Supervised Flow-Based Generative Models
☆38 • Jun 26, 2025 • Updated last year
TACJu / Axial-VS
This repo contains the code for our TMLR paper: A Simple Video Segmenter by Tracking Objects Along Axial Trajectories
☆27 • Mar 20, 2025 • Updated last year
qihao067 / DiMR
[NeurIPS 24] Alleviating Distortion in Image Generation via Multi-Resolution Diffusion Models
☆44 • Sep 30, 2024 • Updated last year
OliverRensu / FreqFlow
The official implementation of "Frequency-Aware Flow Matching for High-Quality Image Generation"
☆29 • Apr 20, 2026 • Updated 3 months ago
amazon-far / BAR
[ICML 2026] code & model for arxiv paper "Autoregressive Image Generation with Masked Bit Modeling"
☆59 • May 1, 2026 • Updated 2 months ago
markweberdev / maskbit
Implementation of the paper "MaskBit: Embedding-free Image Generation from Bit Tokens"
☆94 • Apr 10, 2025 • Updated last year
kumuji / Sa2VA-i
Sa2VA-i is an improved version of the popular Sa2VA model
☆16 • Nov 25, 2025 • Updated 7 months ago
Hansxsourse / VRMDiff
☆11 • Mar 11, 2025 • Updated last year
Ali2500 / TarViS
☆50 • Jul 5, 2023 • Updated 3 years ago
tue-mps / simple-tad
[ICCVW 2025] Simplifying Traffic Anomaly Detection with Video Foundation Models
☆18 • Dec 4, 2025 • Updated 7 months ago
TACJu / FlowTok
PyTorch re-implementation of FlowTok: Flowing Seamlessly Across Text and Image Tokens
☆17 • Nov 26, 2025 • Updated 7 months ago
OliverRensu / GRAT
This repository includes the official implementation of our paper "Grouping First, Attending Smartly: Training-Free Acceleration for Diff…
☆56 • May 21, 2025 • Updated last year
YangLiu14 / Open-World-Tracking
Official code for "Opening up Open World Tracking" (CVPR 2022)
☆56 • Apr 8, 2023 • Updated 3 years ago
bytedance / OmniScient-Model
This repo contains the code for our paper Towards Open-Ended Visual Recognition with Large Language Model
☆102 • Jul 15, 2024 • Updated 2 years ago
Schmiddo / d2conv3d
D²Conv3D: Dynamic Dilated Convolutions for Object Segmentation in Videos
☆31 • Aug 2, 2022 • Updated 3 years ago
congvvc / InstructSeg
[ICCV 2025] Official implementation of "InstructSeg: Unifying Instructed Visual Segmentation with Multi-modal Large Language Models"
☆56 • Feb 10, 2025 • Updated last year
MrGiovanni / LabelAssemble
[ISBI 2023] Official Implementation for Label-Assemble
☆20 • Jul 30, 2024 • Updated last year
dengandong / GroundMoRe
☆18 • May 18, 2026 • Updated 2 months ago
cilinyan / ReVOS-api
[ECCV24] VISA: Reasoning Video Object Segmentation via Large Language Model
☆22 • Jul 20, 2024 • Updated 2 years ago
ytaek-oh / vl_compo
☆10 • Jul 5, 2024 • Updated 2 years ago
RobertLuo1 / iccv2023_RVOS_Challenge
[ICCV 2023 Workshop] The Official Implementation of The First Prize Solution for RVOS Competition
☆14 • Jan 1, 2024 • Updated 2 years ago
sunye23 / SAMA
[NeurIPS 2025] SAMA: Towards Multi-Turn Referential Grounded Video Chat with Large Language Models.
☆17 • May 26, 2026 • Updated last month
LarsKreuzberg / 4D-StOP
Official code for "4D-StOP: Panoptic Segmentation of 4D LiDAR using Spatio-temporal Object Proposal Generation and Aggregation"
☆28 • Nov 15, 2022 • Updated 3 years ago
yucornetto / GG-Transformer
Code and models for the paper Glance-and-Gaze Vision Transformer
☆28 • Jun 7, 2021 • Updated 5 years ago
sapeirone / EgoPack
Official implementation of "A Backpack Full of Skills: Egocentric Video Understanding with Diverse Task Perspectives", accepted at CVPR 2…
☆24 • Jun 13, 2024 • Updated 2 years ago
kumuji / stu_dataset
[CVPR 2025] Spotting the Unexpected (STU): A 3D LiDAR Dataset for Anomaly Segmentation in Autonomous Driving
☆47 • Nov 20, 2025 • Updated 8 months ago
wanglu-cs / Think_While_Watching
☆19 • Jun 26, 2026 • Updated 3 weeks ago
mbzuai-oryx / VideoMolmo
Official code of the paper "VideoMolmo: Spatio-Temporal Grounding meets Pointing"
☆56 • Jul 5, 2025 • Updated last year
HumanMLLM / LOVE-R1
Official repository of paper "LOVE-R1: Advancing Long Video Understanding with Adaptive Zoom-in Mechanism via Multi-Step Reasoning"
☆24 • Nov 1, 2025 • Updated 8 months ago
tue-mps / algm-segmenter
ALGM applied to Segmenter
☆33 • May 27, 2024 • Updated 2 years ago
wdrink / ARM
ARM: An AutoRegressive Large Multimodal Model with Discrete Representations
☆50 • Jun 10, 2026 • Updated last month
rkzheng99 / ViLLa
Video Reasoning Segmentation
☆26 • Nov 29, 2024 • Updated last year
bytedance / coconut_cvpr2024
☆206 • Apr 21, 2026 • Updated 3 months ago
jzh15 / SpatialStack
[CVPR 2026] SpatialStack: Layered Geometry-Language Fusion for 3D VLM Spatial Reasoning
☆31 • Jul 15, 2026 • Updated last week
FocoosAI / papers
List of papers written by the Focoos AI research team!
☆12 • Jun 3, 2025 • Updated last year
nishantrai18 / cocon
CoCon: Cooperative Contrastive Learning
☆20 • Nov 5, 2022 • Updated 3 years ago
cloneofsimo / imagenet.int8
☆40 • Apr 27, 2024 • Updated 2 years ago
FudanCVL / SAM-MT
[ECCV 2026] Real-Time Interactive Multi-Target Video Segmentation
☆52 • Jul 10, 2026 • Updated last week