lorenmt / clarity-template
Clarity: A Minimalist Website Template for AI Research
☆119 Updated 4 months ago
Alternatives and similar repositories for clarity-template
Users who are interested in clarity-template are comparing it to the repositories listed below:
- Adaptive Length Image Tokenization via Recurrent Allocation | How many tokens is an image worth? ☆120 Updated 3 months ago
- ElasticTok: Adaptive Tokenization for Image and Video ☆69 Updated 7 months ago
- ☆129 Updated 5 months ago
- Code release for paper "Test-Time Training Done Right" ☆103 Updated this week
- [ICML 2024] Compositional Image Decomposition with Diffusion Models ☆50 Updated 10 months ago
- Code for "Scaling Language-Free Visual Representation Learning" paper (Web-SSL). ☆129 Updated last month
- Code for MetaMorph Multimodal Understanding and Generation via Instruction Tuning ☆178 Updated last month
- Official PyTorch Implementation of "Diffusion Autoencoders are Scalable Image Tokenizers" ☆118 Updated 4 months ago
- FlexTok: Resampling Images into 1D Token Sequences of Flexible Length ☆143 Updated 2 weeks ago
- Benchmarking physical understanding in generative video models ☆168 Updated 2 weeks ago
- An ML research template with good documentation by Boyuan Chen, an MIT PhD student ☆72 Updated 3 months ago
- [Preprint] UCGM: Unified Continuous Generative Models ☆133 Updated last week
- This repo contains the official implementation of ICLR 2024 paper "Is ImageNet worth 1 video? Learning strong image encoders from 1 long … ☆88 Updated last year
- Code release for NeurIPS 2023 paper SlotDiffusion: Object-centric Learning with Diffusion Models ☆87 Updated last year
- Code and weights for the paper "Cluster and Predict Latents Patches for Improved Masked Image Modeling" ☆106 Updated last month
- ☆69 Updated 2 months ago
- [NeurIPS 2024] Stabilize the Latent Space for Image Autoregressive Modeling: A Unified Perspective ☆69 Updated 7 months ago
- [ICML 2025] Gaussian Mixture Flow Matching Models (GMFlow) ☆97 Updated last week
- [ICLR 2025] Official implementation and benchmark evaluation repository of <PhysBench: Benchmarking and Enhancing Vision-Language Models … ☆61 Updated 2 months ago
- This repo contains evaluation code for the paper "BLINK: Multimodal Large Language Models Can See but Not Perceive". https://arxiv.or… ☆125 Updated 11 months ago
- Implementation of TiTok, proposed by Bytedance in "An Image is Worth 32 Tokens for Reconstruction and Generation" ☆173 Updated 11 months ago
- Code for paper "Principal Components" Enable A New Language of Images ☆41 Updated last month
- [CVPR2025] PyTorch-based reimplementation of CrossFlow, as proposed in 'Flowing from Words to Pixels: A Noise-Free Framework for Cross-Mo… ☆188 Updated 2 months ago
- ☆163 Updated 5 months ago
- [ICLR 2025] Source code for paper "A Spark of Vision-Language Intelligence: 2-Dimensional Autoregressive Transformer for Efficient Finegr… ☆75 Updated 5 months ago
- [CVPR 2024] Probing the 3D Awareness of Visual Foundation Models ☆311 Updated 10 months ago
- Official repository of paper "Subobject-level Image Tokenization" ☆73 Updated 2 months ago
- Official Implementation for our NeurIPS 2024 paper, "Don't Look Twice: Run-Length Tokenization for Faster Video Transformers". ☆211 Updated 2 months ago
- [arXiv: 2502.05178] QLIP: Text-Aligned Visual Tokenization Unifies Auto-Regressive Multimodal Understanding and Generation ☆74 Updated 3 months ago
- [ICML 2025] Implementation of Spatial Reasoning with Denoising Models ☆37 Updated 2 weeks ago