NVlabs / PS3
Scaling Vision Pre-Training to 4K Resolution
☆218 Updated 3 weeks ago
Alternatives and similar repositories for PS3
Users who are interested in PS3 are comparing it to the libraries listed below.
- Official Implementation for our NeurIPS 2024 paper, "Don't Look Twice: Run-Length Tokenization for Faster Video Transformers". ☆234 Updated 9 months ago
- Code for "Scaling Language-Free Visual Representation Learning" paper (Web-SSL). ☆199 Updated 8 months ago
- Code for the Molmo2 Vision-Language Model ☆117 Updated last month
- Pixio: a capable vision encoder dedicated to dense prediction, simply by pixel reconstruction ☆339 Updated this week
- [arXiv: 2502.05178] QLIP: Text-Aligned Visual Tokenization Unifies Auto-Regressive Multimodal Understanding and Generation