hu-po / streamdocs
Documentation, notes, links, etc for streams.
☆77Updated last year
Alternatives and similar repositories for streamdocs:
Users who are interested in streamdocs are comparing it to the repositories listed below
- Implementation of TiTok, proposed by Bytedance in "An Image is Worth 32 Tokens for Reconstruction and Generation"☆167Updated 8 months ago
- From scratch implementation of a vision language model in pure PyTorch☆200Updated 10 months ago
- Implementation of the premier Text to Video model from OpenAI☆57Updated 4 months ago
- A Video Tokenizer Evaluation Dataset☆104Updated last month
- Unofficial implementation and experiments related to Set-of-Mark (SoM) 👁️☆85Updated last year
- Official Implementation for our NeurIPS 2024 paper, "Don't Look Twice: Run-Length Tokenization for Faster Video Transformers".☆199Updated 3 months ago
- ☆68Updated 8 months ago
- [ICLR 2025] VILA-U: a Unified Foundation Model Integrating Visual Understanding and Generation☆243Updated last month
- Object Recognition as Next Token Prediction (CVPR 2024 Highlight)☆174Updated 2 months ago
- ☆67Updated 8 months ago
- Timm model explorer☆38Updated 11 months ago
- ☆70Updated 5 months ago
- A minimal implementation of LLaVA-style VLM with interleaved image & text & video processing ability.☆89Updated 2 months ago
- Implementation of Lumiere, SOTA text-to-video generation from Google Deepmind, in Pytorch☆269Updated 7 months ago
- Projects based on SigLIP (Zhai et al., 2023) and Hugging Face transformers integration 🤗☆216Updated 2 weeks ago
- Python Library to evaluate VLM models' robustness across diverse benchmarks☆194Updated last week
- ☆166Updated 4 months ago
- Data release for the ImageInWords (IIW) paper.☆208Updated 3 months ago
- A family of highly capable yet efficient large multimodal models☆177Updated 6 months ago
- Block Transformer: Global-to-Local Language Modeling for Fast Inference (NeurIPS 2024)☆149Updated 2 months ago
- ☆189Updated last year
- Implementation of a framework for Genie2 in Pytorch☆143Updated 2 months ago
- Recaption large (Web)Datasets with vllm and save the artifacts.☆47Updated 3 months ago
- Repository for the paper: "TiC-CLIP: Continual Training of CLIP Models".☆102Updated 9 months ago
- [NeurIPS 2024] Official Repository of The Mamba in the Llama: Distilling and Accelerating Hybrid Models☆204Updated last week
- NeuMeta transforms neural networks by allowing a single model to adapt on the fly to different sizes, generating the right weights when n…☆39Updated 4 months ago
- Just some miscellaneous utility functions / decorators / modules related to Pytorch and Accelerate to help speed up implementation of new…☆120Updated 7 months ago
- 🤩 An AWESOME Curated List of Papers, Workshops, Datasets, and Challenges from CVPR 2024☆143Updated 8 months ago