hu-po / streamdocs
Documentation, notes, links, etc. for streams.
☆80Updated last year
Alternatives and similar repositories for streamdocs
Users who are interested in streamdocs are comparing it to the repositories listed below.
- A minimal implementation of LLaVA-style VLM with interleaved image & text & video processing ability.☆93Updated 6 months ago
- PyTorch Implementation of Object Recognition as Next Token Prediction [CVPR 2024 Highlight]☆180Updated last month
- ☆64Updated 2 months ago
- Official PyTorch implementation of TokenSet.☆121Updated 3 months ago
- Implementation of TiTok, proposed by Bytedance in "An Image is Worth 32 Tokens for Reconstruction and Generation"☆174Updated last year
- Implementation of a framework for Genie2 in PyTorch☆150Updated 5 months ago
- documentation for content creation☆203Updated last week
- ☆68Updated 11 months ago
- From scratch implementation of a vision language model in pure PyTorch☆225Updated last year
- Repository for the paper: "TiC-CLIP: Continual Training of CLIP Models".☆102Updated last year
- A Video Tokenizer Evaluation Dataset☆126Updated 5 months ago
- LoRA and DoRA from Scratch Implementations☆204Updated last year
- Simple large-scale training of stable diffusion with multi-node support.☆133Updated 2 years ago
- Implementation of Lumiere, SOTA text-to-video generation from Google DeepMind, in PyTorch☆277Updated 11 months ago
- Timm model explorer☆40Updated last year
- Implementation of Mind Evolution, Evolving Deeper LLM Thinking, from DeepMind☆53Updated 3 weeks ago
- My take on Flow Matching☆64Updated 5 months ago
- Block Transformer: Global-to-Local Language Modeling for Fast Inference (NeurIPS 2024)☆158Updated 2 months ago
- ViT inference in Triton, because why not?☆29Updated last year
- Official Implementation for our NeurIPS 2024 paper, "Don't Look Twice: Run-Length Tokenization for Faster Video Transformers".☆217Updated 2 months ago
- ClickDiffusion: Harnessing LLMs for Interactive Precise Image Editing☆69Updated last year
- This repo contains the code for the paper "Intuitive physics understanding emerges from self-supervised pretraining on natural videos"☆164Updated 4 months ago
- Unofficial implementation and experiments related to Set-of-Mark (SoM) 👁️☆86Updated last year
- Focused on fast experimentation and simplicity☆75Updated 6 months ago
- 🤩 An AWESOME Curated List of Papers, Workshops, Datasets, and Challenges from CVPR 2024☆142Updated last year
- Inference-time scaling of diffusion-based image and video generation models.☆151Updated 3 months ago
- ☆77Updated 9 months ago
- [CVPR 2024] VCoder: Versatile Vision Encoders for Multimodal Large Language Models☆278Updated last year
- NeuMeta transforms neural networks by allowing a single model to adapt on the fly to different sizes, generating the right weights when n…☆43Updated 7 months ago
- Projects based on SigLIP (Zhai et al., 2023) and Hugging Face transformers integration 🤗☆250Updated 4 months ago