hu-po / streamdocs
Documentation, notes, links, etc for streams.
☆83 Updated last year
Alternatives and similar repositories for streamdocs
Users who are interested in streamdocs are comparing it to the libraries listed below.
- documentation for content creation ☆219 Updated last week
- Projects based on SigLIP (Zhai et al., 2023) and Hugging Face transformers integration 🤗 ☆272 Updated 6 months ago
- Implementation of a framework for Genie2 in Pytorch ☆151 Updated 7 months ago
- Implementation of Lumiere, SOTA text-to-video generation from Google Deepmind, in Pytorch ☆278 Updated last year
- This is the repository for the Photorealistic Unreal Graphics (PUG) datasets for representation learning. ☆237 Updated last year
- ☆193 Updated last year
- ClickDiffusion: Harnessing LLMs for Interactive Precise Image Editing ☆69 Updated last year
- PyTorch Implementation of Object Recognition as Next Token Prediction [CVPR'24 Highlight] ☆180 Updated 4 months ago
- Implementation of TiTok, proposed by Bytedance in "An Image is Worth 32 Tokens for Reconstruction and Generation" ☆178 Updated last year
- ☆84 Updated 11 months ago
- [CVPR 2024] VCoder: Versatile Vision Encoders for Multimodal Large Language Models ☆278 Updated last year
- Repository for the paper: "TiC-CLIP: Continual Training of CLIP Models". ☆103 Updated last year
- Unofficial implementation and experiments related to Set-of-Mark (SoM) 👁️ ☆88 Updated last year
- NeuMeta transforms neural networks by allowing a single model to adapt on the fly to different sizes, generating the right weights when n… ☆43 Updated 9 months ago
- A minimal implementation of LLaVA-style VLM with interleaved image & text & video processing ability. ☆96 Updated 8 months ago
- Just another reasonably minimal repo for class-conditional training of pixel-space diffusion transformers. ☆122 Updated 3 months ago
- This repo contains the code for the paper "Intuitive physics understanding emerges from self-supervised pretraining on natural videos" ☆176 Updated 6 months ago
- Official PyTorch implementation of TokenSet. ☆121 Updated 5 months ago
- 🤩 An AWESOME Curated List of Papers, Workshops, Datasets, and Challenges from CVPR 2024 ☆144 Updated last year
- From scratch implementation of a vision language model in pure PyTorch ☆235 Updated last year
- Focused on fast experimentation and simplicity ☆75 Updated 8 months ago
- Implementation of the premier Text to Video model from OpenAI ☆56 Updated 9 months ago
- Data release for the ImageInWords (IIW) paper. ☆218 Updated 9 months ago
- The official repo for the paper "VeCLIP: Improving CLIP Training via Visual-enriched Captions" ☆246 Updated 7 months ago
- Mixture-of-Transformers: A Sparse and Scalable Architecture for Multi-Modal Foundation Models. TMLR 2025. ☆93 Updated 3 months ago
- ☆30 Updated 10 months ago
- Implementation of Mind Evolution, Evolving Deeper LLM Thinking, from Deepmind ☆56 Updated 3 months ago
- [ICCV 2025] OpenVision: A Fully-Open, Cost-Effective Family of Advanced Vision Encoders for Multimodal Learning ☆291 Updated 3 months ago
- ☆298 Updated 4 months ago
- Use Florence 2 to auto-label data for use in training fine-tuned object detection models. ☆67 Updated last year