hu-po / streamdocs
Documentation, notes, links, etc for streams.
★84 · Updated last year
Alternatives and similar repositories for streamdocs
Users who are interested in streamdocs are comparing it to the libraries listed below.
- documentation for content creation ★230 · Updated last month
- Projects based on SigLIP (Zhai et al., 2023) and Hugging Face transformers integration 🤗 ★287 · Updated 9 months ago
- Implementation of a framework for Genie2 in Pytorch ★153 · Updated 10 months ago
- Implementation of TiTok, proposed by Bytedance in "An Image is Worth 32 Tokens for Reconstruction and Generation" ★181 · Updated last year
- Implementation of Lumiere, SOTA text-to-video generation from Google Deepmind, in Pytorch ★281 · Updated last year
- Unofficial implementation and experiments related to Set-of-Mark (SoM) 👁️ ★88 · Updated 2 years ago
- Just another reasonably minimal repo for class-conditional training of pixel-space diffusion transformers. ★137 · Updated 6 months ago
- ★195 · Updated last year
- A minimal implementation of LLaVA-style VLM with interleaved image & text & video processing ability. ★96 · Updated 11 months ago
- [CVPR 2024] VCoder: Versatile Vision Encoders for Multimodal Large Language Models ★280 · Updated last year
- This is the repository for the Photorealistic Unreal Graphics (PUG) datasets for representation learning. ★237 · Updated last year
- ClickDiffusion: Harnessing LLMs for Interactive Precise Image Editing ★69 · Updated last year
- From scratch implementation of a vision language model in pure PyTorch ★251 · Updated last year
- Implementation of Mind Evolution, Evolving Deeper LLM Thinking, from Deepmind ★57 · Updated 6 months ago
- [NeurIPS 2024] Official Repository of The Mamba in the Llama: Distilling and Accelerating Hybrid Models ★231 · Updated last month
- A collection of tricks and tools to speed up transformer models ★189 · Updated last month
- PyTorch Implementation of Object Recognition as Next Token Prediction [CVPR'24 Highlight] ★181 · Updated 6 months ago
- This repo contains the code for the paper "Intuitive physics understanding emerges from self-supervised pretraining on natural videos" ★201 · Updated 9 months ago
- NeuMeta transforms neural networks by allowing a single model to adapt on the fly to different sizes, generating the right weights when n… ★43 · Updated last year
- [ICCV 2025] OpenVision: A Fully-Open, Cost-Effective Family of Advanced Vision Encoders for Multimodal Learning ★407 · Updated 2 months ago
- ★135 · Updated 2 months ago
- Implementation of the premier Text to Video model from OpenAI ★56 · Updated last year
- ★30 · Updated last year
- Block Transformer: Global-to-Local Language Modeling for Fast Inference (NeurIPS 2024) ★162 · Updated 7 months ago
- ★92 · Updated last year
- Official PyTorch implementation of TokenSet. ★127 · Updated 8 months ago
- The open source implementation of the model from "Scaling Vision Transformers to 22 Billion Parameters" ★31 · Updated last month
- RWKV is an RNN with transformer-level LLM performance. It can be directly trained like a GPT (parallelizable). So it's combining the best… ★54 · Updated 8 months ago
- The official repository for HyperZ·Z·W Operator Connects Slow-Fast Networks for Full Context Interaction. ★43 · Updated 7 months ago
- Data release for the ImageInWords (IIW) paper. ★223 · Updated last year