hu-po / streamdocs
Documentation, notes, links, etc for streams.
☆84 Updated last year
Alternatives and similar repositories for streamdocs
Users who are interested in streamdocs are comparing it to the repositories listed below.
- documentation for content creation ☆234 Updated 3 months ago
- Projects based on SigLIP (Zhai et al., 2023) and Hugging Face transformers integration 🤗 ☆297 Updated 10 months ago
- Implementation of a framework for Genie2 in Pytorch ☆156 Updated last year
- Implementation of TiTok, proposed by Bytedance in "An Image is Worth 32 Tokens for Reconstruction and Generation" ☆183 Updated last year
- Implementation of Lumiere, SOTA text-to-video generation from Google Deepmind, in Pytorch ☆281 Updated last year
- Just another reasonably minimal repo for class-conditional training of pixel-space diffusion transformers. ☆141 Updated 7 months ago
- ClickDiffusion: Harnessing LLMs for Interactive Precise Image Editing ☆69 Updated last year
- Quick implementation of nGPT, learning entirely on the hypersphere, from NvidiaAI ☆294 Updated 7 months ago
- From scratch implementation of a vision language model in pure PyTorch ☆252 Updated last year
- [CVPR 2024] VCoder: Versatile Vision Encoders for Multimodal Large Language Models ☆280 Updated last year
- Focused on fast experimentation and simplicity ☆78 Updated last year
- PyTorch Implementation of Object Recognition as Next Token Prediction [CVPR'24 Highlight] ☆181 Updated 8 months ago
- ☆196 Updated last year
- A minimal implementation of LLaVA-style VLM with interleaved image & text & video processing ability. ☆97 Updated last year
- Unofficial implementation and experiments related to Set-of-Mark (SoM) 👁️ ☆88 Updated 2 years ago
- Implementation of a multimodal diffusion transformer in Pytorch ☆107 Updated last year
- ☆95 Updated last year
- Data release for the ImageInWords (IIW) paper. ☆224 Updated last year
- Just some miscellaneous utility functions / decorators / modules related to Pytorch and Accelerate to help speed up implementation of new… ☆126 Updated last year
- Official implementation of the paper: "ZClip: Adaptive Spike Mitigation for LLM Pre-Training". ☆141 Updated last month
- A collection of tricks and tools to speed up transformer models ☆194 Updated 3 weeks ago
- Mixture-of-Transformers: A Sparse and Scalable Architecture for Multi-Modal Foundation Models. TMLR 2025. ☆134 Updated 3 months ago
- Simple large-scale training of stable diffusion with multi-node support. ☆133 Updated 2 years ago
- Implementation of the premier Text to Video model from OpenAI ☆56 Updated last year
- Implementation of the Llama architecture with RLHF + Q-learning ☆168 Updated 11 months ago
- Video-LlaVA fine-tune for CinePile evaluation ☆51 Updated last year
- NeuMeta transforms neural networks by allowing a single model to adapt on the fly to different sizes, generating the right weights when n… ☆43 Updated last year
- [NeurIPS 2024] Official Repository of The Mamba in the Llama: Distilling and Accelerating Hybrid Models ☆233 Updated 2 months ago
- This is the repository for the Photorealistic Unreal Graphics (PUG) datasets for representation learning. ☆237 Updated last year
- LLaVA-Interactive-Demo ☆380 Updated last year