hu-po / streamdocs
Documentation, notes, links, etc for streams.
☆79 Updated last year
Alternatives and similar repositories for streamdocs
Users who are interested in streamdocs are comparing it to the repositories listed below.
- Implementation of the premier Text to Video model from OpenAI ☆56 Updated 6 months ago
- Recaption large (Web)Datasets with vllm and save the artifacts. ☆52 Updated 6 months ago
- Official PyTorch implementation of TokenSet. ☆121 Updated 2 months ago
- A minimal implementation of LLaVA-style VLM with interleaved image & text & video processing ability. ☆93 Updated 5 months ago
- Projects based on SigLIP (Zhai et al., 2023) and Hugging Face transformers integration 🤗 ☆237 Updated 3 months ago
- Object Recognition as Next Token Prediction (CVPR 2024 Highlight) ☆178 Updated last month
- Implementation of a framework for Genie2 in PyTorch ☆148 Updated 5 months ago
- Official implementation of "DoRA: Weight-Decomposed Low-Rank Adaptation" ☆123 Updated last year
- Block Transformer: Global-to-Local Language Modeling for Fast Inference (NeurIPS 2024) ☆156 Updated last month
- Implementation of TiTok, proposed by Bytedance in "An Image is Worth 32 Tokens for Reconstruction and Generation" ☆173 Updated 11 months ago
- RWKV is an RNN with transformer-level LLM performance. It can be directly trained like a GPT (parallelizable). So it's combining the best… ☆44 Updated 2 months ago
- [ICLR 2025] Official PyTorch implementation of paper "T-Stitch: Accelerating Sampling in Pre-trained Diffusion Models with Trajectory Stit… ☆102 Updated last year
- Repository for the paper: "TiC-CLIP: Continual Training of CLIP Models". ☆102 Updated 11 months ago
- ☆47 Updated 3 months ago
- NeuMeta transforms neural networks by allowing a single model to adapt on the fly to different sizes, generating the right weights when n… ☆42 Updated 6 months ago
- documentation for content creation ☆199 Updated this week
- Implementation of the text to video model LUMIERE from the paper: "A Space-Time Diffusion Model for Video Generation" by Google Research ☆50 Updated 4 months ago
- ViT inference in Triton because, why not? ☆28 Updated last year
- Focused on fast experimentation and simplicity ☆73 Updated 5 months ago
- [NeurIPS 2024] Official Repository of The Mamba in the Llama: Distilling and Accelerating Hybrid Models ☆221 Updated last month
- ☆68 Updated 10 months ago
- Implementation of Mind Evolution, Evolving Deeper LLM Thinking, from Deepmind ☆51 Updated last week
- ClickDiffusion: Harnessing LLMs for Interactive Precise Image Editing ☆69 Updated last year
- OpenVision: A Fully-Open, Cost-Effective Family of Advanced Vision Encoders for Multimodal Learning ☆237 Updated 3 weeks ago
- Code for experiments for "ConvNet vs Transformer, Supervised vs CLIP: Beyond ImageNet Accuracy" ☆101 Updated 8 months ago
- OLA-VLM: Elevating Visual Perception in Multimodal LLMs with Auxiliary Embedding Distillation, arXiv 2024 ☆59 Updated 3 months ago
- Implementation of the proposed MaskBit from Bytedance AI ☆80 Updated 6 months ago
- Collection of autoregressive model implementations ☆85 Updated last month
- Code for "Scaling Language-Free Visual Representation Learning" paper (Web-SSL). ☆129 Updated last month
- [ICLR 2025] Source code for paper "A Spark of Vision-Language Intelligence: 2-Dimensional Autoregressive Transformer for Efficient Finegr… ☆75 Updated 5 months ago