hu-po / streamdocs
Documentation, notes, links, etc for streams.
☆78Updated last year
Alternatives and similar repositories for streamdocs:
Users that are interested in streamdocs are comparing it to the libraries listed below
- Implementation of Mind Evolution, Evolving Deeper LLM Thinking, from Deepmind☆47Updated last month
- Implementation of TiTok, proposed by Bytedance in "An Image is Worth 32 Tokens for Reconstruction and Generation"☆170Updated 9 months ago
- ClickDiffusion: Harnessing LLMs for Interactive Precise Image Editing☆67Updated 10 months ago
- Official PyTorch implementation of TokenSet.☆88Updated last week
- This repo contains the code for the paper "Intuitive physics understanding emerges from self-supervised pretraining on natural videos"☆111Updated last month
- [NeurIPS 2024] Official Repository of The Mamba in the Llama: Distilling and Accelerating Hybrid Models☆207Updated 3 weeks ago
- ☆79Updated 4 months ago
- A minimal implementation of LLaVA-style VLM with interleaved image & text & video processing ability.☆90Updated 3 months ago
- Object Recognition as Next Token Prediction (CVPR 2024 Highlight)☆175Updated 3 months ago
- Focused on fast experimentation and simplicity☆70Updated 3 months ago
- ☆67Updated 8 months ago
- ☆190Updated last year
- Unofficial implementation and experiments related to Set-of-Mark (SoM) 👁️☆85Updated last year
- Pytorch implementation of the PEER block from the paper, Mixture of A Million Experts, by Xu Owen He at Deepmind☆121Updated 7 months ago
- ☆73Updated 6 months ago
- Implementation of a framework for Genie2 in Pytorch☆144Updated 2 months ago
- A single repo with all scripts and utils to train / fine-tune the Mamba model with or without FIM☆54Updated 11 months ago
- Train VAE like a boss☆270Updated 5 months ago
- Implementation of a multimodal diffusion transformer in Pytorch☆101Updated 9 months ago
- [arXiv] On-device Sora: Enabling Diffusion-Based Text-to-Video Generation for Mobile Devices☆104Updated last month
- From scratch implementation of a vision language model in pure PyTorch☆205Updated 10 months ago
- Block Transformer: Global-to-Local Language Modeling for Fast Inference (NeurIPS 2024)☆150Updated 3 months ago
- Collection of autoregressive model implementations☆83Updated last month
- Code release for "LLMs can see and hear without any training"☆231Updated last month
- Repository for the paper: "TiC-CLIP: Continual Training of CLIP Models".☆102Updated 9 months ago
- Testing and evaluating the capabilities of Vision-Language models (PaliGemma) in performing computer vision tasks such as object detectio…☆80Updated 9 months ago
- NeuMeta transforms neural networks by allowing a single model to adapt on the fly to different sizes, generating the right weights when n…☆42Updated 4 months ago
- Code and weights for the paper "Cluster and Predict Latents Patches for Improved Masked Image Modeling"☆79Updated last week
- Recaption large (Web)Datasets with vllm and save the artifacts.☆48Updated 4 months ago
- Projects based on SigLIP (Zhai et al., 2023) and Hugging Face transformers integration 🤗☆223Updated last month