hu-po / streamdocs
Documentation, notes, links, etc for streams.
☆78 Updated last year
Alternatives and similar repositories for streamdocs:
Users who are interested in streamdocs are comparing it to the repositories listed below.
- Object Recognition as Next Token Prediction (CVPR 2024 Highlight) ☆176 Updated this week
- Scaling Vision Pre-Training to 4K Resolution ☆154 Updated this week
- Timm model explorer ☆39 Updated last year
- Projects based on SigLIP (Zhai et al., 2023) and Hugging Face transformers integration 🤗 ☆231 Updated 2 months ago
- Unofficial implementation and experiments related to Set-of-Mark (SoM) 👁️ ☆86 Updated last year
- ☆192 Updated last year
- Implementation of TiTok, proposed by Bytedance in "An Image is Worth 32 Tokens for Reconstruction and Generation" ☆172 Updated 10 months ago
- ☆30 Updated 6 months ago
- 🤩 An AWESOME Curated List of Papers, Workshops, Datasets, and Challenges from CVPR 2024 ☆142 Updated 10 months ago
- documentation for content creation ☆194 Updated 2 months ago
- [NeurIPS 2024] Official Repository of The Mamba in the Llama: Distilling and Accelerating Hybrid Models ☆215 Updated this week
- My take on Flow Matching ☆52 Updated 3 months ago
- [CVPR 2024] VCoder: Versatile Vision Encoders for Multimodal Large Language Models ☆278 Updated last year
- ClickDiffusion: Harnessing LLMs for Interactive Precise Image Editing ☆67 Updated 11 months ago
- Data release for the ImageInWords (IIW) paper. ☆209 Updated 5 months ago
- Code for experiments for "ConvNet vs Transformer, Supervised vs CLIP: Beyond ImageNet Accuracy" ☆101 Updated 7 months ago
- Code and weights for the paper "Cluster and Predict Latent Patches for Improved Masked Image Modeling" ☆101 Updated 3 weeks ago
- Implementation of a framework for Genie2 in PyTorch ☆145 Updated 3 months ago
- From scratch implementation of a vision language model in pure PyTorch ☆214 Updated last year
- Train VAE like a boss ☆276 Updated 6 months ago
- Focused on fast experimentation and simplicity ☆71 Updated 4 months ago
- σ-GPT: A New Approach to Autoregressive Models ☆63 Updated 8 months ago
- ☆67 Updated 9 months ago
- NeuMeta transforms neural networks by allowing a single model to adapt on the fly to different sizes, generating the right weights when n… ☆42 Updated 5 months ago
- ☆74 Updated 7 months ago
- Summarize any arXiv paper with ease ☆63 Updated last year
- A minimal implementation of LLaVA-style VLM with interleaved image & text & video processing ability. ☆91 Updated 4 months ago
- Implementation of the premier Text to Video model from OpenAI ☆57 Updated 5 months ago
- Official PyTorch implementation of TokenSet. ☆116 Updated last month
- [NeurIPS 2024] SlimSAM: 0.1% Data Makes Segment Anything Slim ☆328 Updated 2 months ago