huggingface / fineVideo
☆64Updated 4 months ago
Alternatives and similar repositories for fineVideo:
Users who are interested in fineVideo are comparing it to the libraries listed below:
- Video-LLaVA fine-tune for CinePile evaluation☆46Updated 5 months ago
- Recaption large (Web)Datasets with vllm and save the artifacts.☆44Updated last month
- M4 experiment logbook☆56Updated last year
- Official PyTorch Implementation for Paper "No More Adam: Learning Rate Scaling at Initialization is All You Need"☆47Updated last month
- Focused on fast experimentation and simplicity☆64Updated 3 weeks ago
- ☆62Updated 3 months ago
- Train, tune, and run inference with the Bamba model☆76Updated this week
- LL3M: Large Language and Multi-Modal Model in Jax☆68Updated 8 months ago
- Implementation of TiTok, proposed by Bytedance in "An Image is Worth 32 Tokens for Reconstruction and Generation"☆165Updated 6 months ago
- Minimal sharded dataset loaders, decoders, and utils for multi-modal document, image, and text datasets.☆155Updated 9 months ago
- ☆62Updated 5 months ago
- 🦾 EvalGIM (pronounced as "EvalGym") is an evaluation library for generative image models. It enables easy-to-use, reproducible automatic…☆62Updated 3 weeks ago
- Maya: An Instruction Finetuned Multilingual Multimodal Model using Aya☆100Updated last week
- Block Transformer: Global-to-Local Language Modeling for Fast Inference (NeurIPS 2024)☆148Updated last month
- ☆69Updated 5 months ago
- Implementation of 🥥 Coconut, Chain of Continuous Thought, in Pytorch☆145Updated 2 weeks ago
- Implementation of the proposed MaskBit from Bytedance AI☆71Updated 2 months ago
- Multimodal language model benchmark, featuring challenging examples☆158Updated last month
- imagetokenizer is a Python package that helps you encode visuals and generate visual token IDs from a codebook; supports both image and video…☆30Updated 6 months ago
- Web2Code: A Large-scale Webpage-to-Code Dataset and Evaluation Framework for Multimodal LLMs☆70Updated 2 months ago
- This project is a collection of fine-tuning scripts to help researchers fine-tune Qwen 2 VL on HuggingFace datasets.☆59Updated 4 months ago
- Implementation of the Mamba SSM with hf_integration.☆56Updated 4 months ago
- Collection of autoregressive model implementations☆76Updated last week
- When it comes to optimizers, it's always better to be safe than sorry☆157Updated this week
- ☆65Updated 6 months ago
- Data release for the ImageInWords (IIW) paper.☆205Updated 2 months ago
- Implementation of the premier Text to Video model from OpenAI☆57Updated 2 months ago
- Source code for paper "A Spark of Vision-Language Intelligence: 2-Dimensional Autoregressive Transformer for Efficient Finegrained Image …☆62Updated last month