Variante / video-postproc-toolbox
Various small tools built for a new video post-production workflow
☆17 · Updated 7 months ago
Related projects
Alternatives and complementary repositories for video-postproc-toolbox
- ☆113 · Updated 5 months ago
- ☆55 · Updated 3 weeks ago
- [CVPR'24] Multimodal Pathway: Improve Transformers with Irrelevant Data from Other Modalities ☆94 · Updated 8 months ago
- Open-source implementation of "Vision Transformers Need Registers" ☆143 · Updated last week
- Implementation of "VL-Mamba: Exploring State Space Models for Multimodal Learning" ☆78 · Updated 8 months ago
- An efficient PyTorch implementation of selective scan in one file; works on both CPU and GPU, with the corresponding mathematical derivation ☆71 · Updated 8 months ago
- The official implementation of "Adapter is All You Need for Tuning Visual Tasks". ☆72 · Updated 2 months ago
- This repository includes the official implementation of our paper "Scaling White-Box Transformers for Vision" ☆45 · Updated 5 months ago
- Vision Model Pre-training on Interleaved Image-Text Data via Latent Compression Learning ☆66 · Updated 5 months ago
- Draw-and-Understand: Leveraging Visual Prompts to Enable MLLMs to Comprehend What You Want ☆61 · Updated last month
- [NeurIPS 2024] MoVA: Adapting Mixture of Vision Experts to Multimodal Context ☆132 · Updated last month
- 📖 A repository for organizing papers, code, and other resources related to unified multimodal models. ☆217 · Updated 2 weeks ago
- [ICML 2024] This repository includes the official implementation of our paper "Rejuvenating image-GPT as Strong Visual Representation Learners" ☆98 · Updated 6 months ago
- Official implementation of SCLIP: Rethinking Self-Attention for Dense Vision-Language Inference ☆132 · Updated last month
- ☆48 · Updated 5 months ago
- ☆109 · Updated 5 months ago
- [NeurIPS 2024] Classification Done Right for Vision-Language Pre-Training ☆135 · Updated 2 weeks ago
- ☆26 · Updated 7 months ago
- ☆90 · Updated 6 months ago
- [NeurIPS 2023] Parameter-efficient Tuning of Large-scale Multimodal Foundation Model ☆83 · Updated 11 months ago
- [NeurIPS 2024] The official code of "U-DiTs: Downsample Tokens in U-Shaped Diffusion Transformers" ☆132 · Updated last month
- Explore the Limits of Omni-modal Pretraining at Scale ☆89 · Updated 2 months ago
- PixelLM is an effective and efficient LMM for pixel-level reasoning and understanding, accepted at CVPR 2024. ☆182 · Updated 5 months ago
- Official repository for the paper "MG-LLaVA: Towards Multi-Granularity Visual Instruction Tuning" (https://arxiv.org/abs/2406.17770). ☆148 · Updated last month
- ☆105 · Updated 3 months ago
- [IEEE TCSVT] Official PyTorch implementation of CLIP-VIS: Adapting CLIP for Open-Vocabulary Video Instance Segmentation. ☆35 · Updated 3 weeks ago
- Pink: Unveiling the Power of Referential Comprehension for Multi-modal LLMs ☆77 · Updated 5 months ago
- 🔥 ImageFolder: Autoregressive Image Generation with Folded Tokens ☆57 · Updated last week
- [NeurIPS 2024] Visual Perception by Large Language Model’s Weights ☆30 · Updated last month
- [CVPR 2024] Official implementation of "ViTamin: Designing Scalable Vision Models in the Vision-language Era" ☆179 · Updated 5 months ago