JuanFMontesinos / PyNVIdeoReader
GPU-accelerated video decoder
☆20 Updated 4 years ago
Alternatives and similar repositories for PyNVIdeoReader
Users that are interested in PyNVIdeoReader are comparing it to the libraries listed below
- [CVPR2023] Code for "Streaming Video Model" ☆78 Updated 2 years ago
- Code and models for the paper "The effectiveness of MAE pre-pretraining for billion-scale pretraining" https://arxiv.org/abs/2303.13496 ☆90 Updated 2 months ago
- [WACV'22] Code repository for the paper "Self-supervised Video Representation Learning with Cross-Stream Prototypical Contrasting", https… ☆36 Updated 2 years ago
- [WACV2025 Oral] DeepMIM: Deep Supervision for Masked Image Modeling ☆53 Updated last month
- t-vMF Similarity for Regularizing Intra-Class Feature Distribution ☆21 Updated 4 years ago
- ☆55 Updated 2 years ago
- We present a framework for training multi-modal deep learning models on unlabelled video data by forcing the network to learn invariances… ☆47 Updated 3 years ago
- Official Code of ICCV 2021 Paper: Learning to Cut by Watching Movies ☆51 Updated 2 years ago
- ☆8 Updated 2 years ago
- Video action classification benchmark for common CNN architectures, implemented in PyTorch ☆11 Updated 3 years ago
- [ICME 2022] code for the paper, SimVit: Exploring a simple vision transformer with sliding windows. ☆68 Updated 2 years ago
- Learning Representational Invariances for Data-Efficient Action Recognition ☆33 Updated 3 years ago
- Implementations of Transformers for Video ☆23 Updated 4 years ago
- Masked Vision-Language Transformer in Fashion ☆33 Updated last year
- Code for Temporal Data Augmentations (ECCVW 2020) ☆37 Updated 4 years ago
- ☆17 Updated 2 years ago
- ☆29 Updated last year
- cuda implementation of depthwise conv3d ☆22 Updated 3 years ago
- ViT trained on COYO-Labeled-300M dataset ☆32 Updated 2 years ago
- TensorFlow implementation of "TokenLearner: What Can 8 Learned Tokens Do for Images and Videos?" ☆35 Updated 3 years ago
- Official implementation of AdaMML. https://arxiv.org/abs/2105.05165. ☆51 Updated 3 years ago
- [ICLR 2022] "As-ViT: Auto-scaling Vision Transformers without Training" by Wuyang Chen, Wei Huang, Xianzhi Du, Xiaodan Song, Zhangyang Wa… ☆76 Updated 3 years ago
- Code for the Video Similarity Challenge. ☆81 Updated last year
- MIST: Multiple Instance Spatial Transformer ☆25 Updated 3 years ago
- A library of transformer models for computer vision and multi-modality research ☆49 Updated 3 years ago
- HIRL: A General Framework for Hierarchical Image Representation Learning (http://arxiv.org/abs/2205.13159) ☆39 Updated 3 years ago
- Graph learning framework for long-term video understanding ☆65 Updated 3 weeks ago
- This is an official PyTorch/GPU implementation of SupMAE. ☆78 Updated 2 years ago
- [CVPRW'23] The official PyTorch implementation of NamedMask ☆23 Updated 2 years ago
- [NeurIPS 2021] ORL: Unsupervised Object-Level Representation Learning from Scene Images ☆58 Updated 3 years ago