JuanFMontesinos / PyNVIdeoReader
GPU-accelerated video decoder
☆21 Updated 3 years ago
Alternatives and similar repositories for PyNVIdeoReader:
Users who are interested in PyNVIdeoReader are comparing it to the repositories listed below.
- [WACV'22] Code repository for the paper "Self-supervised Video Representation Learning with Cross-Stream Prototypical Contrasting", https… ☆36 Updated 2 years ago
- CUDA implementation of depthwise conv3d ☆22 Updated 3 years ago
- [CVPR2023] Code for "Streaming Video Model" ☆78 Updated last year
- ☆34 Updated 3 years ago
- Implementations of Transformers for Video ☆23 Updated 4 years ago
- ☆54 Updated 2 years ago
- The 1st place solution of 2022 Ego4d Natural Language Queries. ☆32 Updated 2 years ago
- ☆72 Updated last year
- ☆31 Updated 2 years ago
- Code for the Video Similarity Challenge. ☆77 Updated last year
- Open-source code for Generic Grouping Network (GGN, CVPR 2022) ☆111 Updated last month
- Official PyTorch Implementation of Learning Self-Similarity in Space and Time as Generalized Motion for Video Action Recognition, ICCV 20… ☆26 Updated 3 years ago
- ☆52 Updated 2 years ago
- Video action classification benchmark for common CNN architectures, implemented in PyTorch ☆11 Updated 3 years ago
- ☆54 Updated 3 years ago
- [ICLR2024] Codes and Models for COSA: Concatenated Sample Pretrained Vision-Language Foundation Model ☆43 Updated 3 months ago
- ☆44 Updated 3 years ago
- ☆31 Updated 3 years ago
- Rethinking Self-Supervised Correspondence Learning: A Video Frame-level Similarity Perspective, in ICCV 2021 (Oral) ☆145 Updated 3 years ago
- Research code for "Training Vision-Language Transformers from Captions Alone" ☆34 Updated 2 years ago
- [NeurIPS 2022] The official implementation of "Learning to Discover and Detect Objects". ☆110 Updated last year
- MIST: Multiple Instance Spatial Transformer ☆25 Updated 3 years ago
- Code Release for MeMViT Memory-Augmented Multiscale Vision Transformer for Efficient Long-Term Video Recognition, CVPR 2022 ☆148 Updated 2 years ago
- Masked Vision-Language Transformer in Fashion ☆33 Updated last year
- TensorFlow implementation of "TokenLearner: What Can 8 Learned Tokens Do for Images and Videos?" ☆33 Updated 3 years ago
- An unofficial implementation of TubeViT in "Rethinking Video ViTs: Sparse Video Tubes for Joint Image and Video Learning" ☆88 Updated 6 months ago
- Learning Representational Invariances for Data-Efficient Action Recognition ☆33 Updated 3 years ago
- [ICME 2022] code for the paper, SimVit: Exploring a simple vision transformer with sliding windows. ☆68 Updated 2 years ago
- ☆66 Updated 2 years ago
- Code and models for the paper "The effectiveness of MAE pre-pretraining for billion-scale pretraining" https://arxiv.org/abs/2303.13496 ☆88 Updated 8 months ago