Easiest way of fine-tuning HuggingFace video classification models
☆148 · Mar 20, 2023 · Updated 3 years ago
Alternatives and similar repositories for video-transformers
Users that are interested in video-transformers are comparing it to the libraries listed below.
- ViLMA: A Zero-Shot Benchmark for Linguistic and Temporal Grounding in Video-Language Models (ICLR 2024, Official Implementation) ☆16 · Jan 18, 2024 · Updated 2 years ago
- PyTorch implementation of a collection of scalable Video Transformer Benchmarks. ☆306 · May 4, 2022 · Updated 3 years ago
- ☆19 · Jan 30, 2023 · Updated 3 years ago
- Object detection and instance segmentation on MaskRCNN with torchvision, albumentations, tensorboard and cocoapi. Supports custom coco da… ☆18 · Sep 28, 2020 · Updated 5 years ago
- Effective frame sampling for ML applications. ☆25 · Aug 30, 2025 · Updated 6 months ago
- Implementation of ViViT: A Video Vision Transformer ☆557 · Jun 21, 2021 · Updated 4 years ago
- ☆58 · Dec 2, 2025 · Updated 3 months ago
- Code and datasets for "Text encoders are performance bottlenecks in contrastive vision-language models". Coming soon! ☆11 · May 24, 2023 · Updated 2 years ago
- ☆26 · Aug 31, 2023 · Updated 2 years ago
- Code release for the paper "Progress-Aware Video Frame Captioning" (CVPR 2025) ☆21 · Jul 16, 2025 · Updated 8 months ago
- Langchain_CrewAI_Gemini - A Gemini AI-powered AI Agent (Multi-Agent) Project. ☆13 · Mar 24, 2024 · Updated 2 years ago
- Official Implementation of Web-based Visual Corpus Builder (Webvicob), ICDAR 2023 ☆109 · Oct 24, 2023 · Updated 2 years ago
- ☆20 · Oct 3, 2022 · Updated 3 years ago
- Collection of pitch (f0, fundamental frequency) detection algorithms with unified interface ☆25 · Nov 25, 2024 · Updated last year
- SMILE: A Multimodal Dataset for Understanding Laughter ☆13 · Jun 15, 2023 · Updated 2 years ago
- NeurIPS 2023 - Cappy: Outperforming and Boosting Large Multi-Task LMs with a Small Scorer ☆46 · Mar 29, 2024 · Updated last year
- ☆13 · Jul 20, 2024 · Updated last year
- ☆14 · Jul 2, 2024 · Updated last year
- Object recognition with Pepper using a deep learning model ☆10 · Sep 16, 2021 · Updated 4 years ago
- This repo is deprecated. Please refer to new up-to-date repo: https://github.com/fcakyon/craft-text-detector ☆12 · Apr 21, 2020 · Updated 5 years ago
- Video Transformer Network ☆41 · Jun 8, 2021 · Updated 4 years ago
- This repository includes the code for HiCo (PyTorch version). ☆11 · Sep 24, 2022 · Updated 3 years ago
- [ICLR 2025] Official code repository for "TULIP: Token-length Upgraded CLIP" ☆33 · Jan 26, 2026 · Updated 2 months ago
- Easy to use class balanced cross entropy and focal loss implementation for Pytorch ☆98 · Dec 17, 2024 · Updated last year
- Vision-Language Pre-Training for Boosting Scene Text Detectors (CVPR2022) ☆12 · Mar 21, 2022 · Updated 4 years ago
- Official repository of "Camera Distortion-aware 3D Human Pose Estimation in Video with Optimization-based Meta-Learning", ICCV 2021 ☆17 · Aug 4, 2023 · Updated 2 years ago
- Summaries of machine learning papers ☆12 · Aug 19, 2022 · Updated 3 years ago
- Utilities for working with videos ☆13 · Jul 5, 2025 · Updated 8 months ago
- A guide to structured generation using constrained decoding ☆14 · Jun 9, 2024 · Updated last year
- IBM Quantum Challenge Fall 2023 ☆10 · May 23, 2023 · Updated 2 years ago
- [ICCV2023] Spatio-temporal Prompting Network for Robust Video Feature Extraction ☆10 · Aug 17, 2023 · Updated 2 years ago
- [CVPR23 Highlight] CREPE: Can Vision-Language Foundation Models Reason Compositionally? ☆35 · Apr 27, 2023 · Updated 2 years ago
- Python tools ☆14 · Oct 22, 2023 · Updated 2 years ago
- [MICCAI 2022] The official repository of Light-weight spatio-temporal graphs for segmentation and ejection fraction prediction in cardiac… ☆38 · Oct 2, 2023 · Updated 2 years ago
- A deep learning library for video understanding research. ☆3,550 · Jan 12, 2026 · Updated 2 months ago
- Code for "Are “Hierarchical” Visual Representations Hierarchical?" in NeurIPS Workshop for Symmetry and Geometry in Neural Representation… ☆23 · Nov 8, 2023 · Updated 2 years ago
- A Pytorch (support batch and channel) implementation of GoogleBrain's SpecAugment: A Simple Data Augmentation Method for Automatic Speech… ☆12 · Jul 24, 2024 · Updated last year
- ☆21 · May 13, 2024 · Updated last year
- A generalized Chamfer distance implementation in CUDA/Pytorch ☆11 · Sep 17, 2020 · Updated 5 years ago