fcakyon/video-transformers

fcakyon / video-transformers

Easiest way of fine-tuning HuggingFace video classification models

☆148

Alternatives and similar repositories for video-transformers

Users who are interested in video-transformers are comparing it to the libraries listed below.

ilkerkesen / ViLMA
ViLMA: A Zero-Shot Benchmark for Linguistic and Temporal Grounding in Video-Language Models (ICLR 2024, Official Implementation)
☆16 · Jan 18, 2024 · Updated 2 years ago
mx-mark / VideoTransformer-pytorch
PyTorch implementation of a collection of scalable Video Transformer Benchmarks.
☆306 · May 4, 2022 · Updated 4 years ago
florianHofherr / PhysParamInference
☆19 · Jan 30, 2023 · Updated 3 years ago
rishikksh20 / ViViT-pytorch
Implementation of ViViT: A Video Vision Transformer
☆559 · Jun 21, 2021 · Updated 5 years ago
md-mohaiminul / ViS4mer
☆58 · Dec 2, 2025 · Updated 7 months ago
amitakamath / vl_text_encoders_are_bottlenecks
Code and datasets for "Text encoders are performance bottlenecks in contrastive vision-language models". Coming soon!
☆11 · May 24, 2023 · Updated 3 years ago
MikeWangWZHL / Paxion
Repo for paper: "Paxion: Patching Action Knowledge in Video-Language Foundation Models" (NeurIPS 23 Spotlight)
☆38 · May 23, 2023 · Updated 3 years ago
NoManNayeem / Langchain_CrewAI_Gemini-AI_Agents
Langchain_CrewAI_Gemini - A Gemini AI-powered AI Agent (Multi-Agent) Project.
☆14 · Mar 24, 2024 · Updated 2 years ago
fmthoker / SEVERE-BENCHMARK
☆26 · Aug 31, 2023 · Updated 2 years ago
Isaac-Flath / QuartoTemplates
☆20 · Oct 3, 2022 · Updated 3 years ago
clovaai / webvicob
Official Implementation of Web-based Visual Corpus Builder (Webvicob), ICDAR 2023
☆110 · Oct 24, 2023 · Updated 2 years ago
tanyuqian / cappy
NeurIPS 2023 - Cappy: Outperforming and Boosting Large Multi-Task LMs with a Small Scorer
☆49 · Mar 29, 2024 · Updated 2 years ago
983632847 / HiCo
This repository includes the code for HiCo (PyTorch version).
☆12 · Sep 24, 2022 · Updated 3 years ago
elb3k / vtn
Video Transformer Network
☆41 · Jun 8, 2021 · Updated 5 years ago
softbankrobotics-labs / pepper-deep-learning
Object recognition with Pepper using a deep learning model
☆10 · Sep 16, 2021 · Updated 4 years ago
pritamqu / CrissCross
[AAAI 2023 (Oral)] CrissCross: Self-Supervised Audio-Visual Representation Learning with Relaxed Cross-Modal Synchronicity
☆26 · Jul 11, 2023 · Updated 3 years ago
fcakyon / balanced-loss
Easy-to-use class-balanced cross-entropy and focal loss implementation for PyTorch
☆99 · Dec 17, 2024 · Updated last year
ivonajdenkoska / tulip
[ICLR 2025] Official code repository for "TULIP: Token-length Upgraded CLIP"
☆32 · Jan 26, 2026 · Updated 6 months ago
jinglescode / papers
Summaries of machine learning papers
☆12 · Aug 19, 2022 · Updated 3 years ago
dreamy-xay / TableCenterNet
The source code repository for the paper.
☆26 · Sep 8, 2025 · Updated 10 months ago
justliulong / OGHFYOLO
The official code for "OG-HFYOLO: Orientation Gradient Guidance and Heterogeneous Feature Fusion For Deformation Table Cell Instance Segm…
☆13 · Jul 28, 2025 · Updated 11 months ago
guanxiongsun / STPN
[ICCV2023] Spatio-temporal Prompting Network for Robust Video Feature Extraction
☆10 · Aug 17, 2023 · Updated 2 years ago
wangyz1999 / syncnet-speaker-diarization
Identifying "who speaks when" using visual speech input and a pretrained lip-sync expert
☆18 · Jul 1, 2023 · Updated 3 years ago
guybenyosef / EchoGraphs
[MICCAI 2022] The official repository of Light-weight spatio-temporal graphs for segmentation and ejection fraction prediction in cardiac…
☆37 · Oct 2, 2023 · Updated 2 years ago
ashishpatel26 / IBM-Quantum-Challenge-Spring-2023-Challenge
IBM Quantum Challenge Fall 2023
☆10 · May 23, 2023 · Updated 3 years ago
nuochenpku / COMEDY
This is the official project of paper: Compress to Impress: Unleashing the Potential of Compressive Memory in Real-World Long-Term Conver…
☆25 · Nov 18, 2024 · Updated last year
facebookresearch / pytorchvideo
A deep learning library for video understanding research.
☆3,565 · May 5, 2026 · Updated 2 months ago
IMLHF / SpecAugmentPyTorch
A PyTorch (supports batch and channel) implementation of GoogleBrain's SpecAugment: A Simple Data Augmentation Method for Automatic Speech…
☆11 · Jul 24, 2024 · Updated 2 years ago
badripatro / Awesome-Mamba-360
☆22 · May 13, 2024 · Updated 2 years ago
MCG-NJU / VideoMAE
[NeurIPS 2022 Spotlight] VideoMAE: Masked Autoencoders are Data-Efficient Learners for Self-Supervised Video Pre-Training
☆1,775 · Dec 8, 2023 · Updated 2 years ago
sangho-vision / avbert
☆31 · Sep 20, 2021 · Updated 4 years ago
shah-deven / Action-Classification-using-CNN-and-LSTM
Action Classification using CNN and LSTM
☆12 · Jan 17, 2019 · Updated 7 years ago
Adit31 / Captionomaly-Deep-Learning-Toolbox-for-Anomaly-Captioning
Source Code for Captionomaly: A Deep Learning Toolbox for Anomaly Captioning in Surveillance Videos
☆13 · Jun 26, 2023 · Updated 3 years ago
voxel51 / fiftyone-examples
Examples of using FiftyOne
☆234 · Mar 23, 2026 · Updated 4 months ago
ajjimeno / icdar-task-b
Repo
☆13 · Mar 7, 2022 · Updated 4 years ago
GoodNotes / GNHK-dataset
☆19 · Mar 28, 2022 · Updated 4 years ago
1adrianb / video-transformers
☆47 · Apr 14, 2022 · Updated 4 years ago
Nyandwi / MultiModal-Learning-Research
Curated resources on what's happening in multimodal learning. Features recent papers, books, related lectures, and other relevant resou…
☆16 · Apr 28, 2023 · Updated 3 years ago
HHTseng / video-classification
Tutorial for video classification / action recognition using 3D CNN / CNN+RNN on UCF101
☆970 · Dec 7, 2020 · Updated 5 years ago