☆22Jun 6, 2025Updated 8 months ago
Alternatives and similar repositories for VLM-Video-Action-Localization
Users that are interested in VLM-Video-Action-Localization are comparing it to the libraries listed below
- (CVPR2024) Realigning Confidence with Temporal Saliency Information for Point-level Weakly-Supervised Temporal Action Localization☆20Jun 11, 2024Updated last year
- [ICCVW 2023] Interaction-Aware Prompting for Zero-Shot Spatio-Temporal Action Detection☆21Feb 22, 2024Updated 2 years ago
- This repository lists some awesome public projects about Zero-shot/Few-shot Learning based on CLIP (Contrastive Language-Image Pre-Traini…☆27Nov 28, 2024Updated last year
- DisTime: Distribution-based Time Representation for Video Large Language Models.☆18Jul 10, 2025Updated 7 months ago
- A PyTorch collection of semantic segmentation tools.☆32Mar 28, 2019Updated 6 years ago
- Part of a research scholarship. I built a basic 2d driving sim with simulated lidar data to train Deep Q Neural Network. So far after abo…☆11Feb 15, 2017Updated 9 years ago
- [CVPR 2025] DiscoVLA: Discrepancy Reduction in Vision, Language, and Alignment for Parameter-Efficient Video-Text Retrieval☆21Jun 23, 2025Updated 8 months ago
- Arabic Handwritten Characters Dataset☆13Jun 22, 2017Updated 8 years ago
- NLP on Korean news articles. Automatic topic extraction through dynamic clustering.☆12Sep 15, 2017Updated 8 years ago
- A tool that makes it easy to run Intel OpenVINO demos and samples.☆11Jan 31, 2023Updated 3 years ago
- PyTorch implementation of 'CLIP' (Radford et al., 2021) from scratch and training it on Flickr8k + Flickr30k☆11Mar 14, 2024Updated last year
- ☆11Dec 6, 2024Updated last year
- A Kivy tutorial for PyOhio 2013☆14Apr 30, 2014Updated 11 years ago
- CLIP-based Adaptive Graph Attention Network for Large-Scale Unsupervised Multi-modal Hashing Retrieval☆10Mar 18, 2024Updated last year
- Deploying a stripped-down version of yolov5 on HiSilicon devices☆13Nov 22, 2021Updated 4 years ago
- MandelBulb rendered as a Point Cloud for IOS, uses Swift and Metal☆13May 31, 2021Updated 4 years ago
- Retrieval and fine-grained reranking using Qwen3's Embedding and Reranker models☆20Jun 22, 2025Updated 8 months ago
- ☆11Oct 27, 2017Updated 8 years ago
- Empowering Small VLMs to Think with Dynamic Memorization and Exploration☆15Nov 18, 2025Updated 3 months ago
- Joint registration for multiple point clouds☆11Sep 9, 2024Updated last year
- An educational game. It won 3rd place in the Kivy App Contest 2014.☆16Nov 19, 2024Updated last year
- An optimization based Cartesian controller for mobile manipulation of service robots in domestic environments☆10Oct 11, 2018Updated 7 years ago
- Your virtual companion/waifu powered by ChatGPT and other state-of-the-art AI models☆11Sep 11, 2023Updated 2 years ago
- Gym wrapper for Vizdoom environments☆12Dec 14, 2018Updated 7 years ago
- Deep learning application☆10Dec 11, 2016Updated 9 years ago
- ☆12Aug 25, 2017Updated 8 years ago
- Create 3D point clouds from depth images captured with the lens blur feature of the Google Camera app for Android.☆19Apr 26, 2014Updated 11 years ago
- Solution approaches to common big-data interview and written-test questions at BAT companies, implemented in Python☆11Aug 17, 2015Updated 10 years ago
- Face recognition using Siamese Networks☆12Nov 29, 2017Updated 8 years ago
- Weakly Supervised Referring Video Object Segmentation with Object-Centric Pseudo-Guidance☆10Aug 17, 2024Updated last year
- Interactive 3D Avatar Profile Viewer generated in Ready Player Me☆10Aug 27, 2022Updated 3 years ago
- A highly commented Tensorflow implementation of DCGAN and WGAN for images.☆10Dec 22, 2017Updated 8 years ago
- Simple rules based grapheme to phoneme in Python☆11Sep 2, 2017Updated 8 years ago
- ☆15Dec 2, 2025Updated 2 months ago
- A self contained python-based server that allows remote usage of OS X's text-to-speech abilities.☆22Jun 21, 2012Updated 13 years ago
- GLFW3 application☆14Jan 25, 2026Updated last month
- [ECCV 2024] The first zero-shot setting for spatio-temporal video grounding.☆11Jul 16, 2024Updated last year
- Deep Q-Networks in tensorflow☆10Apr 4, 2017Updated 8 years ago
- ☆10Mar 21, 2023Updated 2 years ago