Reproduction of the first step in the text-to-video model Phenaki. Code and model weights for the Transformer-based autoencoder for videos called CViViT.
☆29Aug 4, 2023Updated 2 years ago
Alternatives and similar repositories for phenaki-cvivit
Users that are interested in phenaki-cvivit are comparing it to the libraries listed below
- SFT+RL boosts multimodal reasoning☆46Jun 27, 2025Updated 8 months ago
- Unofficial implementation of "Pix2seq: A Language Modeling Framework for Object Detection" on mmdetection☆33Apr 18, 2022Updated 3 years ago
- Official implementation of "OmniX: From Unified Panoramic Generation and Perception to Graphics-Ready 3D Scenes".☆90Jan 14, 2026Updated last month
- Official implementation of MARS: Mixture of Auto-Regressive Models for Fine-grained Text-to-image Synthesis☆86Jul 16, 2024Updated last year
- This repo consists of some experimental results on the bdd100k dataset using different object detection algorithms (Faster-RCNN, FCOS, ATSS)☆11Jun 27, 2020Updated 5 years ago
- Region Proposal generation on images using clustering in Pointcloud - Currently only for Pedestrians☆11Jul 13, 2020Updated 5 years ago
- Neural Network Image Compression☆13Jan 12, 2018Updated 8 years ago
- Class materials, homeworks and videos for probation preparation.☆19Feb 3, 2026Updated last month
- ☆22Jan 12, 2026Updated last month
- ☆10Jan 20, 2021Updated 5 years ago
- This is an OCR program designed for travel documents. It currently supports 23 types of documents with pre-defined templates. You can add what…☆10Nov 22, 2022Updated 3 years ago
- WebXR hand input in Three.JS example☆15Mar 22, 2025Updated 11 months ago
- ☆25Jul 28, 2025Updated 7 months ago
- A python wrapper for ScriptHookV☆11Jun 26, 2018Updated 7 years ago
- Unofficial PyTorch Implementation of "A Simple Framework for Contrastive Learning of Visual Representations"☆10Mar 11, 2020Updated 5 years ago
- A PyTorch re-implementation of Weakly Supervised Facial Action Unit Recognition through Adversarial Training☆10Apr 23, 2019Updated 6 years ago
- CBench, Benchmarking System for Question Answering Over Knowledge Graphs Systems.☆12Sep 16, 2022Updated 3 years ago
- A simple Python script to easily grab tags of an image on Danbooru☆10Mar 17, 2023Updated 2 years ago
- [NeurIPS 2024] Data exporter for SS3DM: Benchmarking Street-View Surface Reconstruction with a Synthetic 3D Mesh Dataset☆16Nov 8, 2024Updated last year
- ☆11Nov 10, 2023Updated 2 years ago
- PyTorch implementation of deep fill v2 (original by Jiayu et al.)☆10Jun 26, 2019Updated 6 years ago
- PyTorch Implementation of MobileDet (https://arxiv.org/abs/2004.14525v3) backbones.☆11Feb 12, 2024Updated 2 years ago
- RPIfield dataset for Person Re-identification☆13Aug 17, 2020Updated 5 years ago
- Can neural networks help bring a colourful life back to old black and white photos? Let's check it out!☆12May 23, 2018Updated 7 years ago
- An adapter layer that ensures torch_musa🔦 delivers a CUDA-compatible PyTorch experience.☆29Updated this week
- A collection of resources and papers on diffusion models of video generation.☆10Feb 11, 2023Updated 3 years ago
- 凌BUG-2021 RoboMaster national competition vision team code (unofficial open-source release)☆11Jan 18, 2022Updated 4 years ago
- ☆13Apr 6, 2021Updated 4 years ago
- "Towards Improving Document Understanding: An Exploration on Text-Grounding via MLLMs" 2023☆16Nov 28, 2024Updated last year
- ☆10May 31, 2018Updated 7 years ago
- [IJCV 2024] TransDETR: End-to-end Video Text Spotting with Transformer☆107Mar 28, 2024Updated last year
- Repository for the paper CenterPoly: real-time instance segmentation using bounding polygons☆51Aug 24, 2021Updated 4 years ago
- [WACV2025] Linking Omni-Depth with View Synthesis through Multi-Sphere Image aided Generalizable Neural Radiance Field☆14Nov 3, 2024Updated last year
- OnlyFlow: Optical Flow based Motion Conditioning for Video Diffusion Models☆19Feb 20, 2025Updated last year
- Framework to achieve context distillation in LLMs☆15Nov 24, 2023Updated 2 years ago
- ☆15Jan 8, 2020Updated 6 years ago
- iOS, for bakery owners (가장장)☆12Mar 4, 2025Updated last year
- Deformable 3D ConvNets for Action Recognition☆10Jan 21, 2018Updated 8 years ago
- Pruned and annotated union of LSUN and Stanford car datasets designed for usage in GAN training☆13May 7, 2021Updated 4 years ago