Reproduction of the first step in the text-to-video model Phenaki. Code and model weights for the Transformer-based autoencoder for videos called CViViT.
☆29Aug 4, 2023Updated 2 years ago
Alternatives and similar repositories for phenaki-cvivit
Users that are interested in phenaki-cvivit are comparing it to the libraries listed below.
- SFT+RL boosts multimodal reasoning☆47Jun 27, 2025Updated 9 months ago
- Implementation of Phenaki Video, which uses Mask GIT to produce text guided videos of up to 2 minutes in length, in Pytorch☆793Jul 29, 2024Updated last year
- Implementation of MagViT2 Tokenizer in Pytorch☆660Jan 12, 2025Updated last year
- ☆25Jan 12, 2026Updated 2 months ago
- Region Proposal generation on images using clustering in Pointcloud - Currently only for Pedestrians☆11Jul 13, 2020Updated 5 years ago
- Code Guided Neural Style Transfer for Shape Stylization.☆11Jan 12, 2026Updated 2 months ago
- ☆131Feb 22, 2025Updated last year
- Unofficial implementation of "Pix2seq: A Language Modeling Framework for Object Detection" on mmdetection☆34Apr 18, 2022Updated 3 years ago
- This is a toolbox repository to help evaluate various methods that perform image matching from a pair of images.☆12Jul 5, 2023Updated 2 years ago
- Main code of Dolphins dataset☆16Dec 29, 2022Updated 3 years ago
- python implementation of the paper 'Fast Range Image-Based Segmentation of Sparse 3D Laser Scans for Online Operation'☆12Jan 4, 2021Updated 5 years ago
- Implementing the paper☆15Nov 5, 2016Updated 9 years ago
- SEED-Voken: A Series of Powerful Visual Tokenizers☆1,001Nov 25, 2025Updated 4 months ago
- RAST 1.0: Restorable Arbitrary Style Transfer via Multi-restoration☆13Jun 18, 2024Updated last year
- Official implementation for “Unsupervised Part Discovery via Dual Representation Alignment” - TPAMI 2024☆15Nov 6, 2024Updated last year
- Google MobileNets Implementation using Tensorflow☆18Jun 6, 2017Updated 8 years ago
- ALFASVMLib - A Matlab library for adversarial label flip attacks against SVMs☆13Jun 19, 2015Updated 10 years ago
- ☆161Nov 10, 2024Updated last year
- Annotated Tutorial for PerAct☆19Sep 11, 2023Updated 2 years ago
- A partial implementation of Generative Infinite Vocabulary Transformer (GIVT) from Google Deepmind, in PyTorch.☆21Mar 28, 2024Updated 2 years ago
- [IJCAI 2022 poster] PyTorch Implementation of "Universal Video Style Transfer via Crystallization, Separation, and Blending"☆17Mar 10, 2023Updated 3 years ago
- Official Pytorch code for "AesUST: Towards Aesthetic-Enhanced Universal Style Transfer" (ACM MM 2022)☆15Dec 31, 2022Updated 3 years ago
- A PyTorch re-implementation of Weakly Supervised Facial Action Unit Recognition through Adversarial Training☆10Apr 23, 2019Updated 6 years ago
- FR-TSVM☆12Nov 20, 2017Updated 8 years ago
- ☆10Jan 20, 2021Updated 5 years ago
- Unofficial Pytorch Implementation of "A Simple Framework for Contrastive Learning of Visual Representations"☆10Mar 11, 2020Updated 6 years ago
- Multi-temporal Scene dataset for Scene Change Detection.☆15Apr 14, 2021Updated 4 years ago
- Official JAX implementation of MAGVIT: Masked Generative Video Transformer☆997Jan 17, 2024Updated 2 years ago
- A list of robotics related papers accepted by ICLR'25☆25Aug 28, 2025Updated 7 months ago
- This repo contains evaluation code for the paper "AV-Odyssey: Can Your Multimodal LLMs Really Understand Audio-Visual Information?"☆31Dec 23, 2024Updated last year
- ☆41Sep 21, 2023Updated 2 years ago
- [ICCV W] Contextual Convolutional Neural Networks (https://arxiv.org/pdf/2108.07387.pdf)☆14Aug 18, 2021Updated 4 years ago
- Train a tiny LLaMA model from scratch to repeat your words using Reinforcement Learning from Human Feedback (RLHF)☆18May 23, 2024Updated last year
- Empowering Unified MLLM with Multi-granular Visual Generation☆130Jan 16, 2025Updated last year
- ☆17Oct 31, 2023Updated 2 years ago
- This is an OCR program designed for travel document. It can now support 23 types of documents with pre-defined template. You can add what…☆10Nov 22, 2022Updated 3 years ago
- using kd-trees☆12Apr 1, 2020Updated 6 years ago
- A script for spawning VSCode Remote server sessions on the TUoS HPC clusters.☆15Dec 12, 2024Updated last year
- [ICCV2025] TokenBridge: Bridging Continuous and Discrete Tokens for Autoregressive Visual Generation. https://yuqingwang1029.github.io/To…☆155Jul 24, 2025Updated 8 months ago