awsaf49 / flickr-dataset
Download the Flickr8k and Flickr30k image caption datasets
☆13Updated 11 months ago
Alternatives and similar repositories for flickr-dataset:
Users who are interested in flickr-dataset are comparing it to the libraries listed below:
- Task Agnostic Unsupervised Learning☆15Updated 3 years ago
- TensorFlow implementation of GhostNet: More Features from Cheap Operations.☆10Updated 4 years ago
- Multi-label classification based on timm, with SimCLR added to timm.☆37Updated 3 years ago
- State-of-the-art data augmentation search algorithms in PyTorch☆47Updated last year
- [ICME 2022] Code for the paper "SimViT: Exploring a simple vision transformer with sliding windows".☆67Updated 2 years ago
- This project aims to collect and collate various datasets for multimodal large model training, including but not limited to pre-training …☆16Updated 3 months ago
- Clipora is a powerful toolkit for fine-tuning OpenCLIP models using Low Rank Adapters (LoRA).☆19Updated 5 months ago
- Various test models in WNNX format. They can be viewed with `pip install wnetron && wnetron`☆12Updated 2 years ago
- Fine Grained Visual Classification☆9Updated 2 years ago
- PyTorch implementation of STR models for transfer learning in Indic Languages☆16Updated 3 years ago
- Library for converting from RGB / GrayScale image to base64 and back.☆19Updated 2 years ago
- A Keras implementation of hybrid efficientnet swin transformer model.☆32Updated last year
- TF 2 implementation of Learning to Resize Images for Computer Vision Tasks (https://arxiv.org/abs/2103.09950v1).☆52Updated 3 years ago
- ☆12Updated 2 years ago
- Official PyTorch implementation of Self-emerging Token Labeling☆32Updated 9 months ago
- My 1st place solution at Kaggle Hotel-ID 2021☆18Updated 3 years ago
- Official code repository for the WACV 2022 paper "Visualizing Paired Image Similarity in Transformer Networks"☆20Updated 2 years ago
- Implementing DropPath/StochasticDepth in PyTorch☆16Updated 2 years ago
- Masked Vision-Language Transformer in Fashion☆33Updated last year
- Estimate dataset difficulty and detect label mistakes using reconstruction error ratios!☆17Updated last week
- Image captioning paper list☆8Updated 5 years ago
- ☆17Updated 4 years ago
- Deploy Swin Transformer using TorchServe☆27Updated 3 years ago
- ☆11Updated 2 years ago
- ☆13Updated 3 months ago
- SAM-CLIP module for use with Autodistill.☆13Updated last year
- HSViT: Horizontally Scalable Vision Transformer☆13Updated 2 months ago
- TensorFlow implementation of "TokenLearner: What Can 8 Learned Tokens Do for Images and Videos?"☆33Updated 3 years ago
- [BMVC'23 Oral] Official repository of "Rethinking Transfer Learning for Medical Image Classification"☆11Updated 7 months ago
- EfficientViT is a new family of vision models for efficient high-resolution vision.☆24Updated last year