Big-Interleaved-Dataset
☆58 · Jan 21, 2023 · Updated 3 years ago
Alternatives and similar repositories for Big-Interleaved-Dataset
Users who are interested in Big-Interleaved-Dataset are comparing it to the libraries listed below.
- ☆17 · Oct 18, 2022 · Updated 3 years ago
- ☆42 · Jun 15, 2023 · Updated 2 years ago
- [ICLR 2024 Spotlight] Social Reward: Evaluating and Enhancing Generative AI through Million-User Feedback from an Online Creative Communi… ☆11 · Mar 29, 2024 · Updated last year
- [Findings of ACL-2023] This is the official implementation of On the Difference of BERT-style and CLIP-style Text Encoders. ☆14 · Jun 7, 2023 · Updated 2 years ago
- Script and models for clustering LAION-400m CLIP embeddings. ☆26 · Jan 10, 2022 · Updated 4 years ago
- Implementation of QA Networks ☆10 · Jul 14, 2016 · Updated 9 years ago
- ViT trained on COYO-Labeled-300M dataset ☆33 · Nov 24, 2022 · Updated 3 years ago
- Post-processing for fair classification ☆16 · Jun 30, 2025 · Updated 8 months ago
- Easily convert common crawl to a dataset of caption and document. Image/text Audio/text Video/text, ... ☆320 · Dec 9, 2023 · Updated 2 years ago
- reproduces experiments from "Grounding inductive biases in natural images: invariance stems from variations in data" ☆17 · Sep 25, 2024 · Updated last year
- ☆18 · Nov 7, 2022 · Updated 3 years ago
- COYO-700M: Large-scale Image-Text Pair Dataset ☆1,252 · Nov 30, 2022 · Updated 3 years ago
- Fuel innovation and advance language models with HomoScriptor: A vibrant, community-driven dataset for fine-tuning large language models. ☆18 · Oct 14, 2023 · Updated 2 years ago
- A subset of YFCC100M. Tools, checking scripts and links of web drive to download datasets(uncompressed). ☆19 · Nov 13, 2024 · Updated last year
- SOIT: Segmenting Objects with Instance-Aware Transformers ☆14 · Jun 6, 2022 · Updated 3 years ago
- ☆20 · Feb 22, 2021 · Updated 5 years ago
- SVIT: Scaling up Visual Instruction Tuning ☆166 · Jun 20, 2024 · Updated last year
- Implementation of the paper ''Implicit Feature Refinement for Instance Segmentation''. ☆20 · Oct 27, 2021 · Updated 4 years ago
- i-mae Pytorch Repo ☆20 · Apr 6, 2024 · Updated last year
- Un-*** 50 billions multimodality dataset ☆23 · Sep 14, 2022 · Updated 3 years ago
- PyTorch implementation of IndRNN ☆15 · Sep 12, 2018 · Updated 7 years ago
- Code used for the creation of OBELICS, an open, massive and curated collection of interleaved image-text web documents, containing 141M d… ☆211 · Aug 28, 2024 · Updated last year
- GPU controlled Hetzner Cloud workers swarm for Crawling@Home project ☆58 · Oct 9, 2022 · Updated 3 years ago
- Unified notation for Markov Decision Processes PO(MDP)s ☆24 · Apr 27, 2018 · Updated 7 years ago
- VaLM: Visually-augmented Language Modeling. ICLR 2023. ☆56 · Mar 6, 2023 · Updated 2 years ago
- Get hundred of million of image+url from the crawling at home dataset and preprocess them ☆223 · May 26, 2024 · Updated last year
- Caffe fork that supports training with weighted samples http://caffe.berkeleyvision.org/ ☆22 · Aug 3, 2015 · Updated 10 years ago
- WIP ☆21 · Jul 9, 2017 · Updated 8 years ago
- Multi-modality pre-training ☆510 · May 8, 2024 · Updated last year
- Reproducible scaling laws for contrastive language-image learning (https://arxiv.org/abs/2212.07143) ☆188 · Jun 21, 2025 · Updated 8 months ago
- VITA: Video Instance Segmentation via Object Token Association (NeurIPS 2022) ☆105 · Jan 4, 2024 · Updated 2 years ago
- [ECCV2022] New benchmark for evaluating pre-trained model; New supervised contrastive learning framework. ☆110 · Dec 8, 2023 · Updated 2 years ago
- ☆29 · Oct 18, 2022 · Updated 3 years ago
- Code and data for the CoNLL 2018 paper "Adversarially Regularising Neural NLI Models to Integrate Logical Background Knowledge." ☆25 · Jan 21, 2019 · Updated 7 years ago
- Conditional Random Fields ☆27 · Apr 23, 2022 · Updated 3 years ago
- The open source implementation of the model from "Scaling Vision Transformers to 22 Billion Parameters" ☆32 · Feb 6, 2026 · Updated 3 weeks ago
- Official repository for "Revisiting Weakly Supervised Pre-Training of Visual Perception Models". https://arxiv.org/abs/2201.08371. ☆182 · Apr 17, 2022 · Updated 3 years ago
- The SVO-Probes Dataset for Verb Understanding ☆30 · Jan 28, 2022 · Updated 4 years ago
- DataComp: In search of the next generation of multimodal datasets ☆772 · Apr 28, 2025 · Updated 10 months ago