albanie / foundation-models
Video descriptions of research papers relating to foundation models and scaling
☆30 · Updated 2 years ago
Alternatives and similar repositories for foundation-models:
- ☆43 · Updated 2 months ago
- ViLMA: A Zero-Shot Benchmark for Linguistic and Temporal Grounding in Video-Language Models (ICLR 2024, official implementation) ☆15 · Updated last year
- ☆31 · Updated last year
- Implementation of MaMMUT, a simple vision-encoder text-decoder architecture for multimodal tasks from Google, in PyTorch ☆100 · Updated last year
- ☆64 · Updated last year
- PyTorch code for "Perceiver-VL: Efficient Vision-and-Language Modeling with Iterative Latent Attention" (WACV 2023) ☆33 · Updated 2 years ago
- Code for the paper "CiT: Curation in Training for Effective Vision-Language Data" ☆78 · Updated 2 years ago
- https://arxiv.org/abs/2209.15162 ☆49 · Updated 2 years ago
- ☆23 · Updated 5 months ago
- Code release for "Improved Baselines for Vision-Language Pre-training" ☆60 · Updated 10 months ago
- ☆29 · Updated 2 years ago
- Code for experiments for "ConvNet vs Transformer, Supervised vs CLIP: Beyond ImageNet Accuracy" ☆101 · Updated 6 months ago
- Holistic evaluation of multimodal foundation models ☆43 · Updated 7 months ago
- ☆24 · Updated last year
- Patching open-vocabulary models by interpolating weights ☆91 · Updated last year
- M4 experiment logbook ☆57 · Updated last year
- ☆23 · Updated 5 months ago
- Personal experiments around routing tokens to different autoregressive attention modules, akin to mixture-of-experts ☆117 · Updated 5 months ago
- Sparkles: Unlocking Chats Across Multiple Images for Multimodal Instruction-Following Models ☆43 · Updated 9 months ago
- Original code base for "On Pretraining Data Diversity for Self-Supervised Learning" ☆13 · Updated 2 months ago
- ☆117 · Updated 2 years ago
- ☆49 · Updated last year
- [ICLR 2025] Source code for the paper "A Spark of Vision-Language Intelligence: 2-Dimensional Autoregressive Transformer for Efficient Finegr…" ☆69 · Updated 3 months ago
- ☆51 · Updated 9 months ago
- A minimal implementation of a LLaVA-style VLM with interleaved image, text, and video processing ability ☆90 · Updated 3 months ago
- Minimal sharded dataset loaders, decoders, and utils for multi-modal document, image, and text datasets ☆156 · Updated 11 months ago
- A PyTorch implementation of the paper "ViP: A Differentially Private Foundation Model for Computer Vision" ☆36 · Updated last year
- Official repository for the General Robust Image Task (GRIT) Benchmark ☆53 · Updated last year
- Code for "Pretrained Language Models as Visual Planners for Human Assistance" ☆60 · Updated last year
- Official repository for the paper "The Dual Form of Neural Networks Revisited: Connecting Test Time Predictions to Training Patterns…" ☆16 · Updated last year