facebookresearch / Data_Acquisition_for_ML_Benchmark
DAM: Data Acquisition for ML Benchmark, part of the DataPerf benchmark suite (https://dataperf.org/)
☆22 Updated last year
Related projects:
- Revisiting Efficient Training Algorithms For Transformer-based Language Models (NeurIPS 2023) ☆77 Updated last year
- JORA: JAX Tensor-Parallel LoRA Library (ACL 2024) ☆28 Updated 4 months ago
- CUDA implementation of autoregressive linear attention, with all the latest research findings ☆43 Updated last year
- Implementation of TableFormer, Robust Transformer Modeling for Table-Text Encoding, in PyTorch ☆35 Updated 2 years ago
- Code for the paper "On the Expressivity Role of LayerNorm in Transformers' Attention" (Findings of ACL 2023) ☆43 Updated last year
- A PyTorch implementation of the paper "ViP: A Differentially Private Foundation Model for Computer Vision" ☆38 Updated last year
- The official repository for our paper "The Dual Form of Neural Networks Revisited: Connecting Test Time Predictions to Training Patterns … ☆15 Updated 10 months ago
- Using FlexAttention to compute attention with different masking patterns ☆28 Updated last week
- NAACL '24 (Best Demo Paper Runner-Up) / MLSys @ NeurIPS '23 - RedCoast: A Lightweight Tool to Automate Distributed Training and Inference ☆58 Updated this week
- Official code for the paper: "Metadata Archaeology" ☆18 Updated last year
- The accompanying code for "Memory-efficient Transformers via Top-k Attention" (Ankit Gupta, Guy Dar, Shaya Goodman, David Ciprut, Jonatha… ☆58 Updated 3 years ago
- Codebase for the SIMAT dataset and evaluation ☆38 Updated 2 years ago
- Directed masked autoencoders ☆13 Updated last year
- Implementation of some personal helper functions for Einops, my most favorite tensor manipulation library ❤️ ☆52 Updated last year
- ImageNet-12k subset of ImageNet-21k (fall11) ☆19 Updated last year
- Implementation of Token Shift GPT - An autoregressive model that solely relies on shifting the sequence space for mixing ☆47 Updated 2 years ago
- Sparse Backpropagation for Mixture-of-Expert Training ☆17 Updated 2 months ago
- Experimental scripts for researching data adaptive learning rate scheduling. ☆23 Updated 11 months ago
- Data for "Datamodels: Predicting Predictions with Training Data" ☆87 Updated last year
- Skyformer: Remodel Self-Attention with Gaussian Kernel and Nyström Method (NeurIPS 2021) ☆58 Updated 2 years ago
- Code repository for the public reproduction of the language modelling experiments on "MatFormer: Nested Transformer for Elastic Inference… ☆17 Updated 10 months ago
- Lightning support for Intel Habana accelerators. ☆25 Updated 2 weeks ago
- Repository for the paper "Do SSL Models Have Déjà Vu? A Case of Unintended Memorization in Self-supervised Learning" ☆37 Updated last year
- Code for the PAPA paper ☆27 Updated last year