rasbt / b3-basic-batchsize-benchmark
Experiments for the blog post "No, We Don't Have to Choose Batch Sizes As Powers Of 2"
☆19 Updated 2 years ago
Alternatives and similar repositories for b3-basic-batchsize-benchmark:
Users who are interested in b3-basic-batchsize-benchmark are comparing it to the libraries listed below.
- PyTorch Implementation of the paper "MM1: Methods, Analysis & Insights from Multimodal LLM Pre-training" ☆23 Updated this week
- ☆15 Updated 3 years ago
- bumble bee transformer ☆14 Updated 3 years ago
- Unofficially Implements https://arxiv.org/abs/2112.05682 to get Linear Memory Cost on Attention for PyTorch ☆12 Updated 3 years ago
- Bi-Directional Attention Flow for Machine Comprehensions ☆9 Updated 7 years ago
- PyTorch implementation of GLOM ☆21 Updated 2 years ago
- Implementation of "Analysing Mathematical Reasoning Abilities of Neural Models" ☆29 Updated last year
- Describe the format of image/text datasets ☆11 Updated 2 years ago
- ☆30 Updated last month
- Implementation of N-Grammer in Flax ☆16 Updated 2 years ago
- Local Attention - Flax module for Jax ☆20 Updated 3 years ago
- Implements EvoNorms B0 and S0 as proposed in Evolving Normalization-Activation Layers. ☆11 Updated 4 years ago
- Official Implementation of "Transferring Inductive Biases Through Knowledge Distillation" ☆14 Updated 4 years ago
- Minimum Description Length probing for neural network representations ☆18 Updated last week
- An unofficial Python client library for Lambda Lab's Cloud Computing Platform ☆13 Updated last year
- Usable implementation of Mogrifier, a circuit for enhancing LSTMs and potentially other networks, from Deepmind ☆17 Updated 7 months ago
- Contains Colab notebooks that show cool use cases of different GCP ML APIs. ☆10 Updated 4 years ago
- SMASHED is a toolkit designed to apply transformations to samples in datasets, such as fields extraction, tokenization, prompting, batchi… ☆31 Updated 7 months ago
- Shows how to do parameter ensembling using differential evolution. ☆10 Updated 3 years ago
- ☆11 Updated 2 years ago
- A collection of Models, Datasets, DataModules, Callbacks, Metrics, Losses and Loggers to better integrate pytorch-lightning with transfor… ☆47 Updated last year
- Directed masked autoencoders ☆14 Updated last year
- DEPRECATED--all functionality moved to nbdev ☆15 Updated 2 years ago
- Visual Clustering: Clustering Plotted Data by Image Segmentation ☆24 Updated 11 months ago
- Code for running the experiments in Deep Subjecthood: Higher Order Grammatical Features in Multilingual BERT ☆16 Updated last year