rasbt / b3-basic-batchsize-benchmark
Experiments for the blog post "No, We Don't Have to Choose Batch Sizes As Powers Of 2"
☆20 · Updated 2 years ago
Alternatives and similar repositories for b3-basic-batchsize-benchmark:
Users who are interested in b3-basic-batchsize-benchmark are comparing it to the repositories listed below
- A dashboard for exploring timm learning rate schedulers ☆19 · Updated 5 months ago
- PyTorch Implementation of the paper "MM1: Methods, Analysis & Insights from Multimodal LLM Pre-training" ☆23 · Updated this week
- ☆15 · Updated 3 years ago
- Implementation of N-Grammer in Flax ☆17 · Updated 2 years ago
- Local Attention - Flax module for Jax ☆20 · Updated 3 years ago
- ☆31 · Updated last week
- Reproduces experiments from "Grounding inductive biases in natural images: invariance stems from variations in data" ☆17 · Updated 7 months ago
- PyTorch implementation of GLOM ☆22 · Updated 3 years ago
- Contains Colab notebooks that show cool use cases of different GCP ML APIs. ☆10 · Updated 4 years ago
- The collection of building blocks for building fine-tunable metric learning models ☆32 · Updated 2 weeks ago
- Official Code for MIMETIC^2 ☆12 · Updated 5 months ago
- Bi-Directional Attention Flow for Machine Comprehension ☆9 · Updated 7 years ago
- An implementation of Compositional Attention: Disentangling Search and Retrieval by MILA ☆14 · Updated 2 years ago
- Describe the format of image/text datasets ☆11 · Updated 2 years ago
- Researchers who published code, models (in some cases), and demo apps (in a few cases) along with their SOTA papers ☆12 · Updated last year
- Implements EvoNorms B0 and S0 as proposed in Evolving Normalization-Activation Layers. ☆11 · Updated 5 years ago
- Experiments dashboard for LabML ☆17 · Updated 2 years ago
- Simplifying parsing of large jsonline files in NLP Workflows ☆12 · Updated 3 years ago
- Unofficially Implements https://arxiv.org/abs/2112.05682 to get Linear Memory Cost on Attention for PyTorch ☆12 · Updated 3 years ago
- Shows how to do parameter ensembling using differential evolution. ☆10 · Updated 3 years ago
- PyTorch reimplementation of the paper "HyperMixer: An MLP-based Green AI Alternative to Transformers" [arXiv 2022]. ☆17 · Updated 3 years ago
- Research code of Cycle Generative Adversarial Networks for Complementary Item Recommendations. ☆18 · Updated 2 years ago
- Implementation of "Analysing Mathematical Reasoning Abilities of Neural Models" ☆29 · Updated 2 years ago
- Visual Clustering: Clustering Plotted Data by Image Segmentation ☆24 · Updated 2 months ago
- Directed masked autoencoders ☆14 · Updated 2 years ago
- An open source implementation of CLIP. ☆32 · Updated 2 years ago
- A simple implementation of a deep linear PyTorch module ☆19 · Updated 4 years ago
- TensorFlow 2.x implementation of Gradient Origin Networks ☆12 · Updated 4 years ago
- Code for the paper "Accessing higher dimensions for unsupervised word translation" ☆21 · Updated last year
- Implementation of a Transformer using ReLA (Rectified Linear Attention) from https://arxiv.org/abs/2104.07012 ☆49 · Updated 3 years ago