rasbt / b3-basic-batchsize-benchmark

Experiments for the blog post "No, We Don't Have to Choose Batch Sizes As Powers Of 2"

☆20

Alternatives and similar repositories for b3-basic-batchsize-benchmark

Users who are interested in b3-basic-batchsize-benchmark are comparing it to the repositories listed below.

yiyixuxu / n-grammer-flax
Implementation of N-Grammer in Flax
☆17 Updated 2 years ago
facebookresearch / coocmap
code for paper "Accessing higher dimensions for unsupervised word translation"
☆21 Updated 2 years ago
crypdick / timm-lr-scheduler-explorer
A dashboard for exploring timm learning rate schedulers
☆19 Updated 7 months ago
quinte22 / bumblebee
bumble bee transformer
☆14 Updated 4 years ago
YeonwooSung / GLOM
PyTorch implementation of GLOM
☆22 Updated 3 years ago
google-research / precondition
☆31 Updated last week
google-research-datasets / QAmeleon
QAmeleon introduces synthetic multilingual QA data using PaLM, a 540B large language model. This dataset was generated by prompt tuning P…
☆34 Updated last year
tchaton / pytorch2lightning
☆15 Updated 3 years ago
ahennequ / pytorch-custom-mma
☆29 Updated 2 years ago
frozentoad9 / CMST
Prabhupadavani: A Code-mixed Speech Translation Data for 25 languages
☆13 Updated 2 years ago
microsoft / GEM
☆24 Updated 4 years ago
kyegomez / MM1
PyTorch Implementation of the paper "MM1: Methods, Analysis & Insights from Multimodal LLM Pre-training"
☆24 Updated this week
allenai / smashed
SMASHED is a toolkit designed to apply transformations to samples in datasets, such as fields extraction, tokenization, prompting, batchi…
☆33 Updated last year
huggingface / pixparse
Pixel Parsing. A reproduction of OCR-free end-to-end document understanding models with open data
☆21 Updated 10 months ago
ChenghaoMou / embeddings
zero-vocab or low-vocab embeddings
☆18 Updated 2 years ago
orevaahia / magnet-tokenization
☆12 Updated 6 months ago
yikangshen / megablocks
☆20 Updated last year
qdrant / quaterion-models
The collection of building blocks for building fine-tunable metric learning models
☆32 Updated 2 months ago
NathanGodey / headless-lm
Training and evaluation code for the paper "Headless Language Models: Learning without Predicting with Contrastive Weight Tying" (https:/…
☆27 Updated last year
craffel / jax-tutorial
A tutorial on JAX (https://github.com/google/jax/)
☆46 Updated 6 years ago
toizzy / deep-subjecthood
Code for running the experiments in Deep Subjecthood: Higher Order Grammatical Features in Multilingual BERT
☆17 Updated last year
d2l-ai / d2l-book-colab
Colab notebooks for d2l-book
☆11 Updated 5 years ago
srush / g9py
☆19 Updated last year
data2ml / all-clip
Load any clip model with a standardized interface
☆21 Updated last year
lucidrains / tableformer-pytorch
Implementation of TableFormer, Robust Transformer Modeling for Table-Text Encoding, in Pytorch
☆39 Updated 3 years ago
vsahil / MIMETIC-2
Official Code for MIMETIC^2
☆12 Updated 7 months ago
seraphlabs-ca / MIM
Code for "MIM: Mutual Information Machine" paper.
☆16 Updated 2 years ago
philschmid / optimum-static-quantization
☆28 Updated 2 years ago
iKernels / transformers-lightning
A collection of Models, Datasets, DataModules, Callbacks, Metrics, Losses and Loggers to better integrate pytorch-lightning with transfor…
☆47 Updated 2 years ago
samiraabnar / Reflect
Official Implementation of "Transferring Inductive Biases Through Knowledge Distillation"
☆14 Updated 5 years ago