sradc / pretraining-BERTLinks

Pre-train BERT from scratch, with HuggingFace. Accompanies the blog post: sidsite.com/posts/bert-from-scratch

☆41

Alternatives and similar repositories for pretraining-BERT

Users that are interested in pretraining-BERT are comparing it to the libraries listed below

Sorting:

abacaj / train-with-fsdp
☆92Updated last year
KhoomeiK / complexity-scaling
gzip Predicts Data-dependent Scaling Laws
☆35Updated last year
NousResearch / StripedHyenaTrainer
☆61Updated last year
LiibanMo / scikit-jax
Your favourite classical machine learning algos on the GPU/TPU
☆20Updated 5 months ago
Muhtasham / summarization-eval
📝 Reference-Free automatic summarization evaluation with potential hallucination detection
☆100Updated last year
minosvasilias / simple_grpo
Simple GRPO scripts and configurations.
☆58Updated 4 months ago
HomebrewML / HomebrewNLP-torch
A case study of efficient training of large language models using commodity hardware.
☆68Updated 2 years ago
AblateIt / finetune-study
Comprehensive analysis of difference in performance of QLora, Lora, and Full Finetunes.
☆82Updated last year
CarperAI / treasure_trove
☆22Updated last year
ChrisHayduk / QLoRA-for-MLM
QLoRA for Masked Language Modeling
☆22Updated last year
joey00072 / microjax
Jax like function transformation engine but micro, microjax
☆32Updated 8 months ago
warner-benjamin / commented-transformers
Highly commented implementations of Transformers in PyTorch
☆136Updated last year
hundredblocks / large-model-parallelism
Functional local implementations of main model parallelism approaches
☆95Updated 2 years ago
Pleias / Quest-Best-Tokens
An introduction to LLM Sampling
☆78Updated 6 months ago
jfpuget / ARC-AGI-Challenge-2024
☆55Updated 7 months ago
dvruette / barrel-rec-pytorch
☆53Updated last year
Alignment-Lab-AI / datagen
a pipeline for using api calls to agnostically convert unstructured data into structured training data
☆30Updated 9 months ago
stas00 / ml-ways
ML/DL Math and Method notes
☆61Updated last year
SamsungSAILMontreal / AnyMolGenCritic
☆21Updated 2 months ago
krypticmouse / matryoshka-representation-learning
PyTorch implementation for MRL
☆18Updated last year
microsoft / mutransformers
some common Huggingface transformers in maximal update parametrization (µP)
☆81Updated 3 years ago
okarthikb / state-space-models
☆27Updated 11 months ago
kmkolasinski / nano-umap
Simplified implementation of UMAP like dimensionality reduction algorithm
☆49Updated 7 months ago
srush / Tensor-Puzzles-Penzai
☆20Updated last year
drisspg / transformer_nuggets
A place to store reusable transformer components of my own creation or found on the interwebs
☆56Updated last week
lucidrains / gateloop-transformer
Implementation of GateLoop Transformer in Pytorch and Jax
☆89Updated last year
epfml / DenseFormer
☆81Updated last year
xjdr-alt / muzero_sketch
☆38Updated 11 months ago
lucaslingle / mu_transformer
Transformer with Mu-Parameterization, implemented in Jax/Flax. Supports FSDP on TPU pods.
☆30Updated 3 weeks ago
crowsonkb / LDLM
Latent Diffusion Language Models
☆68Updated last year