OptimalFoundation / nadir
Nadir: Cutting-edge PyTorch optimizers for simplicity & composability!
☆14 · Updated last year
Alternatives and similar repositories for nadir
Users interested in nadir are comparing it to the libraries listed below.
- Exploring finetuning public checkpoints on filtered 8K sequences on the Pile · ☆116 · Updated 2 years ago
- Implementation of the specific Transformer architecture from PaLM (Scaling Language Modeling with Pathways) in JAX (Equinox framework) · ☆189 · Updated 3 years ago
- Large-scale 4D-parallelism pre-training for 🤗 transformers in Mixture of Experts (still a work in progress) · ☆87 · Updated last year
- Experiments with generating open-source language model assistants · ☆97 · Updated 2 years ago
- A case study of efficient training of large language models using commodity hardware. · ☆68 · Updated 3 years ago
- Amos optimizer with the JEstimator lib. · ☆82 · Updated last year
- ☆66 · Updated 3 years ago
- Repo for training MLMs, CLMs, or T5-type models on the OLM pretraining data; it should work with any Hugging Face text dataset. · ☆96 · Updated 2 years ago
- Experiments for efforts to train a new and improved T5 · ☆76 · Updated last year
- ☆50 · Updated last year
- ☆94 · Updated 2 years ago
- HomebrewNLP in JAX flavour for maintainable TPU training · ☆51 · Updated last year
- Various handy scripts to quickly set up new Linux and Windows sandboxes, containers, and WSL. · ☆40 · Updated 2 weeks ago
- ML/DL math and method notes · ☆64 · Updated last year
- Minimal PyTorch implementation of BM25 (with sparse tensors) · ☆104 · Updated last month
- ☆62 · Updated 3 years ago
- Minimal code to train a Large Language Model (LLM). · ☆172 · Updated 3 years ago
- Efficiently computing & storing token n-grams from large corpora · ☆26 · Updated last year
- Code for cleaning benchmark data out of your training data to help combat data snooping. · ☆27 · Updated 2 years ago
- Like picoGPT, but for BERT. · ☆51 · Updated 2 years ago
- A library for squeakily cleaning and filtering language datasets. · ☆49 · Updated 2 years ago
- Deep learning library implemented from scratch in NumPy. Mixtral, Mamba, LLaMA, GPT, ResNet, and other experiments. · ☆53 · Updated last year
- Our open-source implementation of MiniLMv2 (https://aclanthology.org/2021.findings-acl.188) · ☆61 · Updated 2 years ago
- Serialize JAX, Flax, Haiku, or Objax model params with 🤗 `safetensors` · ☆47 · Updated last year
- Some common Hugging Face transformers in maximal update parametrization (µP) · ☆87 · Updated 3 years ago
- Training and evaluation code for the paper "Headless Language Models: Learning without Predicting with Contrastive Weight Tying" (https://…) · ☆28 · Updated last year
- ☆34 · Updated 2 years ago
- ☆166 · Updated 2 years ago
- Transformer Grammars: Augmenting Transformer Language Models with Syntactic Inductive Biases at Scale, TACL (2022) · ☆133 · Updated 5 months ago
- MinT: Minimal Transformer Library and Tutorials · ☆259 · Updated 3 years ago