OptimalFoundation / nadir
Nadir: Cutting-edge PyTorch optimizers for simplicity & composability!
☆14 Updated last year
Alternatives and similar repositories for nadir
Users that are interested in nadir are comparing it to the libraries listed below
- Large scale 4D parallelism pre-training for 🤗 transformers in Mixture of Experts *(still work in progress)* ☆87 Updated last year
- A library to create and manage configuration files, especially for machine learning projects. ☆79 Updated 3 years ago
- Exploring finetuning public checkpoints on filter 8K sequences on Pile ☆115 Updated 2 years ago
- ☆166 Updated 2 years ago
- Deep learning library implemented from scratch in numpy. Mixtral, Mamba, LLaMA, GPT, ResNet, and other experiments. ☆52 Updated last year
- This repository contains code for cleaning your training data of benchmark data to help combat data snooping. ☆27 Updated 2 years ago
- ☆66 Updated 3 years ago
- Implementation of the specific Transformer architecture from PaLM - Scaling Language Modeling with Pathways - in Jax (Equinox framework) ☆188 Updated 3 years ago
- Like picoGPT but for BERT. ☆50 Updated 2 years ago
- ☆62 Updated 3 years ago
- HomebrewNLP in JAX flavour for maintainable TPU-Training ☆51 Updated last year
- minimal pytorch implementation of bm25 (with sparse tensors) ☆104 Updated last year
- Serialize JAX, Flax, Haiku, or Objax model params with 🤗 `safetensors` ☆47 Updated last year
- Various transformers for FSDP research ☆39 Updated 2 years ago
- Code repository for the c-BTM paper ☆107 Updated 2 years ago
- ML/DL Math and Method notes ☆64 Updated last year
- Amos optimizer with JEstimator lib. ☆82 Updated last year
- A library for squeakily cleaning and filtering language datasets. ☆47 Updated 2 years ago
- Experiments for efforts to train a new and improved t5 ☆75 Updated last year
- ☆94 Updated 2 years ago
- some common Huggingface transformers in maximal update parametrization (µP) ☆85 Updated 3 years ago
- This project shows how to derive the total number of training tokens from a large text dataset from 🤗 datasets with Apache Beam and Data… ☆27 Updated 2 years ago
- ☆20 Updated 2 years ago
- gzip Predicts Data-dependent Scaling Laws ☆34 Updated last year
- Our open source implementation of MiniLMv2 (https://aclanthology.org/2021.findings-acl.188) ☆61 Updated 2 years ago
- Used for adaptive human in the loop evaluation of language and embedding models. ☆308 Updated 2 years ago
- Efficiently computing & storing token n-grams from large corpora ☆26 Updated last year
- ☆49 Updated last year
- A case study of efficient training of large language models using commodity hardware. ☆68 Updated 3 years ago
- Genalog is an open source, cross-platform python package allowing generation of synthetic document images with custom degradations and te… ☆43 Updated last year