jaymody / seq2seq-polynomial
Seq2seq transformer for polynomial expansion in PyTorch.
☆28 Updated 4 years ago
Alternatives and similar repositories for seq2seq-polynomial
Users who are interested in seq2seq-polynomial are comparing it to the libraries listed below.
- Shared code for training sentence embeddings with Flax / JAX ☆27 Updated 3 years ago
- Helper scripts and notes that were used while porting various nlp models ☆46 Updated 3 years ago
- Proof-of-concept of global switching between numpy/jax/pytorch in a library. ☆18 Updated last year
- Minimalist BERT implementation assignment for CS11-711 ☆83 Updated 2 years ago
- Highly specialized crate to parse and use `google/sentencepiece`'s precompiled_charsmap in `tokenizers` ☆19 Updated 3 years ago
- Nadir: Cutting-edge PyTorch optimizers for simplicity & composability! 🔥🚀💻 ☆14 Updated last year
- Functional local implementations of main model parallelism approaches ☆95 Updated 2 years ago
- ☆17 Updated 2 years ago
- MinT: Minimal Transformer Library and Tutorials ☆256 Updated 2 years ago
- ☆46 Updated 5 years ago
- Annotations of the interesting ML papers I read ☆242 Updated last week
- Neural information retrieval / Semantic search / Bi-encoders ☆170 Updated last year
- ☆18 Updated this week
- ☆179 Updated last year
- Resources from the EleutherAI Math Reading Group ☆53 Updated 4 months ago
- Transformer Grammars: Augmenting Transformer Language Models with Syntactic Inductive Biases at Scale, TACL (2022) ☆127 Updated last week
- Lightning template for easy prototyping⚡️ ☆13 Updated 2 years ago
- Module 0 - Fundamentals ☆103 Updated 10 months ago
- A fast implementation of T5/UL2 in PyTorch using Flash Attention ☆105 Updated 3 months ago
- This project shows how to derive the total number of training tokens from a large text dataset from 🤗 datasets with Apache Beam and Data… ☆27 Updated 2 years ago
- Tutorial to pretrain & fine-tune a 🤗 Flax T5 model on a TPUv3-8 with GCP ☆58 Updated 2 years ago
- Superfast CUDA implementation of Word2Vec and Latent Dirichlet Allocation (LDA) ☆45 Updated 4 years ago
- An assignment for CMU CS11-711 Advanced NLP, building NLP systems from scratch ☆171 Updated 2 years ago
- Some notebooks for NLP ☆204 Updated last year
- Code associated to papers on superposition (in ML interpretability) ☆28 Updated 2 years ago
- ☆100 Updated 2 years ago
- GHOSTS dataset ☆38 Updated last year
- On the Stability of Fine-tuning BERT: Misconceptions, Explanations, and Strong Baselines ☆136 Updated last year
- JAX/Flax implementation of 'Attention Is All You Need' by Vaswani et al. (https://arxiv.org/abs/1706.03762) ☆15 Updated 3 years ago
- Code accompanying our papers on the "Generative Distributional Control" framework ☆118 Updated 2 years ago