barneyhill / minBERT
A minimal PyTorch implementation of BERT (Bidirectional Encoder Representations from Transformers)
☆11 Updated 2 years ago
Alternatives and similar repositories for minBERT
Users that are interested in minBERT are comparing it to the libraries listed below
- A minimal PyTorch implementation of probabilistic diffusion models for 2D datasets. ☆980 Updated last year
- Implementation of Perceiver AR, Deepmind's new long-context attention network based on Perceiver architecture, in Pytorch ☆94 Updated 2 years ago
- The AdEMAMix Optimizer: Better, Faster, Older. ☆186 Updated last year
- List of academic resources on Multimodal ML for Music ☆297 Updated 2 years ago
- TorchDR - PyTorch Dimensionality Reduction ☆186 Updated this week
- ☆87 Updated 2 years ago
- Efficient implementation (and explorations) into polar coordinate positional embedding (PoPE) - from Gopalakrishnan et al. under Schmidhu… ☆51 Updated 2 weeks ago
- 🧬 Generative modeling of regulatory DNA sequences with diffusion probabilistic models 💨 ☆462 Updated last month
- This repository contains the implementation of **Alternators**, a novel family of generative models for time-dependent data. ☆35 Updated 8 months ago
- Discovering Interpretable Features in Protein Language Models via Sparse Autoencoders ☆278 Updated 3 months ago
- A minimal implementation of diffusion models for text generation ☆408 Updated 2 years ago
- A simple implementation of Bayesian Flow Networks (BFN) ☆241 Updated 2 years ago
- Repository for StripedHyena, a state-of-the-art beyond Transformer architecture ☆409 Updated last year
- Using JAX to generate piano music as MIDI ☆39 Updated 2 years ago
- Implementation of the proposed minGRU in Pytorch ☆319 Updated 2 months ago
- ☆215 Updated last year
- Implementation of Enformer, Deepmind's attention network for predicting gene expression, in Pytorch ☆557 Updated 7 months ago
- (Unofficial) Implementation of dilated attention from "LongNet: Scaling Transformers to 1,000,000,000 Tokens" (https://arxiv.org/abs/2307… ☆52 Updated 2 years ago
- Implementation of the proposed Spline-Based Transformer from Disney Research ☆105 Updated last year
- ☆38 Updated 3 months ago
- The boundary of neural network trainability is fractal ☆222 Updated 2 years ago
- Explorations into whether a transformer with RL can direct a genetic algorithm to converge faster ☆71 Updated 8 months ago
- Attempt to make multiple residual streams from Bytedance's Hyper-Connections paper accessible to the public ☆172 Updated last week
- Official JAX implementation of xLSTM including fast and efficient training and inference code. 7B model available at https://huggingface.… ☆105 Updated last year
- Implementation of AlphaGenome, Deepmind's updated genomic attention model ☆95 Updated last week
- A novel diffusion-based model for synthesizing long-context, high-fidelity music efficiently. ☆195 Updated 2 years ago
- ☆168 Updated 3 months ago
- Just some miscellaneous utility functions / decorators / modules related to Pytorch and Accelerate to help speed up implementation of new… ☆126 Updated last year
- This is the PyTorch implementation of the Universal Source Separation with Weakly labelled Data. ☆367 Updated 2 years ago
- A simple, hackable text-to-speech system in PyTorch and MLX ☆187 Updated 6 months ago