barneyhill / minBERT
A minimal PyTorch implementation of BERT (Bidirectional Encoder Representations from Transformers)
☆10 Updated 2 years ago
Alternatives and similar repositories for minBERT:
Users who are interested in minBERT are comparing it to the libraries listed below.
- Discovering Interpretable Features in Protein Language Models via Sparse Autoencoders ☆167 Updated 2 months ago
- Implementation of Perceiver AR, Deepmind's new long-context attention network based on Perceiver architecture, in Pytorch ☆87 Updated 2 years ago
- A nano protein structure prediction model based on DeepMind's AlphaFold paper ☆24 Updated 10 months ago
- Explorations into whether a transformer with RL can direct a genetic algorithm to converge faster ☆64 Updated this week
- Implementation of the proposed minGRU in Pytorch ☆285 Updated last month
- Bare-bones implementations of some generative models in Jax: diffusion, normalizing flows, consistency models, flow matching, (beta)-VAEs… ☆128 Updated last year
- A minimal implementation of diffusion models for text generation ☆360 Updated last year
- LayerNorm(SmallInit(Embedding)) in a Transformer to improve convergence ☆60 Updated 3 years ago
- My own attempt at a long context genomics model, leveraging recent advances in long context attention modeling (Flash Attention + other h… ☆52 Updated last year
- Bi-Directional Equivariant Long-Range DNA Sequence Modeling ☆178 Updated 3 months ago
- GFlowNet library specialized for graph & molecular data ☆238 Updated last week
- gRNAde: Geometric Deep Learning for 3D RNA inverse design (ICLR 2025 Spotlight) ☆192 Updated last month
- Reinforcement Learning example in Nim, playing tic tac toe. Based off original C version from the great Antirez ☆13 Updated 2 weeks ago
- TorchDR - PyTorch Dimensionality Reduction ☆107 Updated 2 months ago
- an open source reproduction of NVIDIA's nGPT (Normalized Transformer with Representation Learning on the Hypersphere) ☆96 Updated last month
- A toolbox that provides hackable building blocks for generic 1D/2D/3D UNets, in PyTorch. ☆85 Updated last year
- A modular, easy to extend GFlowNet library ☆260 Updated this week
- Getting crystal-like representations with harmonic loss ☆182 Updated 2 weeks ago
- Implementation snake game based on Diffusion model ☆90 Updated 3 months ago
- Using JAX to generate piano music as MIDI ☆39 Updated last year
- The AdEMAMix Optimizer: Better, Faster, Older. ☆180 Updated 7 months ago
- A curated list of resources about generative flow networks (GFlowNets). ☆458 Updated 6 months ago
- A repository for log-time feedforward networks ☆221 Updated last year
- A practical implementation of GradNorm, Gradient Normalization for Adaptive Loss Balancing, in Pytorch ☆91 Updated last year
- Implementation of Enformer, Deepmind's attention network for predicting gene expression, in Pytorch ☆474 Updated 6 months ago
- Repository for StripedHyena, a state-of-the-art beyond Transformer architecture ☆363 Updated last year
- (Unofficial) Implementation of dilated attention from "LongNet: Scaling Transformers to 1,000,000,000 Tokens" (https://arxiv.org/abs/2307… ☆50 Updated last year