Nano-BERT is a straightforward, lightweight, and comprehensible custom implementation of BERT, inspired by the foundational "Attention Is All You Need" paper. The primary objective of this project is to distill the essence of transformers by stripping away complexity and unnecessary detail.
☆21 · Oct 19, 2023 · Updated 2 years ago
Alternatives and similar repositories for nano-BERT
Users that are interested in nano-BERT are comparing it to the libraries listed below.
- ☆11 · Dec 9, 2020 · Updated 5 years ago
- (NBCE) Naive Bayes-based Context Extension on ChatGLM-6b ☆15 · Jun 7, 2023 · Updated 2 years ago
- ☆13 · May 7, 2023 · Updated 3 years ago
- The Polaris datasets and benchmarks recipes ☆13 · May 26, 2025 · Updated 11 months ago
- Collaborative inference of latent diffusion via hivemind ☆12 · May 29, 2023 · Updated 2 years ago
- [ICML'25] The Price of Freedom: Exploring Expressivity and Runtime Tradeoffs in Equivariant Tensor Products ☆19 · Jul 16, 2025 · Updated 9 months ago
- This package will help you perform a multiple minimum Monte Carlo conformer search as described in Chang et al., 1989. It is built to be … ☆33 · Apr 23, 2026 · Updated 2 weeks ago
- CASP15 performance benchmarking of the state-of-the-art protein structure prediction methods ☆14 · Dec 13, 2023 · Updated 2 years ago
- A Java JNI wrapper for KenLM: Faster and Smaller Language Model Queries ☆14 · Oct 25, 2020 · Updated 5 years ago
- An implementation of the Equivariant Graph Neural Network (EGNN) layer type for DGL-PyTorch. ☆15 · Dec 27, 2022 · Updated 3 years ago
- Blog post ☆17 · Feb 16, 2024 · Updated 2 years ago
- Word Embeddings for Low Resource Languages: The Case of Buryat ☆10 · Mar 12, 2025 · Updated last year
- fast trainer for educational purposes ☆26 · Updated this week
- SIMD instructions for faster distance calculations. ☆25 · Apr 7, 2026 · Updated last month
- Minimalistic, hackable PyTorch implementation of SimSiam in ~400 lines. Achieves good performance on ImageNet with ResNet50. Features dis… ☆22 · Nov 25, 2024 · Updated last year
- Code for the paper "Secure Distributed Training at Scale" (ICML 2022) ☆16 · Feb 4, 2025 · Updated last year
- A custom Huggingface trainer which supports logging auxiliary losses returned by your model ☆15 · Jul 27, 2025 · Updated 9 months ago
- ☆14 · Jul 24, 2025 · Updated 9 months ago
- RND1: Scaling Diffusion Language Models ☆180 · Feb 22, 2026 · Updated 2 months ago
- Boolean question answering with multi-task learning, using large LM embeddings such as BERT and RoBERTa ☆18 · Aug 30, 2019 · Updated 6 years ago
- Jax / Haiku implementation of DimeNet++. ☆18 · Mar 31, 2022 · Updated 4 years ago
- A repository for reproducing experiments from the TxPert paper ☆27 · Mar 25, 2026 · Updated last month
- Improving Neural Text Generation with Reinforcement Learning ☆23 · Jan 13, 2021 · Updated 5 years ago
- ⛰️ PrexSyn: Efficient and Programmable Exploration of Synthesizable Chemical Space ☆50 · Apr 27, 2026 · Updated last week
- ☆30 · Mar 20, 2024 · Updated 2 years ago
- This repository contains the official implementation of the research paper: "Towards Training Large-Scale Pathology Foundation Models: fr… ☆38 · Jan 17, 2025 · Updated last year
- An implementation of ESM2 in Equinox+JAX ☆36 · Apr 20, 2026 · Updated 2 weeks ago
- Pre-training Multi-task Contrastive Learning Models for Scientific Literature Understanding (Findings of EMNLP'23) ☆11 · Aug 24, 2024 · Updated last year
- Atomistic machine learning models you can use everywhere for everything ☆38 · Updated this week
- Exploration into the Firefly algorithm in Pytorch ☆41 · Feb 14, 2025 · Updated last year
- Gradio Client in Rust. ☆30 · Apr 8, 2026 · Updated last month
- A minimal Notion blog starter boilerplate. Based on Travis Fischer's nextjs-notion-starter-kit. ☆18 · Mar 3, 2025 · Updated last year
- [ICLR'24] Symphony: Symmetry-Equivariant Point-Centered Spherical Harmonics for Molecule Generation ☆30 · Feb 24, 2025 · Updated last year
- ☆31 · Nov 14, 2024 · Updated last year
- Expert-Curated Oncology Reports to Advance Language Model Inference ☆34 · Apr 17, 2024 · Updated 2 years ago
- Python library for interacting with Verda (formerly DataCrunch) Public API ☆33 · Apr 17, 2026 · Updated 3 weeks ago
- A flow matching model for generating conformational ensembles of protein backbones. ☆43 · Apr 15, 2026 · Updated 3 weeks ago
- ☆27 · Aug 25, 2023 · Updated 2 years ago
- Code for the paper "Disentanglement by Nonlinear ICA with General Incompressible-flow Networks (GIN)" (2020) ☆34 · Sep 27, 2021 · Updated 4 years ago