StepanTita / nano-BERT
Nano-BERT is a straightforward, lightweight, and comprehensible custom implementation of BERT, inspired by the foundational "Attention Is All You Need" paper. The primary objective of this project is to distill the essence of transformers by stripping away complexity and unnecessary detail.
☆20 · Oct 19, 2023 · Updated 2 years ago
Alternatives and similar repositories for nano-BERT
Users that are interested in nano-BERT are comparing it to the libraries listed below
- ☆11 · Dec 9, 2020 · Updated 5 years ago
- Seminars on optimization methods ☆32 · Nov 2, 2021 · Updated 4 years ago
- This package will help you perform a multiple minimum Monte Carlo conformer search as described in Chang et al., 1989. It is built to be … ☆32 · Jan 22, 2026 · Updated 3 weeks ago
- fast approximation for levenshtein distances ☆11 · Jan 15, 2018 · Updated 8 years ago
- The Polaris datasets and benchmarks recipes ☆12 · May 26, 2025 · Updated 8 months ago
- ☆11 · Oct 15, 2023 · Updated 2 years ago
- Pre-training Multi-task Contrastive Learning Models for Scientific Literature Understanding (Findings of EMNLP'23) ☆11 · Aug 24, 2024 · Updated last year
- Text Services Framework Sample Code (Link Broken) ☆12 · Mar 25, 2018 · Updated 7 years ago
- Python toolbox to analyse fracture networks for digitalized rock outcrops. ☆13 · Jul 24, 2025 · Updated 6 months ago
- ☆14 · Jul 12, 2022 · Updated 3 years ago
- Word Embeddings for Low Resource Languages: The Case of Buryat ☆10 · Mar 12, 2025 · Updated 11 months ago
- Collaborative inference of latent diffusion via hivemind ☆12 · May 29, 2023 · Updated 2 years ago
- ⛰️ PrexSyn: Efficient and Programmable Exploration of Synthesizable Chemical Space ☆34 · Feb 2, 2026 · Updated 2 weeks ago
- A Java JNI wrapper for KenLM: Faster and Smaller Language Model Queries ☆14 · Oct 25, 2020 · Updated 5 years ago
- A Qwen .5B reasoning model trained on OpenR1-Math-220k ☆14 · Oct 11, 2025 · Updated 4 months ago
- A custom Huggingface trainer which supports logging auxiliary losses returned by your model ☆15 · Jul 27, 2025 · Updated 6 months ago
- LLM Assisted Geology Descriptions of Arbitrary Locations = LAGDAL ☆14 · Jun 23, 2024 · Updated last year
- An implementation of the Equivariant Graph Neural Network (EGNN) layer type for DGL-PyTorch. ☆15 · Dec 27, 2022 · Updated 3 years ago
- A repository for reproducing experiments from the TxPert paper ☆25 · Jan 9, 2026 · Updated last month
- [ICML 2025] Repurposing pre-trained score-based generative models for transition path sampling by minimizing the Onsager-Machlup (OM) act… ☆26 · Jan 14, 2026 · Updated last month
- Minimalistic, hackable PyTorch implementation of SimSiam in ~400 lines. Achieves good performance on ImageNet with ResNet50. Features dis… ☆21 · Nov 25, 2024 · Updated last year
- Pack python venv in one ☆16 · Dec 15, 2025 · Updated 2 months ago
- ☆14 · Jul 24, 2025 · Updated 6 months ago
- RND1: Scaling Diffusion Language Models ☆172 · Jan 12, 2026 · Updated last month
- Java port of wolfgarbe/PruningRadixTrie ☆16 · Jun 29, 2021 · Updated 4 years ago
- Unofficial implementation of the MDCSpell paper ☆18 · Oct 16, 2022 · Updated 3 years ago
- Code for the paper "Secure Distributed Training at Scale" (ICML 2022) ☆16 · Feb 4, 2025 · Updated last year
- ☆17 · Sep 24, 2024 · Updated last year
- Building Blocks for Equivariant Neural Networks in e3nn and PyTorch 2.0 ☆18 · Nov 16, 2025 · Updated 3 months ago
- Jax / Haiku implementation of DimeNet++. ☆18 · Mar 31, 2022 · Updated 3 years ago
- Maze navigation with MLM-U ☆17 · Dec 21, 2024 · Updated last year
- Blog post ☆17 · Feb 16, 2024 · Updated 2 years ago
- Official awesome-list of Neon Postgres Database Starters & Resources ⚡️ ☆26 · Jul 29, 2024 · Updated last year
- Zero Shot Molecular Generation via Similarity Kernels ☆28 · Aug 27, 2025 · Updated 5 months ago
- Atomistic machine learning models you can use everywhere for everything ☆33 · Updated this week
- CRF (Conditional Random Field) Layer for TensorFlow 1.X with many powerful functions ☆15 · Jan 3, 2020 · Updated 6 years ago
- ☆23 · Aug 1, 2023 · Updated 2 years ago
- ☆22 · Aug 9, 2024 · Updated last year
- MESS: Modern Electronic Structure Simulations ☆20 · Sep 24, 2024 · Updated last year