StepanTita/nano-BERT

Nano-BERT is a straightforward, lightweight and comprehensible custom implementation of BERT, inspired by the foundational "Attention is All You Need" paper. The primary objective of this project is to distill the essence of transformers by simplifying the complexities and unnecessary details.

☆21

Alternatives and similar repositories for nano-BERT

Users that are interested in nano-BERT are comparing it to the libraries listed below.

amkatrutsa / seminars-fivt
Seminars on optimization methods
☆32 · Nov 2, 2021 · Updated 4 years ago
lamhoangtung / SynthText-Japanese
Code for generating synthetic Japanese text images as described in "Synthetic Data for Text Localisation in Natural Images", Ankush Gupta…
☆13 · Aug 30, 2019 · Updated 6 years ago
wcarvalho / jupyter_notebooks
Jupyter notebooks for testing concepts
☆11 · Nov 9, 2017 · Updated 8 years ago
tianchiguaixia / ocr_recognition
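Fine-tunes Alibaba's open-source text detection model, using OCR results returned by the 合合 (Hehe) recognition service as the initial training data, so that the model better fits the specific scenario of 10,000 images and achieves higher text recognition accuracy.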
☆10 · Dec 9, 2024 · Updated last year
benldr / JPruningRadixTrie
Java port of wolfgarbe/PruningRadixTrie
☆16 · Jun 29, 2021 · Updated 5 years ago
tianchiguaixia / qwen1.5-ner
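Fine-tunes the Qwen1.5-0.5B-Chat model for general information extraction, with three aims: to test how a generative approach compares with extractive NER; to give beginners a simple fine-tuning workflow with as little code as possible; and to handle data formatting for large-model training.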
☆14 · Sep 6, 2024 · Updated last year
dwcoates / leven-squash
fast approximation for levenshtein distances
☆11 · Jan 15, 2018 · Updated 8 years ago
learning-at-home / collaborative-latent-diffusion
Collaborative inference of latent diffusion via hivemind
☆12 · May 29, 2023 · Updated 3 years ago
Parquery / lanms
☆17 · Oct 18, 2019 · Updated 6 years ago
srush / mamba-scans
Blog post
☆17 · Feb 16, 2024 · Updated 2 years ago
balintmate / boltzmann-interpolations
☆13 · Oct 15, 2023 · Updated 2 years ago
vaskonov / burvec
Word Embeddings for Low Resource Languages: The Case of Buryat
☆10 · Mar 12, 2025 · Updated last year
fpganow / CryptoCurrencies
LabVIEW FPGA Framework for Implementing Cryptomining Algorithms
☆19 · Jul 22, 2018 · Updated 8 years ago
polaris-hub / polaris-recipes
The Polaris datasets and benchmarks recipes
☆15 · May 26, 2025 · Updated last year
prescient-design / e3tools
Building Blocks for Equivariant Neural Networks in e3nn and PyTorch 2.0
☆20 · Nov 16, 2025 · Updated 8 months ago
yandex-research / btard
Code for the paper "Secure Distributed Training at Scale" (ICML 2022)
☆16 · Feb 4, 2025 · Updated last year
DavinciEvans / minutes-GPT
Minutes GPT is a GPT tool that helps you quickly turn meeting recordings into minutes.
☆17 · Nov 20, 2023 · Updated 2 years ago
alexeykarnachev / dialogs_data_parsers
Russian dialog datasets parsers and crawlers.
☆15 · Sep 6, 2021 · Updated 4 years ago
naba89 / custom_hf_trainer
A custom Huggingface trainer which supports logging auxiliary losses returned by your model
☆15 · Jul 27, 2025 · Updated last year
IBM / superglue-mtl
Boolean question answering with multi-task learning, using large LM embeddings such as BERT and RoBERTa
☆18 · Aug 30, 2019 · Updated 6 years ago
cshanjiewu / Algorithm_Interview_Notes-Chinese
2018/2019 / campus recruitment / spring recruitment / autumn recruitment / algorithms / Machine Learning / Deep Learning / Natural Language Processing (NLP) / C / C++ / Python / interview notes
☆13 · Oct 6, 2018 · Updated 7 years ago
WadhwaniAI / pest-monitoring
☆14 · Jul 24, 2025 · Updated last year
hsjang0 / boltz-as-FM
☆16 · Feb 20, 2026 · Updated 5 months ago
MichaelChatzidakis / Mn_Classifier_CNNs
A convolutional neural network classifier for determining the oxidation state of Manganese
☆23 · Jan 6, 2019 · Updated 7 years ago
jiaor17 / EPT
[Nature Communications] The implementation for the paper "An equivariant pretrained transformer for unified 3D molecular representation l…
☆15 · Jun 25, 2026 · Updated last month
illagrenan / tensorflow-serving-api-python3
**UNOFFICIAL and redistributed** TensorFlow Serving API libraries for Python3. See DEPRECATION WARNING in README.
☆28 · Aug 16, 2018 · Updated 7 years ago
erwallace / neural-optimiser
Batched optimisation algorithms for neural network potential driven molecular dynamics.
☆17 · Nov 27, 2025 · Updated 8 months ago
princefr / EfficientNet-Light-Head-RCNN
Person Detection using the EfficientNet B0 and Light Head RCNN running at 12 FPS
☆23 · Sep 20, 2019 · Updated 6 years ago
vub-ai-lab / bdpi
Sample-Efficient Reinforcement Learning with Bootstrapped Dual Policy Iteration
☆25 · Sep 9, 2019 · Updated 6 years ago
tummfm / jax-dimenet
Jax / Haiku implementation of DimeNet++.
☆18 · Mar 31, 2022 · Updated 4 years ago
Bhattacharya-Lab / CASP15
CASP15 performance benchmarking of the state-of-the-art protein structure prediction methods
☆16 · Dec 13, 2023 · Updated 2 years ago
prescient-design / jamun
Bridging Smoothed Molecular Dynamics and Score-Based Learning for Conformational Ensembles
☆20 · May 16, 2026 · Updated 2 months ago
turnerdan / joethecorpusrogan
A corpus of speech from the Joe Rogan Experience podcast, consisting of 8.43 million words. It includes aligned TextGrids with phonetic a…
☆21 · Jan 26, 2020 · Updated 6 years ago
vklabmipt / implicit-unlikelihood-training
Improving Neural Text Generation with Reinforcement Learning
☆23 · Jan 13, 2021 · Updated 5 years ago
atomicarchitects / PriceofFreedom
[ICML'25] The Price of Freedom: Exploring Expressivity and Runtime Tradeoffs in Equivariant Tensor Products
☆19 · Jul 16, 2025 · Updated last year
rdilip / apt
Adaptive tokenization for proteins
☆20 · Mar 4, 2026 · Updated 4 months ago
usmanm / mlx-esm
MLX implementation of Meta's ESM-1 protein language model
☆21 · Apr 17, 2024 · Updated 2 years ago
yzhan238 / EvMine
The source code used for paper "Unsupervised Key Event Detection from Massive Text Corpora", published in KDD 2022.
☆22 · Jul 15, 2023 · Updated 3 years ago
ML4MolSim / dit_mc
Official implementation for paper: Sampling 3D Molecular Conformers with Diffusion Transformers (NeurIPS 2025)
☆19 · Feb 3, 2026 · Updated 5 months ago