Pre-train BERT from scratch, with HuggingFace. Accompanies the blog post: sidsite.com/posts/bert-from-scratch
☆43 · May 20, 2025 · Updated 10 months ago
Alternatives and similar repositories for pretraining-BERT
Users who are interested in pretraining-BERT are comparing it to the repositories listed below.
- ☆14 · Dec 2, 2024 · Updated last year
- ☆34 · Apr 23, 2023 · Updated 2 years ago
- Code for Augment & Reduce, a scalable stochastic algorithm for large categorical distributions ☆10 · May 16, 2018 · Updated 7 years ago
- A framework for steering MoE models by detecting and controlling behavior-linked experts. ☆31 · Sep 12, 2025 · Updated 6 months ago
- codes and plots for "Active-Dormant Attention Heads: Mechanistically Demystifying Extreme-Token Phenomena in LLMs" ☆10 · Dec 30, 2024 · Updated last year
- JAX implementation of the T5 model: Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer ☆24 · Jun 10, 2023 · Updated 2 years ago
- Jupyter notebooks from our weekly (or so) hackathons ☆11 · Dec 3, 2024 · Updated last year
- Code for "Exponential Family Estimation via Adversarial Dynamics Embedding" (NeurIPS 2019) ☆14 · Nov 26, 2019 · Updated 6 years ago
- ☆34 · Aug 23, 2023 · Updated 2 years ago
- Patch for MPT-7B which allows using and training a LoRA ☆58 · May 20, 2023 · Updated 2 years ago
- Author's implementation of ReBRAC, a minimalist improvement upon TD3+BC ☆19 · Oct 22, 2023 · Updated 2 years ago
- Code for Semi-crowdsourced Clustering with Deep Generative Models ☆12 · Dec 9, 2022 · Updated 3 years ago
- Code accompanying VarGrad: A Low-Variance Gradient Estimator for Variational Inference ☆12 · Oct 12, 2020 · Updated 5 years ago
- This repository contains the source code, models and data files for the work titled: "Unsupervised Image Style Embeddings for Retrieval a… ☆13 · May 29, 2021 · Updated 4 years ago
- A large-image collection explorer and fast classification tool ☆24 · Jul 12, 2022 · Updated 3 years ago
- ☆12 · Sep 5, 2021 · Updated 4 years ago
- ☆11 · Oct 21, 2017 · Updated 8 years ago
- Posterior with interesting shapes from actually used models ☆13 · Feb 10, 2025 · Updated last year
- Knowledge distillation on DNABERT (DistilBERT and MiniLM techniques) for promoter identification. ☆25 · Nov 3, 2022 · Updated 3 years ago
- Some examples and tests with LicheeRV Nano ☆30 · Aug 9, 2025 · Updated 7 months ago
- PyTorch utilities for ML, specifically speech ☆13 · Jan 30, 2024 · Updated 2 years ago
- ☆10 · Dec 4, 2018 · Updated 7 years ago
- ☆18 · Nov 23, 2023 · Updated 2 years ago
- Tree-Invent: A novel molecular generative model constrained with topological tree ☆13 · Jul 26, 2023 · Updated 2 years ago
- Scalable In-Memory Acceleration With Mesh: Device, Circuits, Architecture, and Algorithm ☆16 · Oct 11, 2020 · Updated 5 years ago
- An attempt to create a free PROFINET daemon ☆16 · Oct 24, 2018 · Updated 7 years ago
- ☆12 · Jun 13, 2021 · Updated 4 years ago
- Small repository for my video on LoRA ☆16 · May 14, 2023 · Updated 2 years ago
- The implementation of FALCON: An ML Framework for Fully Automated Layout-Constrained Analog Circuit Design (NeurIPS 2025) ☆35 · Nov 19, 2025 · Updated 4 months ago
- ☆12 · Apr 20, 2023 · Updated 2 years ago
- ☆17 · Feb 24, 2025 · Updated last year
- ☆19 · Jun 10, 2024 · Updated last year
- A fork of textgen that kept some things like Exllama and old GPTQ. ☆22 · Aug 20, 2024 · Updated last year
- Unofficial Implementation of Selective Attention Transformer ☆21 · Oct 31, 2024 · Updated last year
- code supplement for variational boosting (https://arxiv.org/abs/1611.06585) ☆11 · Jul 24, 2017 · Updated 8 years ago
- ☆13 · Mar 10, 2026 · Updated 2 weeks ago
- ☆13 · Aug 15, 2024 · Updated last year
- Efficient 3bit/4bit quantization of LLaMA models ☆18 · May 18, 2023 · Updated 2 years ago
- Jax like function transformation engine but micro, microjax ☆34 · Oct 25, 2024 · Updated last year