Pre-train BERT from scratch, with HuggingFace. Accompanies the blog post: sidsite.com/posts/bert-from-scratch
☆43 · May 20, 2025 · Updated 11 months ago
Alternatives and similar repositories for pretraining-BERT
Users that are interested in pretraining-BERT are comparing it to the libraries listed below.
- A single-line modification to any (dualizer-based) optimizer that allows the optimizer to adapt to the scale of the gradients as they cha… ☆19 · Jan 11, 2025 · Updated last year
- Implementation of VQ-VAE with a GPT-style sampler in the JAX and Haiku ecosystem. ☆11 · Nov 23, 2023 · Updated 2 years ago
- Efficient Language Model Training through Cross-Lingual and Progressive Transfer Learning ☆30 · Jan 25, 2023 · Updated 3 years ago
- CatMAE ☆14 · Dec 13, 2023 · Updated 2 years ago
- Probe how GPT-n performs on statutory reasoning ☆10 · Sep 17, 2024 · Updated last year
- Learning to Skip the Middle Layers of Transformers ☆17 · Aug 7, 2025 · Updated 8 months ago
- codes and plots for "Active-Dormant Attention Heads: Mechanistically Demystifying Extreme-Token Phenomena in LLMs" ☆11 · Dec 30, 2024 · Updated last year
- JAX implementation of the T5 model: Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer ☆24 · Jun 10, 2023 · Updated 2 years ago
- Jupyter notebooks from our weekly (or so) hackathons ☆11 · Dec 3, 2024 · Updated last year
- A wrapper library that provides access to Quik functionality from Python ☆12 · Feb 16, 2024 · Updated 2 years ago
- Source code repository for our EMNLP paper on cross-domain claim identification ☆14 · Oct 24, 2018 · Updated 7 years ago
- Calculate expected profit & loss for options ☆15 · Aug 5, 2019 · Updated 6 years ago
- ☆36 · Aug 23, 2023 · Updated 2 years ago
- A framework for steering MoE models by detecting and controlling behavior-linked experts. ☆33 · Sep 12, 2025 · Updated 7 months ago
- Patch for MPT-7B which allows using and training a LoRA ☆58 · May 20, 2023 · Updated 2 years ago
- Code accompanying VarGrad: A Low-Variance Gradient Estimator for Variational Inference ☆12 · Oct 12, 2020 · Updated 5 years ago
- Author's implementation of ReBRAC, a minimalist improvement upon TD3+BC ☆19 · Oct 22, 2023 · Updated 2 years ago
- ☆17 · Dec 23, 2021 · Updated 4 years ago
- Count tokens for OpenAI accurately with support for all parameters like name, functions. ☆23 · May 14, 2024 · Updated last year
- ☆11 · Oct 21, 2017 · Updated 8 years ago
- ☆11 · Jul 25, 2021 · Updated 4 years ago
- EdX course from MIT on machine learning 6.86x ☆11 · Dec 16, 2020 · Updated 5 years ago
- A Supabase MCP server compatible with cursor ☆20 · Feb 13, 2025 · Updated last year
- ☆10 · Dec 4, 2018 · Updated 7 years ago
- Deribit bot to Delta Hedge Strategy. ☆13 · Oct 2, 2022 · Updated 3 years ago
- Tree-Invent: A novel molecular generative model constrained with topological tree ☆14 · Jul 26, 2023 · Updated 2 years ago
- Scalable In-Memory Acceleration With Mesh: Device, Circuits, Architecture, and Algorithm ☆16 · Oct 11, 2020 · Updated 5 years ago
- An automatic Gaussian process classifier. ☆13 · May 28, 2016 · Updated 9 years ago
- ☆10 · Jan 23, 2019 · Updated 7 years ago
- Resources and documentation for UK Biobank to OMOP CDM v5.3.1 conversion ☆10 · Oct 20, 2020 · Updated 5 years ago
- The official repository for AdaMuon ☆37 · Aug 27, 2025 · Updated 8 months ago
- ☆17 · Feb 24, 2025 · Updated last year
- ☆19 · Jun 10, 2024 · Updated last year
- A fork of textgen that kept some things like Exllama and old GPTQ. ☆22 · Aug 20, 2024 · Updated last year
- Unofficial Implementation of Selective Attention Transformer ☆20 · Oct 31, 2024 · Updated last year
- ☆13 · Aug 15, 2024 · Updated last year
- A hybrid retrieval-based question answering system built on BERT, Faiss, and ElasticSearch. ☆24 · May 29, 2025 · Updated 11 months ago
- MeTube browser extension for sending links to YouTube videos from the context menu ☆27 · Apr 2, 2026 · Updated last month
- This repository is a collection of legal instruction datasets ☆27 · Jul 12, 2024 · Updated last year