s-smits/modernbert-finetune

Fine-tune ModernBERT with custom tokenizers, curriculum learning, and next-gen optimizers.

☆74

Alternatives and similar repositories for modernbert-finetune

Users who are interested in modernbert-finetune are comparing it to the repositories listed below.

taylorai / onnx_embedding_models
utilities for loading and running text embeddings with onnx
☆46 · Aug 16, 2025 · Updated 11 months ago
JHU-CLSP / mmBERT
A massively multilingual modern encoder language model
☆145 · Jan 20, 2026 · Updated 5 months ago
neavo / KeywordGachaModel
☆17 · Jan 31, 2025 · Updated last year
MinishLab / tokenlearn
Pre-train Static Word Embeddings
☆107 · Jun 9, 2026 · Updated last month
Knowledgator / FlashDeBERTa
Truly flash implementation of DeBERTa disentangled attention mechanism.
☆90 · Feb 10, 2026 · Updated 5 months ago
hieudx149 / X-RetroMAE
Code Roberta version of RetroMAE: Pre-Training Retrieval-oriented Language Models Via Masked Auto-Encoder
☆10 · Mar 16, 2023 · Updated 3 years ago
enjalot / latent-data-modal
Using modal.com to process FineWeb-edu data
☆20 · Apr 11, 2026 · Updated 3 months ago
AnswerDotAI / ModernBERT
Bringing BERT into modernity via both architecture changes and scaling
☆1,700 · Mar 1, 2026 · Updated 4 months ago
QunBB / bert-pretraining
BERT & RoBERTa pre-training code, implemented in both TensorFlow and PyTorch versions
☆13 · Feb 8, 2023 · Updated 3 years ago
noe-eva / NOAH-Corpus
NOAH's Corpus: Part-of-Speech Tagging for Swiss German
☆12 · Jan 6, 2023 · Updated 3 years ago
anpaure / cp_eval
Tiny evaluation of leading LLMs on competitive programming problems
☆14 · Apr 10, 2026 · Updated 3 months ago
illuin-tech / contextual-embeddings
Model implementation for the contextual embeddings project
☆47 · Jun 2, 2025 · Updated last year
chameleon-lizard / Ragaliq
Multilingual RAG benchmark.
☆11 · Nov 22, 2024 · Updated last year
N8python / binary-vectors-mlx
MLX binary vectors and associated algorithms.
☆14 · Mar 13, 2025 · Updated last year
nbroad1881 / strideformer
Using short models to classify long texts
☆21 · Mar 8, 2023 · Updated 3 years ago
pappitti / modernbert-mlx
Implementation of ModernBERT in MLX
☆21 · Jan 7, 2026 · Updated 6 months ago
proycon / spacy2folia
Use spaCy for NLP and output to the FoLiA XML format.
☆12 · Feb 27, 2024 · Updated 2 years ago
Mihaiii / backtrack_sampler
An easy-to-understand framework for LLM samplers that rewind and revise generated tokens
☆151 · Jan 7, 2026 · Updated 6 months ago
minosvasilias / simple_grpo
Simple GRPO scripts and configurations.
☆59 · Feb 6, 2025 · Updated last year
frankkramer-lab / GERNERMED-pp
GERNERMED++ is a transfer-learning-based open neural NER model for medical entities designed for German data.
☆10 · Oct 20, 2023 · Updated 2 years ago
joey00072 / Attention-as-graph
An alternative way of calculating self-attention
☆18 · May 25, 2024 · Updated 2 years ago
sb895 / Hallmarks-of-Cancer
Expert annotated Hallmarks of Cancer Corpus
☆22 · Sep 18, 2018 · Updated 7 years ago
dnakov / llm-asi-arch
🤖 Complete reproduction of 'AlphaGo Moment for Model Architecture Discovery' using MLX-LM instead of GPT-4. Autonomous neural architectu…
☆29 · Jul 27, 2025 · Updated 11 months ago
frankkramer-lab / GPTNERMED
GPTNERMED is a language model-generated, synthetic dataset and an open neural NER model for medical entities designed for German data.
☆15 · Oct 5, 2023 · Updated 2 years ago
oceanumeric / EnteRAG
A RAG that can scale 🧑🏻‍💻
☆11 · May 28, 2024 · Updated 2 years ago
ariG23498 / smart-commit
Smart commit messages
☆18 · Oct 25, 2024 · Updated last year
mixedbread-ai / batched
The Batched API provides a flexible and efficient way to process multiple requests in a batch, with a primary focus on dynamic batching o…
☆161 · Jul 14, 2025 · Updated last year
davidberenstein1957 / spacy-setfit
This repository contains an easy and intuitive approach to use SetFit in combination with spaCy.
☆84 · Aug 31, 2023 · Updated 2 years ago
jfkback / hypencoder-paper
Official Repository for "Hypencoder: Hypernetworks for Information Retrieval"
☆40 · Sep 20, 2025 · Updated 9 months ago
talmago / spacy_crfsuite
sequence tagging with spaCy and crfsuite
☆21 · Mar 18, 2023 · Updated 3 years ago
ZeroSumEval / ZeroSumEval
A framework for pitting LLMs against each other in an evolving library of games ⚔
☆35 · Apr 17, 2025 · Updated last year
kpu / fasterText
Library for fast text representation and classification.
☆31 · Jan 9, 2024 · Updated 2 years ago
rachittshah / optimize-anything
Universal text artifact optimizer using LLM-powered iterative search
☆17 · Mar 3, 2026 · Updated 4 months ago
Pleias / Pleias-RAG-Library
Python library to use Pleias-RAG models
☆72 · Jul 1, 2026 · Updated 2 weeks ago
TheSethRose / Agent-Chat
An advanced AI-powered conversational agent leveraging the Llama 3.2 model and Phidata framework. Features include reasoning, natural lan…
☆15 · Oct 29, 2024 · Updated last year
wenlai-lavine / jola
Code for ICML 2025 paper | Joint Localization and Activation Editing for Low-Resource Fine-Tuning
☆28 · Jun 18, 2025 · Updated last year
embeddings-benchmark / leaderboard
Code for the MTEB leaderboard
☆31 · Feb 4, 2025 · Updated last year
sidleal / porsimplessent
PorSimplesSent - A Portuguese corpus of aligned sentence pairs to investigate sentence readability assessment
☆13 · Jan 15, 2020 · Updated 6 years ago
Techtonique / ahead_python
Univariate and multivariate time series forecasting, with uncertainty quantification (Python & R)
☆13 · Dec 20, 2025 · Updated 6 months ago