JHU-CLSP / mmBERT
A massively multilingual modern encoder language model
☆126Jan 20, 2026Updated 3 weeks ago
Alternatives and similar repositories for mmBERT
Users who are interested in mmBERT are comparing it to the repositories listed below.
- My NER Experiments with ModernBERT and Ettin☆26Jul 17, 2025Updated 6 months ago
- Model implementation for the contextual embeddings project☆40Jun 2, 2025Updated 8 months ago
- Legible, Scalable, Reproducible Foundation Models with Named Tensors and Jax☆16Jun 16, 2024Updated last year
- ☆16Jan 31, 2025Updated last year
- Fine-tune ModernBERT with custom tokenizers, curriculum learning, and next-gen optimizers.☆74Jan 16, 2026Updated 3 weeks ago
- User-friendly viewer for Parquet files☆10Jan 10, 2026Updated last month
- GISTEmbed: Guided In-sample Selection of Training Negatives for Text Embeddings☆44Mar 6, 2024Updated last year
- Difference-based Contrastive Learning for Korean Sentence Embeddings☆23Updated this week
- SPRINT Toolkit helps you evaluate diverse neural sparse models easily using a single click on any IR dataset.☆47Jul 25, 2023Updated 2 years ago
- 🎹 Instruct.KR 2025 Summer Meetup: Open-Source LLMs, All the Way to Production with vLLM 🎹☆24Aug 2, 2025Updated 6 months ago
- BERT score for text generation☆12Jan 15, 2025Updated last year
- DALLE-tools provides useful dataset utilities to improve your workflow with WebDatasets.☆15Mar 9, 2022Updated 3 years ago
- Evaluate state-of-the-art sparse embedding models on the LIMIT dataset (`limit-small` and `limit`) from Google's paper `On the Theoretica…☆15Sep 4, 2025Updated 5 months ago
- [SIGIR 2025] The official repo for "Scaling Sparse and Dense Retrieval in Decoder-Only LLMs"☆19Mar 31, 2025Updated 10 months ago
- English or Chinese GPT2Dialog model from GPT2-chitchat☆12Feb 23, 2020Updated 5 years ago
- Improving Text Embedding of Language Models Using Contrastive Fine-tuning☆64Aug 2, 2024Updated last year
- Tool for converting LLMs from uni-directional to bi-directional by removing causal mask for tasks like classification and sentence embedd…☆63Dec 12, 2024Updated last year
- This repository helps you evaluate your models on the FreshStack benchmark!☆31Dec 9, 2025Updated 2 months ago
- UQ: Assessing Language Models on Unsolved Questions☆30Aug 26, 2025Updated 5 months ago
- ☆24Dec 11, 2024Updated last year
- Test-time compute in information retrieval☆52Jul 8, 2025Updated 7 months ago
- Generalised Contrastive Learning. This is a Repository for Google Shopping Dataset and Benchmarks followed by our novel fine-grained cont…☆72Dec 30, 2025Updated last month
- [EMNLP 2024] Tree of Problems: Improving structured problem solving with compositionality☆19Mar 4, 2025Updated 11 months ago
- ☆46Apr 13, 2022Updated 3 years ago
- Hugging Face Inference Toolkit used to serve transformers, sentence-transformers, and diffusers models.☆90Jan 9, 2026Updated last month
- FastAPI Implementation of Orpheus TTS streaming Chatbot☆27Jun 19, 2025Updated 7 months ago
- ☆106Jun 2, 2025Updated 8 months ago
- Korean Translation Benchmark, LLM-as-a-judge☆23Oct 23, 2025Updated 3 months ago
- This library supports evaluating disparities in generated image quality, diversity, and consistency between geographic regions.☆20Jun 3, 2024Updated last year
- Performs benchmarking on two Korean datasets with minimal time and effort.☆44Jan 22, 2026Updated 3 weeks ago
- ☆20Jan 27, 2024Updated 2 years ago
- ☆19May 6, 2023Updated 2 years ago
- Evaluate gpt-4o on CLIcK (Korean NLP Dataset)☆20May 18, 2024Updated last year
- High-performance PyTorch modules☆18Jan 14, 2023Updated 3 years ago
- Applying "Load What You Need: Smaller Versions of Multilingual BERT" to LaBSE☆19Sep 22, 2021Updated 4 years ago
- Official repository for the paper "ReasonIR: Training Retrievers for Reasoning Tasks".☆218Jun 24, 2025Updated 7 months ago
- Hugging Face RoBERTa with Flash Attention 2☆24Sep 14, 2025Updated 5 months ago
- ☆18Apr 4, 2022Updated 3 years ago
- ☆57Jan 26, 2025Updated last year