Flexible evaluation tool for language models
☆58 · Updated this week
Alternatives and similar repositories for flexeval
Users who are interested in flexeval are comparing it to the libraries listed below.
- DIRECT: Direct and Indirect REsponses in Conversational Text Corpus ☆17 · Jul 1, 2021 · Updated 4 years ago
- The robust text processing pipeline framework enabling customizable, efficient, and metric-logged text preprocessing. ☆125 · Nov 13, 2025 · Updated 3 months ago
- DefSent: Sentence Embeddings using Definition Sentences ☆22 · Aug 5, 2021 · Updated 4 years ago
- ☆29 · Apr 10, 2025 · Updated 10 months ago
- ☆19 · May 23, 2024 · Updated last year
- Discovering Universal Geometry in Embeddings with ICA (Published in EMNLP 2023) ☆20 · Jun 17, 2025 · Updated 8 months ago
- The evaluation scripts of JMTEB (Japanese Massive Text Embedding Benchmark) ☆84 · Jan 6, 2026 · Updated last month
- Japanese instruction data ☆24 · Jul 13, 2023 · Updated 2 years ago
- Python tool for automatic evaluation of text generation ☆38 · Updated this week
- Preferred Generation Benchmark ☆92 · Oct 28, 2025 · Updated 4 months ago
- Training and evaluation scripts for JGLUE, a Japanese language understanding benchmark ☆18 · Feb 2, 2026 · Updated 3 weeks ago
- ☯️ AllenNLP training configurations for promising models on Named Entity Recognition. (BiLSTM-CRF, BiLSTM-CNN-CRF, BERT, BERT-CRF) ☆15 · Nov 26, 2020 · Updated 5 years ago
- ☆16 · Nov 19, 2023 · Updated 2 years ago
- Repository for JSICK ☆45 · May 31, 2023 · Updated 2 years ago
- Use custom tokenizers in spacy-transformers ☆16 · Aug 9, 2022 · Updated 3 years ago
- Swallow project: evaluation framework for post-trained large language models ☆25 · Oct 20, 2025 · Updated 4 months ago
- Code for "Word Tour: One-dimensional Word Embeddings via the Traveling Salesman Problem" (NAACL 2022) ☆110 · May 14, 2025 · Updated 9 months ago
- PyTorch implementation and pre-trained Japanese model for CANINE, the efficient character-level transformer. ☆89 · Nov 3, 2023 · Updated 2 years ago
- ☆35 · Dec 17, 2020 · Updated 5 years ago
- Official implementation of "TAID: Temporally Adaptive Interpolated Distillation for Efficient Knowledge Transfer in Language Models" ☆120 · Oct 6, 2025 · Updated 4 months ago
- JMED-LLM: Japanese Medical Evaluation Dataset for Large Language Models ☆56 · Sep 22, 2024 · Updated last year
- 🚀 A demonstration of hyperparameter optimization using Optuna for models implemented with AllenNLP. ☆16 · Nov 28, 2020 · Updated 5 years ago
- 📝 A list of pre-trained BERT models for Japanese with word/subword tokenization + vocabulary construction algorithm information ☆131 · Mar 15, 2023 · Updated 2 years ago
- Whisper of the arXiv: read comments in the TeX source of papers ☆33 · May 16, 2018 · Updated 7 years ago
- ⚡️ AllenNLP plugin for adding subcommands to use Optuna, making hyperparameter optimization easy ☆32 · Nov 23, 2021 · Updated 4 years ago
- [2023 edition] Text classification with BERT ☆235 · May 28, 2024 · Updated last year
- Python library to use and implement packages in OptunaHub ☆54 · Updated this week
- ボケて電笑戦 (bokete DENSHOSEN) Workshop ☆43 · May 16, 2022 · Updated 3 years ago
- Japanese-BPEEncoder ☆41 · Sep 12, 2021 · Updated 4 years ago
- JGLUE: Japanese General Language Understanding Evaluation ☆335 · Mar 31, 2025 · Updated 11 months ago
- 🐎 Colt: Effortlessly configure and construct Python objects with colt, a lightweight library inspired by AllenNLP and Tango ☆26 · Dec 6, 2025 · Updated 2 months ago
- An annotation tool for grounding of formulae ☆24 · May 28, 2024 · Updated last year
- LaTeX document class for the proceedings of ANLP ☆21 · Oct 28, 2025 · Updated 4 months ago
- ☆10 · Sep 14, 2022 · Updated 3 years ago
- A language server implementation for pysen ☆10 · Nov 14, 2021 · Updated 4 years ago
- The official code repository for "Rethinking the role of frames for SE(3)-invariant crystal structure modeling" (ICLR 2025) ☆13 · Oct 16, 2025 · Updated 4 months ago
- An implementation of "Subspace Representations for Soft Set Operations and Sentence Similarities" (NAACL 2024) ☆10 · May 31, 2024 · Updated last year
- msglm makes it a little easier to create messages for language models like Claude and OpenAI GPTs. ☆14 · Jan 29, 2026 · Updated last month
- AJIMEE-Bench (Advanced Japanese IME Evaluation Benchmark) ☆18 · Jan 13, 2025 · Updated last year