allegro / klejbenchmark-baselines
Fine-tuning scripts for evaluating transformer-based models on KLEJ benchmark.
☆26 Updated 2 years ago
Alternatives and similar repositories for klejbenchmark-baselines
Users who are interested in klejbenchmark-baselines are comparing it to the libraries listed below.
- We introduce MKQA, an open-domain question answering evaluation set comprising 10k question-answer pairs aligned across 26 typologically … ☆187 Updated 3 years ago
- RoBERTa models for Polish ☆88 Updated 3 years ago
- This repository contains the code for the paper 'PARM: Paragraph Aggregation Retrieval Model for Dense Document-to-Document Retrieval' pu… ☆41 Updated 3 years ago
- XtremeDistil framework for distilling/compressing massive multilingual neural network models to tiny and efficient models for AI at scale ☆157 Updated last year
- A library to synthesize text datasets using Large Language Models (LLM) ☆152 Updated 2 years ago
- The pipeline for the OSCAR corpus ☆175 Updated 3 weeks ago
- ☆11 Updated 4 years ago
- MFAQ: a Multilingual FAQ Dataset ☆18 Updated 2 years ago
- LASER multilingual sentence embeddings as a pip package ☆225 Updated 2 years ago
- Simply, faster, sentence-transformers ☆143 Updated last year
- A multilingual version of MS MARCO passage ranking dataset ☆144 Updated 2 years ago
- Load What You Need: Smaller Multilingual Transformers for Pytorch and TensorFlow 2.0. ☆105 Updated 3 years ago
- Forte is a flexible and powerful ML workflow builder. This is part of the CASL project: http://casl-project.ai/ ☆251 Updated last year
- [EMNLP-Findings 2020] Adapting BERT for Word Sense Disambiguation with Gloss Selection Objective and Example Sentences ☆63 Updated last year
- Open source library for few shot NLP ☆78 Updated 2 years ago
- Augmenty is an augmentation library based on spaCy for augmenting texts. ☆156 Updated last year
- Polish BERT ☆72 Updated 5 years ago
- ☆87 Updated 7 months ago
- Code for the CRAC 2021 paper "On Generalization in Coreference Resolution" (Best short paper award) ☆36 Updated 2 years ago
- This repository contains datasets and code for the paper "HINT3: Raising the bar for Intent Detection in the Wild" accepted at EMNLP-2020… ☆32 Updated 4 years ago
- This repository contains the code for "Generating Datasets with Pretrained Language Models". ☆189 Updated 4 years ago
- A High-level Library for Named Entity Recognition in Python. ☆25 Updated last year
- Summarization, translation, sentiment-analysis, text-generation and more at blazing speed using a T5 version implemented in ONNX. ☆255 Updated 3 years ago
- A collection of task-specific NLU datasets ☆159 Updated 3 years ago
- SPRINT Toolkit helps you evaluate diverse neural sparse models easily using a single click on any IR dataset. ☆47 Updated 2 years ago
- Examples for aligning, padding and batching sequence labeling data (NER) for use with pre-trained transformer models ☆64 Updated 2 years ago
- Evaluation of Sentence Representations in Polish ☆23 Updated 2 years ago
- Our open source implementation of MiniLMv2 (https://aclanthology.org/2021.findings-acl.188) ☆61 Updated 2 years ago
- Question-answers, collected from Google ☆129 Updated 4 years ago
- A Python library aimed at dissecting and augmenting NER training data. ☆59 Updated 2 years ago