allegro / klejbenchmark-baselines
Fine-tuning scripts for evaluating transformer-based models on the KLEJ benchmark.
☆25Updated last year
Alternatives and similar repositories for klejbenchmark-baselines:
Users interested in klejbenchmark-baselines compare it with the repositories listed below
- Polish RoBERTa model trained on Polish literature, Wikipedia, and Oscar. The major assumption is that quality text will give a good mode…☆34Updated 3 years ago
- Evaluation of Sentence Representations in Polish☆22Updated 2 years ago
- RoBERTa models for Polish☆86Updated 2 years ago
- Tool for named entity recognition for Polish based on deep learning.☆30Updated last year
- Embeddings: State-of-the-art Text Representations for Natural Language Processing tasks, an initial version of library focus on the Polis…☆36Updated last year
- A Python library aimed at dissecting and augmenting NER training data.☆57Updated last year
- Code and data accompanying the paper "Approaching nested named entity recognition with parallel LSTM-CRFs."☆26Updated 2 years ago
- This is the way: designing and compiling LEPISZCZE, a comprehensive NLP benchmark for Polish☆13Updated last year
- Label data using HuggingFace's transformers and automatically get a prediction service☆180Updated last year
- COMBO is a jointly trained tagger, lemmatizer, and dependency parser.☆36Updated last year
- Augmenty is an augmentation library based on spaCy for augmenting texts.☆151Updated 7 months ago
- Ongoing research training transformer language models at scale, including: BERT☆15Updated 5 years ago
- A spaCy custom component that extracts and normalizes temporal expressions☆52Updated last year
- A monolingual and cross-lingual meta-embedding generation and evaluation framework☆80Updated 2 years ago
- Source code and data for Like a Good Nearest Neighbor☆28Updated last week
- 🤗 Disaggregators: Curated data labelers for in-depth analysis.☆65Updated last year
- Performance evaluation of nearest neighbor search using Vespa, Elasticsearch and Open Distro for Elasticsearch K-NN☆116Updated 3 years ago
- GrammarTagger — A Neural Multilingual Grammar Profiler for Language Learning☆27Updated 3 years ago
- This repository contains the code for "Generating Datasets with Pretrained Language Models".☆187Updated 3 years ago
- Recon NER, Debug and correct annotated Named Entity Recognition (NER) data for inconsistencies and get insights on improving the quality …☆106Updated 10 months ago
- Code and data from the paper BERT Got a Date: Introducing Transformers to Temporal Tagging☆65Updated 2 years ago
- Code and models used in "MUSS Multilingual Unsupervised Sentence Simplification by Mining Paraphrases".☆98Updated last year
- Shared BERT model for 4 languages of Bulgarian, Czech, Polish and Russian. Slavic NER model.☆73Updated 2 years ago
- A library to synthesize text datasets using Large Language Models (LLM)☆151Updated 2 years ago
- Load What You Need: Smaller Multilingual Transformers for Pytorch and TensorFlow 2.0.☆102Updated 2 years ago
- RaKUn 2.0 - A fast keyword detection algorithm☆64Updated last month
- spaCy match and replace, maintaining conjugation☆35Updated 2 years ago
- MILES is a multilingual text simplifier inspired by LSBert - A BERT-based lexical simplification approach proposed in 2018. Unlike LSBert…☆48Updated 3 years ago