CLARIN-PL / LEPISZCZE
This is the way: designing and compiling LEPISZCZE, a comprehensive NLP benchmark for Polish
☆13Updated last year
Alternatives and similar repositories for LEPISZCZE:
Users who are interested in LEPISZCZE are comparing it to the libraries listed below:
- Embeddings: State-of-the-art Text Representations for Natural Language Processing tasks, an initial version of the library focusing on the Polis…☆36Updated last year
- A Python library aimed at dissecting and augmenting NER training data.☆58Updated last year
- A collection of datasets for language model pretraining including scripts for downloading, preprocessing, and sampling.☆55Updated 6 months ago
- A Python package for benchmarking interpretability techniques on Transformers.☆213Updated 4 months ago
- Bi-encoder entity linking architecture☆44Updated 5 months ago
- Dataset collection and preprocessing framework for NLP extreme multitask learning☆175Updated last month
- Language Identification with Support for More Than 2000 Labels -- EMNLP 2023☆116Updated 2 months ago
- Generalist and Lightweight Model for Text Classification☆79Updated this week
- 🤗 Disaggregators: Curated data labelers for in-depth analysis.☆65Updated 2 years ago
- LTG-Bert☆29Updated last year
- A spaCy custom component that extracts and normalizes temporal expressions☆54Updated 2 years ago
- 💫 spaCy wrapper for ConceptNet 💫☆89Updated last year
- RaKUn 2.0 - A fast keyword detection algorithm☆65Updated this week
- Fine-tuning scripts for evaluating transformer-based models on KLEJ benchmark.☆25Updated last year
- Pre-train Static Word Embeddings☆47Updated 3 weeks ago
- This repository provides scripts for evaluating NLP models on the LEXTREME benchmark, a set of diverse multilingual tasks in legal NLP☆21Updated last year
- Code for SaGe subword tokenizer (EACL 2023)☆22Updated 2 months ago
- Repo for training MLMs, CLMs, or T5-type models on the OLM pretraining data, but it should work with any Hugging Face text dataset.☆93Updated 2 years ago
- Data and evaluation code for the paper WikiNEuRal: Combined Neural and Knowledge-based Silver Data Creation for Multilingual NER (EMNLP 2…☆67Updated 2 years ago
- RoBERTa models for Polish☆86Updated 2 years ago
- Evaluation of Sentence Representations in Polish☆22Updated 2 years ago
- Official implementation of "GPT or BERT: why not both?"☆47Updated 3 weeks ago
- This repository contains an easy and intuitive approach to use SetFit in combination with spaCy.☆76Updated last year
- A multi-lingual approach to AllenNLP CoReference Resolution along with a wrapper for spaCy.☆105Updated 10 months ago