Ermlab / polish-gec-datasets
Polish datasets for grammatical error correction
☆12Updated last year
Alternatives and similar repositories for polish-gec-datasets
Users who are interested in polish-gec-datasets are comparing it to the libraries listed below.
- Pre-trained models and language resources for Natural Language Processing in Polish☆342Updated last year
- Embeddings: State-of-the-art Text Representations for Natural Language Processing tasks, an initial version of library focus on the Polis…☆36Updated last year
- A python package for benchmarking interpretability techniques on Transformers.☆213Updated 8 months ago
- This is the way: designing and compiling LEPISZCZE, a comprehensive NLP benchmark for Polish☆13Updated last year
- RoBERTa models for Polish☆87Updated 3 years ago
- Evaluation of Sentence Representations in Polish☆22Updated 2 years ago
- Minimal example scripts of the Hugging Face Trainer, focused on staying under 150 lines☆197Updated last year
- ☆50Updated 2 years ago
- just a bunch of useful embeddings for scikit-learn pipelines☆500Updated 3 months ago
- Polish Dataset of Banned Harmful and Offensive Content from Wykop.pl web service☆53Updated 4 months ago
- SpanMarker for Named Entity Recognition☆434Updated 5 months ago
- Instruct-tune LLaMA on consumer hardware☆21Updated 2 years ago
- A versatile and powerful library designed to streamline the process of querying LLMs☆85Updated last week
- Fully fine-tune large models like Mistral, Llama-2-13B, or Qwen-14B completely for free☆231Updated 7 months ago
- deep learning with pytorch lightning☆1Updated 7 months ago
- ☆78Updated last year
- HerBERT is a BERT-based Language Model trained on Polish Corpora using only MLM objective with dynamic masking of whole words.☆67Updated 3 years ago
- ☆78Updated last year
- Efficiently find the best-suited language model (LM) for your NLP task☆124Updated 3 weeks ago
- List of resources, libraries and more for developers who would like to build with open-source machine learning off-the-shelf☆199Updated last year
- Highly commented implementations of Transformers in PyTorch☆136Updated last year
- Fine-tuning scripts for evaluating transformer-based models on KLEJ benchmark.☆26Updated last year
- A library to synthesize text datasets using Large Language Models (LLM)☆152Updated 2 years ago
- The robust European language model benchmark.☆106Updated this week
- A Scandinavian Benchmark for sentence embeddings☆39Updated last month
- FastFit ⚡ When LLMs are Unfit Use FastFit ⚡ Fast and Effective Text Classification with Many Classes☆207Updated last month
- Generalist and Lightweight Model for Text Classification☆134Updated 2 weeks ago
- The most extensive open massively multilingual corpus of datasets for training sentiment models. The corpus consists of 79 manually selec…☆16Updated last year
- QA Bot for Hugging Face documentation to accelerate development within the ecosystem.☆43Updated last year
- This repository contains the code for dataset curation and finetuning of instruct variant of the Bilingual OpenHathi model. The resultin…☆23Updated last year