Toloka / crowd-kit
Control the quality of your labeled data with the Python tools you already know.
☆214Updated 3 weeks ago
Related projects
Alternatives and complementary repositories for crowd-kit
- Toloka-Kit is a Python library for working with the Toloka API.☆202Updated 4 months ago
- BSNLP 2021☆32Updated 2 weeks ago
- RuSimpleSentEval (RSSE) shared task repo☆21Updated 3 years ago
- Augmentex — a library for augmenting texts with errors☆52Updated 4 months ago
- Question answering in Russian with XLMRobertaLarge as a service☆21Updated 3 years ago
- Tools for shrinking fastText models (in gensim format)☆173Updated 6 months ago
- Interface for easier topic modelling.☆139Updated 3 months ago
- Distillation of BERT model with catalyst framework☆75Updated last year
- A library built upon PyTorch for building embeddings on discrete event sequences using self-supervision☆91Updated 2 years ago
- A library built upon PyTorch for building embeddings on discrete event sequences using self-supervision☆223Updated this week
- RUSSE 2022: Russian Text Detoxification Based on Parallel Corpora☆20Updated 2 years ago
- A small library with distillation, quantization and pruning pipelines☆26Updated 3 years ago
- SAGE: Spelling correction, corruption and evaluation for multiple languages☆132Updated 2 months ago
- NLP workshop at DataFest Siberia 2019☆22Updated last year
- Probing suite for evaluation of Russian embedding and language models☆32Updated last month
- NEREL: A Russian Dataset with Nested Named Entities, Relations and Events☆25Updated last year
- Train punctuation and capitalization models for different languages☆24Updated 2 years ago
- "Rossiya Segodnya" news dataset☆45Updated 5 years ago
- A library for extracting statistics from Russian-language texts.☆103Updated last year
- PyTorch library for end-to-end transformer models training, inference and serving☆70Updated 2 years ago
- Infrastructure for starting a TG bot project. Postgres, Minio, Grafana, Alembic☆21Updated 2 years ago
- A deep learning course in natural language processing for computational linguistics master's students at the Higher School of Economics☆47Updated 2 years ago
- Pipeline for fast building text classification TF-IDF + LogReg baselines.☆63Updated 3 years ago
- ☆12Updated 2 years ago
- A repository for Toloka tools.☆13Updated 5 months ago
- Accelerated NLP pipelines for fast inference on CPU and GPU. Built with Transformers, Optimum and ONNX Runtime.☆126Updated 2 years ago
- Russian Corpus of Linguistic Acceptability☆41Updated last month
- A Russian data set for question answering over Wikidata☆46Updated 3 years ago
- Deep Learning for Speech☆80Updated last week