preligens-lab / textnoisr
Adding random noise to a text dataset while controlling the quality of the result very accurately
☆20 Updated 2 months ago
Alternatives and similar repositories for textnoisr
Users that are interested in textnoisr are comparing it to the libraries listed below
- FastFit ⚡ When LLMs are Unfit Use FastFit ⚡ Fast and Effective Text Classification with Many Classes ☆212 Updated last month
- Efficiently find the best-suited language model (LM) for your NLP task ☆127 Updated 3 months ago
- Augmenty is an augmentation library based on spaCy for augmenting texts. ☆156 Updated last year
- GISTEmbed: Guided In-sample Selection of Training Negatives for Text Embeddings ☆44 Updated last year
- Full named-entity (i.e., not tag/token) evaluation metrics based on SemEval’13 ☆197 Updated 2 months ago
- Robust and fast topic models with sentence-transformers. ☆81 Updated last week
- Notebooks for training universal 0-shot classifiers on many different tasks ☆136 Updated 10 months ago
- 💬 Language Identification with Support for More Than 2000 Labels -- EMNLP 2023 ☆165 Updated 5 months ago
- A multi-lingual approach to AllenNLP CoReference Resolution along with a wrapper for spaCy. ☆107 Updated last year
- Data and evaluation code for the paper WikiNEuRal: Combined Neural and Knowledge-based Silver Data Creation for Multilingual NER (EMNLP 2… ☆69 Updated 2 years ago
- A Python Search Engine for Humans 🥸 ☆238 Updated last year
- ☆115 Updated 10 months ago
- Repository for the paper "MultiNERD: A Multilingual, Multi-Genre and Fine-Grained Dataset for Named Entity Recognition (and Disambiguatio… ☆45 Updated last year
- 🗺️ Data Cleaning and Textual Data Visualization 🗺️ ☆190 Updated 5 months ago
- Generalist and Lightweight Model for Text Classification ☆164 Updated 4 months ago
- spaCy-wrap is a wrapper library for spaCy for including fine-tuned transformers from Huggingface in your spaCy pipeline allowing you to i… ☆46 Updated last year
- Simply, faster, sentence-transformers ☆143 Updated last year
- Zero and Few shot named entity & relationships recognition ☆391 Updated last month
- ☆87 Updated 5 months ago
- ☆169 Updated last year
- A High-level Library for Named Entity Recognition in Python. ☆24 Updated last year
- Bi-encoder entity linking architecture ☆51 Updated last year
- [EMNLP 2023 Demo] fabricator - annotating and generating datasets with large language models. ☆110 Updated last year
- KIND: an Italian Multi-Domain Dataset for Named Entity Recognition ☆15 Updated 2 years ago
- Repo for Aspire - A scientific document similarity model based on matching fine-grained aspects of scientific papers. ☆54 Updated 2 years ago
- The CleanCoNLL dataset from our EMNLP 2023 paper where we corrected annotation errors and inconsistencies in CoNLL-03. ☆24 Updated last year
- Powerful unsupervised domain adaptation method for dense retrieval. Requires only unlabeled corpus and yields massive improvement: "GPL: … ☆338 Updated 2 years ago
- A BERT-based application for reusable text classification at scale ☆38 Updated 2 years ago
- The robust European language model benchmark. ☆134 Updated this week
- ReFinED is an efficient and accurate entity linking (EL) system. ☆223 Updated 10 months ago