akhvorov / S3M
S3M: Siamese Stack (Trace) Similarity Measure
☆11, updated 3 years ago
Related projects
Alternatives and complementary repositories for S3M
- This project focuses on DeepER, a deep learning framework for entity resolution (record deduplication). It examines how DeepER performs o… ☆45, updated 6 years ago
- PyTorch library for transforming entities like companies, products, etc. into vectors to support scalable Record Linkage / Entity Resolut… ☆147, updated 2 years ago
- Annotated corpus + evaluation metrics for text anonymisation ☆51, updated 9 months ago
- Fuzzy matching and more functionality for spaCy. ☆252, updated 4 months ago
- Toolkit to help understand "what lies" in word embeddings. Also benchmarking! ☆469, updated last year
- Weakly Supervised End-to-End Learning (NeurIPS 2021) ☆153, updated last year
- 🐦 Quickly annotate data from the comfort of your Jupyter notebook ☆275, updated last year
- Sentence transformers models for SpaCy ☆105, updated last year
- Super Fast String Matching in Python ☆364, updated 6 months ago
- Coreference resolution for English, French, German and Polish, optimised for limited training data and easily extensible for further lang… ☆192, updated last year
- SpikeX - SpaCy Pipes for Knowledge Extraction ☆398, updated 3 years ago
- Repository for performing Blocking using Deep Learning based on the paper "Deep Learning for Blocking in Entity Matching: A Design Space … ☆30, updated last year
- A PyTorch-based open-source framework that provides methods for improving the weakly annotated data and allows researchers to efficiently… ☆106, updated 2 months ago
- Self-Supervision for Named Entity Disambiguation at the Tail ☆214, updated 2 years ago
- This repository contains an easy and intuitive approach to few-shot NER using most similar expansion over spaCy embeddings. Now with enti… ☆242, updated last year
- This repository contains an easy and intuitive approach to few-shot classification using sentence-transformers or spaCy models, or zero-s… ☆209, updated 5 months ago
- Training a character embedding from scratch, following Word2Vec, using TensorFlow. ☆14, updated last year
- Röttger et al. (ACL 2021): "HateCheck: Functional Tests for Hate Speech Detection Models" - Data ☆56, updated 2 years ago
- [DEPRECATED] Adapt Transformer-based language models to new text domains ☆86, updated 9 months ago
- Toolkit for Auditing and Mitigating Bias and Fairness of Machine Learning Systems 🔎🤖🧰 ☆95, updated last year
- Augmenty is an augmentation library based on spaCy for augmenting texts. ☆151, updated 5 months ago
- Open source no-code system for text annotation and building of text classifiers ☆251, updated 2 months ago
- Spacy NER annotator using ipywidgets ☆121, updated 7 months ago
- PassivePy: A Tool to Automatically Identify Passive Voice in Big Text Data ☆17, updated 8 months ago
- skweak: A software toolkit for weak supervision applied to NLP tasks ☆920, updated 2 months ago
- A monolingual and cross-lingual meta-embedding generation and evaluation framework ☆80, updated 2 years ago