LouisTsiattalou / tfidf_matcher
TF-IDF / KNN-based string matching
☆51Updated last year
Alternatives and similar repositories for tfidf_matcher:
Users who are interested in tfidf_matcher are comparing it to the libraries listed below.
- Super Fast String Matching in Python☆363Updated this week
- Fuzzy matching and more functionality for spaCy.☆255Updated 6 months ago
- Deepparse is a state-of-the-art library for parsing multinational street addresses using deep learning☆308Updated 3 months ago
- A Python library for calculating a large variety of metrics from text☆324Updated last month
- All the go-to functions you need to handle NLP use-cases, integrated in NLPretext☆139Updated 10 months ago
- spaCy-wrap is a wrapper library for spaCy for including fine-tuned transformers from Huggingface in your spaCy pipeline allowing you to i…☆46Updated 9 months ago
- A Corpus of 475,000 Industrial Occupations☆63Updated 4 years ago
- Nesta's Skills Extractor Library☆126Updated 2 months ago
- Augmenty is an augmentation library based on spaCy for augmenting texts.☆151Updated 8 months ago
- Sensible multi-core apply function for Pandas☆79Updated 3 weeks ago
- Dataframe Integration with spaCy.☆103Updated 3 years ago
- PYthon Automated Term Extraction☆310Updated last year
- A Flexible Deep Learning Approach to Fuzzy String Matching☆140Updated 3 months ago
- This repository contains an easy and intuitive approach to few-shot NER using most similar expansion over spaCy embeddings. Now with enti…☆244Updated last year
- 🧪 Cutting-edge experimental spaCy components and features☆96Updated 9 months ago
- Entity Matching Model solves the problem of matching company names between two possibly very large datasets.☆64Updated last month
- Framework for fine-tuning pretrained transformers for Named-Entity Recognition (NER) tasks☆157Updated last year
- Name matching is a Python package for the matching of company names. This package has been developed to match the names of companies from…☆141Updated this week
- Google USE (Universal Sentence Encoder) for spaCy☆182Updated last year
- Creating class-based TF-IDF matrices☆82Updated 2 years ago
- Simplifies use of the Dedupe library via Pandas☆135Updated last year
- Asent is a Python library for performing efficient and transparent sentiment analysis using spaCy.☆117Updated 9 months ago
- Lbl2Vec learns jointly embedded label, document and word vectors to retrieve documents with predefined topics from an unlabeled document …☆179Updated 11 months ago
- A Python library aimed at dissecting and augmenting NER training data.☆57Updated last year
- Coreference resolution for English, French, German and Polish, optimised for limited training data and easily extensible for further lang…☆120Updated 9 months ago
- Package that returns a company embedding given a company name☆42Updated 4 years ago
- Set of vectorizers that extract keyphrases with part-of-speech patterns from a collection of text documents and convert them into a docum…☆257Updated 2 months ago
- SKILLSPAN: Competences as Spans for Skill Extraction from Job Postings☆56Updated 11 months ago