KeremZaman / semantic-sh
semantic-sh is a SimHash implementation that detects and groups similar texts by leveraging word vectors and transformer-based language models (BERT).
☆28 · Updated last year
Alternatives and similar repositories for semantic-sh
Users who are interested in semantic-sh are comparing it to the libraries listed below.
- Python library for feature selection for text features. It has filter method, genetic algorithm and TextFeatureSelectionEnsemble for impr… ☆53 · Updated 2 years ago
- State-of-the-art NLP through transformer models in a modular design and consistent APIs. ☆47 · Updated 2 years ago
- Streamlit demo app to demonstrate the features of transformers interpret with multiple models. ☆25 · Updated 4 years ago
- BERT, LDA, and TFIDF based keyword extraction in Python ☆76 · Updated last week
- Notebooks for docarray, Jina, Finetuner, and other products from Jina AI ☆12 · Updated 3 years ago
- Creating class-based TF-IDF matrices ☆91 · Updated 3 years ago
- An open-source NLP library: fast text cleaning and preprocessing ☆23 · Updated 4 years ago
- RaKUn 2.0 - A fast keyword detection algorithm ☆70 · Updated 5 months ago
- Model for learning document embeddings along with their uncertainties ☆36 · Updated 2 years ago
- Rank-based Unsupervised Keyword Extraction via Metavertex Aggregation ☆99 · Updated last year
- NeatText a simple NLP package for cleaning textual data and text preprocessing ☆75 · Updated 2 years ago
- ☆16 · Updated 5 years ago
- Accurate word segmentation for hashtags and text, powered by Transformers and Beam Search. A scalable alternative to heuristic splitters … ☆76 · Updated 3 weeks ago
- Custom Natural Language Processing with big and small models 🌲🌱 ☆66 · Updated 4 years ago
- In the wild extraction of entities that are found using Flair and displayed using a very elegant front-end. ☆71 · Updated 3 years ago
- semantically distinct key phrase extraction using hilbert hashes. ☆50 · Updated 3 years ago
- Vector AI: A platform for building vector based applications. Encode, query and analyse data using vectors. ☆318 · Updated last year
- This is a prototype of a multi-lingual suite for named-entity recognition in Python. ☆21 · Updated last year
- XAI based human-in-the-loop framework for automatic rule-learning. ☆49 · Updated last year
- KeyPhraseTransformer lets you quickly extract key phrases, topics, themes from your text data with T5 transformer | Keyphrase extraction… ☆105 · Updated last year
- A PyPI package for easy text annotation in a Jupyter Notebook. ☆29 · Updated 4 years ago
- A monolingual and cross-lingual meta-embedding generation and evaluation framework ☆79 · Updated 3 years ago
- Interactive tree-maps with SBERT & Hierarchical Clustering (HAC) ☆30 · Updated last year
- Developing a Knowledge Graph-based Question and Answering program to extract information from huge dataset ☆95 · Updated 2 years ago
- Tokenization across languages. Useful as preprocessing for subword tokenization. ☆21 · Updated 2 years ago
- Few-shot Named Entity Recognition ☆121 · Updated 3 years ago
- Package that returns a company embedding given a company name ☆49 · Updated 5 years ago
- ☆43 · Updated 2 years ago
- Python text processing, pattern matching, and NLP framework ☆67 · Updated 2 years ago
- Pyinfer is a model agnostic tool for ML developers and researchers to benchmark the inference statistics for machine learning models or f… ☆24 · Updated 4 years ago