harmening / signature_extraction
💬NLP - Library for splitting email content into a human-written body and an automatically appended signature.
☆26Updated 6 years ago
Alternatives and similar repositories for signature_extraction
Users that are interested in signature_extraction are comparing it to the libraries listed below
- Word analysis, by domain, on the Common Crawl data set for the purpose of finding industry trends☆57Updated last year
- Text analysis for automatic bookmarking/keyword extraction☆18Updated 8 years ago
- Matches a category of Google's Taxonomy to a product that is described in any kind of text data☆62Updated 6 years ago
- Now included in rigour☆151Updated 2 months ago
- A zero-shot relation extractor, easily downloadable from the HuggingFace repo.☆12Updated 3 years ago
- Source real estate prices from the Common Crawl.☆27Updated 6 years ago
- Entity resolution, also known as data matching or record linkage, is the task of finding records in a data set that refer to the same or similar real…☆24Updated 3 months ago
- LexPredict ContraxSuite document samples☆23Updated 7 years ago
- A library to extract a publication date from a web page, along with a measure of the accuracy.☆41Updated 5 years ago
- Curated list of awesome software and resources for Senzing, The First Real-Time AI for Entity Resolution.☆59Updated this week
- A financial disclosure data extraction tool.☆16Updated last year
- Latent Semantic Analysis Introduction: An information retrieval technique patented in 1988. In the context of its application to inform…☆16Updated 8 years ago
- This Python package can be used to systematically extract multiple data elements (e.g., title, keywords, text) from news sources around t…☆33Updated 2 years ago
- A GPT powered CLI tool that answers questions about your data☆98Updated 2 years ago
- A maximum-strength name parser for record linkage.☆37Updated last month
- Integrate Watson Studio and Watson Campaign Automation to tailor your target audience for effective campaigns☆12Updated 3 years ago
- Customer Due Diligence - Automated Google Web Scraping for Negative News☆12Updated 6 years ago
- Email dataset for email signature parsing☆55Updated 9 years ago
- Parsing resumes in PDF format from LinkedIn☆68Updated 8 years ago
- Crawl sites for RSS, Atom, and JSON feeds.☆76Updated last year
- This script uses an ensemble of multiple methods: RAKE, TF-IDF and Automatic Keyword Extraction to obtain top keywords in Reddit posts. P…☆12Updated 8 years ago
- A text processing tool including tag (HTML, URL, email) extraction and removal, punctuation normalization, simple segmentation, and so on…☆11Updated 7 months ago
- A package to build an end-to-end pipeline for detecting personally identifiable information from text.☆45Updated 6 years ago
- A Flask webapp that categorizes Outlook emails using machine learning☆15Updated 9 years ago
- ☆11Updated 3 years ago
- Aiohttp web server API, which scrapes Google and returns scrape results as response. Supports proxies, multiple geos and number of result…☆57Updated last year
- Building a Job Dataset☆22Updated 3 years ago
- API for OpenSanctions with support for entity search and bulk matching of data collections. Supports Reconciliation API spec.☆86Updated last week
- NLP tool for scraping text from a corpus of PDF files, embedding the sentences in the text and finding semantically similar sentences to …☆35Updated 3 years ago
- API client for Aleph, supports bulk entity and document upload.☆28Updated 9 months ago