harmening / signature_extraction
💬 NLP - Library for splitting email content into a human-written body and an automatically appended signature.
☆25 Updated 6 years ago
Alternatives and similar repositories for signature_extraction:
Users who are interested in signature_extraction are comparing it to the libraries listed below.
- Crawl sites for RSS, Atom, and JSON feeds. ☆73 Updated 10 months ago
- Text analysis for automatic bookmarking/keyword extraction ☆18 Updated 8 years ago
- API - extract a list of keywords from a text. ☆18 Updated 7 years ago
- Word analysis, by domain, on the Common Crawl data set for the purpose of finding industry trends ☆56 Updated last year
- This is a proof-of-concept of using an LLM to find and extract meaningful data without parsing the HTML too much. ☆29 Updated last year
- NLP tool for scraping text from a corpus of PDF files, embedding the sentences in the text and finding semantically similar sentences to … ☆36 Updated 2 years ago
- Make it easier to compare and cross-reference the names of companies and people by applying strong normalisation. ☆149 Updated 2 months ago
- Email dataset for email signature parsing ☆55 Updated 8 years ago
- This script fetches search queries and excludes those that have a negative sentiment. ☆10 Updated 5 years ago
- Pre-built Scrapy spiders for AutoExtract ☆19 Updated 11 months ago
- Clustering news, extracting trending news stories ☆12 Updated 3 years ago
- ☆86 Updated last year
- Integrate Watson Studio and Watson Campaign Automation to tailor your target audience for effective campaigns ☆12 Updated 3 years ago
- This Python package can be used to systematically extract multiple data elements (e.g., title, keywords, text) from news sources around t… ☆33 Updated 2 years ago
- Using Natural Language Processing to standardize Company Names ☆12 Updated 3 years ago
- Source real estate prices from the Common Crawl. ☆27 Updated 6 years ago
- A focused web crawler that uses Machine Learning to fetch more relevant results. ☆13 Updated 6 years ago
- A Google Trends Analytics Package ☆13 Updated 9 months ago
- Matrix-based News Aggregation to Explore Media Bias ☆20 Updated 6 years ago
- Demo using Google Forms, Cloud Natural Language, Google Sheets, and Apps Script to analyze vacation rental reviews ☆40 Updated 5 years ago
- Web Crawlers orchestration framework that lets you create datasets from multiple web sources using YAML configurations. ☆34 Updated last year
- An analysis of abilities, skills and tech skills data from the O*NET database as well as classification of around 500 random LinkedIn job… ☆18 Updated 4 years ago
- A library to extract a publication date from a web page, along with a measure of the accuracy. ☆41 Updated 5 years ago
- Grounding LLMs in truth with under 30 lines of code. ☆20 Updated last year
- Matches a category of Google's Taxonomy to a product that is described in any kind of text data ☆61 Updated 6 years ago
- A Ruby gem to extract structured data from Google Local Search Results using the serpapi/bert-base-local-results model, enabling parsing,… ☆18 Updated last year
- Python client for the Google Page Speed Insights analysis API. ☆10 Updated 5 years ago
- SEMRush SERP Tutorial. Using advertools to Extract and Analyze Search Engine Results Pages Data ☆14 Updated 6 years ago
- Google Cloud Storage connector, pre-processor and model for predicting user search intent based on keywords ☆25 Updated 5 years ago
- [archived] ☆18 Updated 3 years ago