mkahn5 / translate-book
☆16 Updated 3 months ago
Related projects:
- Stylometric framework in Python ☆13 Updated 9 years ago
- A Flask webapp that categorizes Outlook emails using machine learning ☆15 Updated 9 years ago
- Automatically exported from code.google.com/p/guess-language ☆53 Updated 7 months ago
- Python package for converting XML and EPUBs to text files ☆34 Updated 4 years ago
- Web scraper for Indeed job searches that reveals the skill keywords required of data scientists ☆36 Updated 7 years ago
- A GoodReads.com scraper script to get book reviews, including text and rating. ☆39 Updated 2 years ago
- Process Caltech Archives' digital documents and photos, and annotate each page or image with information about its contents ☆12 Updated 2 years ago
- Introduction to Topic Modeling for TextXD 2019, 12/3/2019 ☆10 Updated 4 years ago
- Find RSS, Atom, XML, and RDF feeds on webpages ☆30 Updated last year
- A web application for exploring documents topically. ☆26 Updated 8 years ago
- Presentations on Quantified Self and Self-Tracking with Python ☆29 Updated last year
- A library to extract a publication date from a web page, along with a measure of the accuracy. ☆42 Updated 5 years ago
- ☆10 Updated 8 years ago
- This Python package can be used to systematically extract multiple data elements (e.g., title, keywords, text) from news sources around t… ☆31 Updated last year
- A financial disclosure data extraction tool. ☆13 Updated last year
- Virtual patent marking crawler at iproduct.epfl.ch ☆14 Updated 7 years ago
- ☆12 Updated 5 years ago
- A platform for collecting, analyzing, and visualizing social media data. ☆12 Updated 3 years ago
- Some convenient natural language tools that build on NLTK. ☆85 Updated 10 years ago
- Orange Data Mining Homepage ☆16 Updated 4 years ago
- The project lets you extract glossary words and their definitions from a given piece of text automatically using NLP techniques ☆29 Updated 4 years ago
- Dump of generated texts from GPT-2 trained on /r/legaladvice subreddit titles ☆23 Updated 5 years ago
- A PDFMiner wrapper to ease text extraction from PDF files. ☆25 Updated 11 years ago
- Attempts to determine the natural language of a selection of Unicode (UTF-8) text (a clone of http://code.google.com/p/guess-language wit… ☆47 Updated 14 years ago
- Fastlaw's purpose is to replace generic word embeddings for work on supervised machine learning NLP tasks with legal texts. ☆36 Updated 5 years ago
- Markdown -> IPython conversion tool ☆15 Updated 9 years ago
- Discussion Summarization is the process of condensing a text document which is a collection of discussion threads, using CBS (Cluster Bas… ☆12 Updated 10 years ago
- Python package for performing deduplication using flexible text matching and cleaning in pandas DataFrames ☆25 Updated 3 years ago
- A maximum-strength name parser for record linkage. ☆29 Updated last month
- Crawling and analyzing data on Wikipedia ☆16 Updated 6 months ago
- A Python package to get useful information from documents using the TopicRank algorithm. ☆16 Updated last year