jlettvin / SimilarLinks
A Python canonicalizer to disambiguate and recognize known names from a poor quality data entry list.
☆20Updated 9 years ago
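As a rough illustration of what a name canonicalizer of this kind might do, here is a minimal sketch using Python's standard `difflib`. It is not taken from the repository: the `normalize` and `canonicalize` helpers, the `cutoff` value, and the sample names are all assumptions made for this example.

```python
import difflib


def normalize(name: str) -> str:
    """Lowercase and collapse whitespace so trivial differences are ignored."""
    return " ".join(name.lower().split())


def canonicalize(entry, known_names, cutoff=0.8):
    """Return the known name closest to `entry`, or None if none is close enough.

    Hypothetical helper: the similarity measure is difflib's ratio, which may
    differ from whatever the actual project uses.
    """
    # Map normalized forms back to the canonical spelling.
    lookup = {normalize(k): k for k in known_names}
    matches = difflib.get_close_matches(
        normalize(entry), list(lookup), n=1, cutoff=cutoff
    )
    return lookup[matches[0]] if matches else None


if __name__ == "__main__":
    known = ["John Smith", "Jane Doe"]  # assumed list of known names
    for raw in ["jonh  smith", "JANE DOE", "zzzz"]:
        print(repr(raw), "->", canonicalize(raw, known))
```

Raising `cutoff` makes matching stricter, which reduces false matches at the cost of leaving more misspelled entries unresolved.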
Alternatives and similar repositories for Similar
Users who are interested in Similar are comparing it to the libraries listed below
- Extract postal addresses from the DOM☆66Updated 13 years ago
- Auto-transcribe your meetings to Slack in real time☆156Updated 5 years ago
- Language Lego☆141Updated 5 years ago
- The missing datasets manager. Like Homebrew but for datasets. CLI tool to search and discover datasets!☆41Updated 8 years ago
- 🍻Uses Google, Yelp, and Foursquare APIs to retrieve and rank bars☆87Updated 8 years ago
- Rewriting web proxy and archival tool. At this point, it just tries to download all the things.☆202Updated this week
- A proof of concept using IBM's Speech-to-Text API to do quick-and-dirty transcriptions☆312Updated 9 years ago
- Compares two XML documents by diffing their text.☆42Updated last year
- A fast python scikit-learn text sentiment API server.☆89Updated 9 years ago
- Reworked https://www.readability.com/ parsing library (https://mercury.postlight.com/ is now the living alternative)☆205Updated last year
- A smart bot for Slack that helps users stay on topic.☆54Updated 8 years ago
- Repository for PyCon 2016 workshop Natural Language Processing in 10 Lines of Code☆240Updated 8 years ago
- E-commerce scraping and analytics platform.☆53Updated 9 years ago
- Remove signature blocks from emails☆86Updated 6 years ago
- Notes, boards and lists, templates and forms, tags and other tools for data driven note taking☆141Updated 2 years ago
- Convert text documents to high fidelity audio(books).☆205Updated 5 years ago
- Train your own Natural Language Processor from a browser 🤖 (Prototype)☆174Updated 2 years ago
- Search for the times (in seconds) at which words, phrases, or arbitrary regex patterns occur within audio files☆102Updated 4 years ago
- Helps you extract CSV data tables from PDF files using the mighty tabula-java. See https://github.com/tabulapdf/tabula-java☆80Updated 6 years ago
- A microservice for archiving the news.☆164Updated 9 years ago
- A toolbox and web application for working with and presenting textual material from Shakespeare to Schopenhauer, and letters to literatur…☆149Updated 10 years ago
- Automatically extracts and normalizes an online article or blog post publication date☆117Updated 2 years ago
- A very naive classifier to figure out if a sentence contains dirty words☆33Updated 10 years ago
- Mechanical Turk on your own machine.☆207Updated 11 months ago
- An interactive map of Stack Exchange tags for all sites.☆126Updated 2 years ago
- A deliciously fast and simple Filesystem-Web-CMS☆35Updated 9 years ago
- A Python library to detect and extract listing data from an HTML page.☆108Updated 8 years ago
- Automatic text summarization☆243Updated 6 years ago
- A library for extracting tables from PDF files☆92Updated 5 years ago
- Python binding to libpoppler with focus on text extraction☆97Updated 3 years ago