henchc / web-scrapers
various web scrapers as examples
☆17Updated 4 years ago
Alternatives and similar repositories for web-scrapers
Users who are interested in web-scrapers are comparing it to the repositories listed below
- How to build an end to end search engine using elasticsearch and angularjs☆26Updated 6 years ago
- Word2Vec encodings based search engine for Stackoverflow questions☆26Updated 2 years ago
- Python package for converting xml and epubs to text files☆34Updated 4 years ago
- Spell correct entire sentences using nltk freqdist and symspell☆19Updated 7 years ago
- Collection of Jupyter notebooks for downloading Twitter data☆24Updated 8 years ago
- Text summarization using spacy☆22Updated 2 years ago
- OSoMe API mashups☆11Updated 6 years ago
- This project is created to promote and advocate the use of FOSS machine learning.☆46Updated last month
- Burglary prediction for mortals☆10Updated last year
- ☆31Updated 2 years ago
- Aho-Corasick string replacement utility☆24Updated 5 years ago
- Python, Tor, Stem, Privoxy: with these tools, request new connections via Tor to obtain new IP addresses.☆24Updated 6 years ago
- Simple duckduckgo results scraping☆68Updated 7 years ago
- Extract LinkedIn comments from any post and export them to an Excel file☆23Updated 6 years ago
- A curated list of awesome ML frameworks & libraries for text data☆16Updated 2 years ago
- python package for performing deduplication using flexible text matching and cleaning in pandas dataframe☆25Updated 4 years ago
- Text summarization algorithm for the Capstone Project at Springboard code bootcamp☆54Updated 2 years ago
- A component that tries to avoid downloading duplicate content☆27Updated 7 years ago
- A selection of business datasets☆18Updated 5 years ago
- Python wrapper for a C++ Double Metaphone☆15Updated 3 weeks ago
- NLP text recommendation system built in Python using Gensim, spaCy, and Plotly Dash☆15Updated 7 years ago
- Chorus, now for Elasticsearch!☆16Updated 11 months ago
- Scripts as a service. Builds on systemd (for Linux)☆20Updated last year
- an app that makes your personalized newsletter based on your bookmarks☆11Updated 7 years ago
- An eBook tool to extract the ISBN or metadata from eBooks and rename them using an ISBN database and the metadata☆30Updated 9 years ago
- A fully customisable language detection pipeline for spaCy☆92Updated 6 years ago
- Extract social media links and account names from websites.☆38Updated 4 years ago
- Virtual patent marking crawler at iproduct.epfl.ch☆14Updated 7 years ago
- Chatlytics is a data query and visualization platform for chat!☆13Updated 8 years ago
- Collaboration app for sharing and reviewing jupyter notebooks☆16Updated last week