harshavardhana / boilerpipy
Readability/Boilerpipe extraction in Python
☆55 Updated 8 years ago
Alternatives and similar repositories for boilerpipy:
Users who are interested in boilerpipy are comparing it to the libraries listed below.
- Modularly extensible semantic metadata validator ☆83 Updated 9 years ago
- A very naive classifier to figure out if a sentence contains dirty words ☆33 Updated 9 years ago
- Suma, microservice to manage external links ☆46 Updated 7 years ago
- An extended version of the official Elasticsearch Python client. ☆63 Updated 9 years ago
- ☆53 Updated 9 years ago
- A validated JSON manager and REST API generator for Python, Flask, and RethinkDB ☆72 Updated 7 years ago
- A polite, minimal interface for sending python objects to and from Amazon S3. ☆57 Updated 8 years ago
- Small set of utilities to simplify writing Scrapy spiders. ☆49 Updated 9 years ago
- RGP -- Redis Graph via Python ☆30 Updated 9 years ago
- Personal Site Flask based skeleton ☆46 Updated 4 years ago
- [UNMAINTAINED] Deploy, run and monitor your Scrapy spiders. ☆11 Updated 9 years ago
- Tweet Lake is a commandline interface to Twitter Streaming API and big data project that extracts interesting stats out of tweet corpus. ☆20 Updated 2 years ago
- High Level Kafka Scanner ☆19 Updated 7 years ago
- common data interchange format for document processing pipelines that apply natural language processing tools to large streams of text ☆34 Updated 8 years ago
- Reworked https://www.readability.com/ parsing library (now https://mercury.postlight.com/ is living alternative) ☆204 Updated 8 months ago
- A high-performance distributed web crawling & scraping framework written with golang and python. ☆30 Updated 8 years ago
- A powerful analytics python library for Redis. ☆36 Updated 9 years ago
- Python driver for BedquiltDB ☆19 Updated 8 years ago
- Markov Bot based on bigram probabilities to generate tweets from your tweet history. ☆21 Updated 7 years ago
- Aviation grade news article metadata extraction ☆36 Updated last year
- Realtime semantic similarity visualization with gensim, d3.js, and hookbox ☆40 Updated 10 years ago
- Python3.5 async crawler example with aiohttp and asyncio ☆147 Updated 7 years ago
- Bringing sanity to world of messed-up data ☆65 Updated 10 years ago
- Find elements in HTML by matching them with a skeleton ☆25 Updated 2 years ago
- "Scrape Easy" - an extension of the Scrapy framework. ☆188 Updated 8 years ago
- Frontera backend to guide a crawl using PageRank, HITS or other ranking algorithms based on the link structure of the web graph, even whe… ☆55 Updated 8 months ago
- unofficial git mirror of http://svn.whoosh.ca svn repo ☆49 Updated 14 years ago
- Internal Stack Exchange ☆26 Updated 9 years ago
- Analyze the structure and dynamics of an open source project's developer community, using graph algorithms, etc. ☆58 Updated 3 years ago
- Invatar is a free service for generating fully customizable avatars with initials. ☆163 Updated 9 years ago