xissy / node-boilerpipe
A node.js wrapper for Boilerpipe, an excellent Java library for boilerplate removal and fulltext extraction from HTML pages.
☆52 · Updated 7 years ago
Related projects
Alternatives and complementary repositories for node-boilerpipe
- Friendly web crawler for x-ray ☆44 · Updated last year
- A simple-but-useful kNN library for NodeJS, comparing JSON Objects using Euclidean distances ☆215 · Updated 9 years ago
- Node.js module to extract and summarize html content ☆42 · Updated 10 years ago
- Crawler is a web spider written with Nodejs. It gives you the full power of jQuery on the server to parse a big number of pages as they a… ☆41 · Updated 7 years ago
- Streaming uploads to Amazon Web Service(AWS) S3 for NodeJS ☆78 · Updated 2 years ago
- A simple node.js wrapper for stanford-core-nlp. ☆148 · Updated 7 years ago
- phantom driver for x-ray. ☆111 · Updated 8 years ago
- A 2nd generation spider to crawl any article site, automatic read title and article. ☆43 · Updated 8 years ago
- NLP utilities in javascript and coffeescript ☆37 · Updated 10 years ago
- Create thumbnails from images, video, audio and web pages. ☆123 · Updated 8 years ago
- A helper robot written in node javascript ☆74 · Updated 12 years ago
- Martin Porter's stemmer for node.js ☆100 · Updated 4 years ago
- NodeJS Named Entity Recognition, using Stanford NER (easy install) ☆40 · Updated 7 years ago
- Node library to extract keywords from text ☆58 · Updated 9 years ago
- remote monitoring and debugging for socket.io ☆450 · Updated 9 years ago
- fetch & parse ATOM & RSS feeds with Node.js ☆74 · Updated 5 years ago
- js utility for summarizing large bodies of text using a basic sentence relevance ranking algorithm ☆101 · Updated 8 years ago
- A simple node.js wrapper for Stanford CoreNLP. ☆75 · Updated 2 years ago
- ExtractContent for node.js ☆15 · Updated 5 years ago
- ☆27 · Updated 6 years ago
- A minimalistic Disque client using modern Node.js. ☆52 · Updated 8 years ago
- A PredictionIO 0.9+ client ☆60 · Updated 6 years ago
- node.js wrapper for the Diffbot API (article and frontpage) ☆35 · Updated 8 years ago
- tools for working with Princeton's lexical database WordNet ☆74 · Updated 6 years ago
- Node npm for web scraping purposes. It scrapes a given URL, and returns you its title, meta description, meta keywords, an array with all… ☆129 · Updated 5 years ago
- Simple ACL for Sequelize ☆32 · Updated 7 years ago
- Extract the content of any web page by using various content extractor libraries. ☆10 · Updated 8 years ago