srijiths/readabilityBUNDLE

A bundle of html content extraction algorithms

☆121

Alternatives and similar repositories for readabilityBUNDLE

Users that are interested in readabilityBUNDLE are comparing it to the libraries listed below.

karussell / snacktory
Readability clone in Java
☆462 · Oct 13, 2020 · Updated 5 years ago
java10000 / semantic_similarity_based_on_ANN
Research on computing Chinese semantic similarity based on artificial neural networks
☆11 · Apr 1, 2013 · Updated 13 years ago
kostyll / summary.js
JS module for making short summary of some text
☆13 · Nov 3, 2014 · Updated 11 years ago
hfut-dmic / ContentExtractor
Algorithm for automatically extracting the main text of web pages, implemented in Java
☆111 · Apr 18, 2017 · Updated 9 years ago
reorx / cx-extractor
Automatically exported from code.google.com/p/cx-extractor
☆29 · Apr 1, 2015 · Updated 11 years ago
dragnet-org / dragnet
Just the facts -- web page content extraction
☆1,274 · Jul 8, 2025 · Updated last year
robbypond / boilerpipe
boilerpipe 1.2.2 - a fork from 1.2.0 with additional features
☆10 · Nov 2, 2016 · Updated 9 years ago
shenbaise / goodcrawler
Web crawler
☆50 · Mar 18, 2014 · Updated 12 years ago
kohlschutter / boilerpipe
Work in progress transmit from Google Code
☆1,126 · Jan 3, 2018 · Updated 8 years ago
stanzhai / Html2Article
HTML web page main-text extraction
☆496 · May 9, 2022 · Updated 4 years ago
jiminoc / goose
Html Content / Article Extractor in Scala - open sourced from Gravity Labs - http://gravity.com
☆341 · Aug 20, 2019 · Updated 6 years ago
tpopela / vips_java
Implementation of Vision Based Page Segmentation algorithm in Java
☆107 · Oct 25, 2019 · Updated 6 years ago
fahimk / ReadIt
a readability client for android
☆25 · Jan 23, 2012 · Updated 14 years ago
rodricios / eatiht
An exercise in unsupervised machine learning: Extract Article's Text in HTml documents.
☆430 · Jan 16, 2026 · Updated 6 months ago
tomayac / wikipedia-live-monitor
Wikipedia Live Monitor
☆22 · Dec 21, 2024 · Updated last year
bahn / wikitopics
Exploits Wikipedia's daily view counts to find out what topics are current trends
☆17 · May 7, 2013 · Updated 13 years ago
evolvingstuff / SimpleLSTM
A recurrent neural network heavily inspired by Long Short Term Memory, but simpler.
☆21 · May 4, 2013 · Updated 13 years ago
lgomez / eventsourced
Event sourcing JavaScript entity class
☆11 · Apr 24, 2017 · Updated 9 years ago
skuenzli / time-series-analysis
Contains tools for analyzing time-series data.
☆11 · May 8, 2013 · Updated 13 years ago
mdorn / proose
A Prudence-based web services API for the Goose HTML content extraction library
☆38 · Jul 17, 2011 · Updated 15 years ago
heavysheep / webEYE
Identifies and extracts elements such as body text, title, and time from static web pages with different templates
☆15 · Dec 28, 2016 · Updated 9 years ago
marcoschwartz / node-aREST
Node.js module for the aREST framework
☆11 · Sep 25, 2018 · Updated 7 years ago
erich-oliveira-winnin / example-studio-microservices
Example of studio microservices
☆11 · May 12, 2016 · Updated 10 years ago
realm-js / realm-js
☆13 · Mar 1, 2024 · Updated 2 years ago
bookieio / breadability
Reworked https://www.readability.com/ parsing library (now https://mercury.postlight.com/ is living alternative)
☆205 · May 9, 2024 · Updated 2 years ago
kxtells / vague-places
☆14 · Dec 24, 2016 · Updated 9 years ago
bestguy / redux-ractive-qlock
Redux and RactiveJS example
☆10 · Mar 8, 2016 · Updated 10 years ago
amumu-dev / cx-extractor
clone of https://code.google.com/p/cx-extractor
☆37 · Sep 26, 2013 · Updated 12 years ago
sarendipitee / ractive-datepicker
A datepicker component for RactiveJs
☆10 · Jun 26, 2018 · Updated 8 years ago
NLPchina / SinaMicroBlogCrawl
Sina Weibo simulated login, 2014-04-01 version
☆21 · Apr 1, 2014 · Updated 12 years ago
silverbucket / activity-streams.js
THIS REPO HAS BEEN MOVED TO https://github.com/sockethub/sockethub - a simple tool to facilitate handling and referencing activity stream…
☆11 · Dec 30, 2019 · Updated 6 years ago
subchen / jetbrick-template-2x-samples
Samples for jetbrick-template-2x
☆10 · Mar 17, 2017 · Updated 9 years ago
thingSoC / thingSoC
thingSoC - Open Source Sockets for the Internet of Things
☆16 · Oct 29, 2016 · Updated 9 years ago
badaozhai / wechat_webdriver_spider
Java tool that scrapes Sogou WeChat official account articles using Selenium
☆50 · Nov 16, 2015 · Updated 10 years ago
tomfaulhaber / geo-window
Simple spatio-temporal windowing in Kafka Streams
☆13 · Jul 14, 2016 · Updated 10 years ago
mleoking / LeoTask
Lightweight-Productive-Reliable parallel task running and results aggregation (MapReduce on multicore)
☆10 · Jun 10, 2018 · Updated 8 years ago
nakfoury / TwInfluence
Generates visualizations of influential tweets about a given hashtag.
☆11 · Jun 1, 2017 · Updated 9 years ago
chenkai1100 / SpiderFrame
Distributed web crawler architecture
☆16 · Sep 26, 2016 · Updated 9 years ago
luin / readability
📚 Turn any web page into a clean view
☆2,521 · Apr 3, 2021 · Updated 5 years ago