A command-line tool to cluster HTML pages based on structural and style similarity.
☆20Jan 13, 2026Updated 3 months ago
Alternatives and similar repositories for html-cluster
Users that are interested in html-cluster are comparing it to the libraries listed below.
- Simple heuristic for measuring web page similarity (& data set)☆91Apr 8, 2026Updated last month
- Compare html similarity using structural and style metrics☆218May 11, 2023Updated 2 years ago
- A toolkit for clustering web pages based on various similarity measures.☆34Oct 27, 2021Updated 4 years ago
- An efficient approximation for tree edit-distance.☆45Sep 6, 2011Updated 14 years ago
- JQGram tree edit distance approximation, a JavaScript port of PyGram with some additional functionality☆56Dec 1, 2021Updated 4 years ago
- Basilica client for R☆11Dec 8, 2022Updated 3 years ago
- Repository for the Fall 2020 Computational Social Science Workshop☆13Nov 15, 2020Updated 5 years ago
- Diffs arbitrary HTML inline☆28Mar 12, 2018Updated 8 years ago
- A Python library to detect and extract listing data from HTML pages.☆109May 5, 2017Updated 9 years ago
- Reference implementation for measuring linguistic cultural distances between individuals and groups.☆14Aug 7, 2019Updated 6 years ago
- Package for heterogeneous causal effects in the presence of imperfect compliance (e.g., instrumental variables, fuzzy regression disconti…☆18Mar 6, 2024Updated 2 years ago
- Query the 'PublicWWW' Source Code Search Engine in R☆13May 2, 2018Updated 8 years ago
- This is an exercise for a react workshop☆10Jun 29, 2016Updated 9 years ago
- Generate a redirect map from two sitemaps for website migration.☆13May 4, 2018Updated 8 years ago
- ICD-10 International Statistical Classification of Diseases and Related Health Problems - Ground Truth and some Experimental R Code for N…☆14May 14, 2018Updated 7 years ago
- Sample Crawler for Data Day Seattle☆10Jun 27, 2015Updated 10 years ago
- ☆17Jul 15, 2022Updated 3 years ago
- bk-tree for golang☆11Jul 30, 2022Updated 3 years ago
- A small time-series forecasting app for fun! More models/methods will be included after June 15! Link: jasonliushiny.shinyapps.io/Forc…☆14Nov 8, 2016Updated 9 years ago
- Collect password hashes for cracking☆32Oct 22, 2013Updated 12 years ago
- Community talks and lectures☆12Aug 22, 2020Updated 5 years ago
- School holidays in France☆16Mar 26, 2026Updated last month
- Hotel data scraper from Google Maps service☆11Jul 6, 2022Updated 3 years ago
- A Python 3 module that converts a bs4 Tag into a JSON object (dict)☆16Mar 17, 2026Updated last month
- A small collection of FFMPEG tools which I use while working on Gooey☆15May 28, 2025Updated 11 months ago
- A high performance event log as a service☆11Apr 24, 2017Updated 9 years ago
- A Foxx based geo example using the new (v3.4+) s2 geospatial index☆11Feb 12, 2024Updated 2 years ago
- Google Hacking Database☆10Jan 11, 2019Updated 7 years ago
- pornhub.com crawler that crawls and downloads videos that are publicly available on the website for viewing and downloading☆11Oct 1, 2020Updated 5 years ago
- OCR an image and get a word cloud☆13Feb 10, 2020Updated 6 years ago
- My contribution to the tabs versus spaces and salary debate☆30Jun 21, 2017Updated 8 years ago
- Python 3 implementation and documentation of the Hermina-Janos local graph clustering algorithm.☆24Jan 22, 2023Updated 3 years ago
- Sketching algorithms implemented in Chapel and Python☆10Jun 8, 2017Updated 8 years ago
- ☆19Sep 5, 2013Updated 12 years ago
- Analysis of NBA player stats and salaries of the 2016-17 for the 17-18 season☆10Aug 10, 2017Updated 8 years ago
- This is the web UI for [Colly](https://github.com/gocolly/colly).☆11Apr 10, 2019Updated 7 years ago
- Sentence generation system for evaluating composition, described in Ettinger et al. (2018) "Assessing Composition in Sentence Vector Repr…☆16Apr 25, 2020Updated 6 years ago
- TextMate support for Make☆24Jan 9, 2023Updated 3 years ago
- Shiny app for anomaly detection using AnomalyDetection package.☆11Jul 15, 2019Updated 6 years ago