A dataset of popular pages (taken from <dir.yahoo.com>) with manually marked up semantic blocks.
☆15 · Feb 9, 2014 · Updated 12 years ago
Alternatives and similar repositories for dataset-popular
Users who are interested in dataset-popular are comparing it to the repositories listed below
- Tools for web page segmentation. In development ☆17 · Nov 7, 2018 · Updated 7 years ago
- Failover AWS Spot Instances ☆11 · Dec 8, 2017 · Updated 8 years ago
- A distributed in-memory fabric based on shared-memory blocks and datashape. Any language can operate on the data. ☆13 · Feb 12, 2016 · Updated 10 years ago
- Age classification from text using PAN16, blogs, Fisher Callhome, and Cancer Forum ☆18 · Jul 1, 2022 · Updated 3 years ago
- Classifies webpages into categories defined in DMOZ dataset ☆40 · Dec 14, 2015 · Updated 10 years ago
- Agent fixing SWE bench issues ☆19 · May 21, 2024 · Updated last year
- Scalable pattern search optimization with dask ☆22 · Apr 12, 2017 · Updated 8 years ago
- a series of trie testing things ☆21 · Apr 9, 2017 · Updated 8 years ago
- Scrapy Eagle is a tool that allows us to run any Scrapy-based project in a distributed fashion and monitor how it is going and how many… ☆24 · Sep 4, 2020 · Updated 5 years ago
- A simple CRUD wrapper around Amazon DynamoDB ☆24 · Sep 24, 2019 · Updated 6 years ago
- The Clever Algorithms project is an effort to describe a large number of algorithmic techniques from the field of Artificial Intelligence… ☆29 · Oct 28, 2018 · Updated 7 years ago
- Participate in the 4th U.S. National Action Plan for Open Government ☆13 · Jun 8, 2018 · Updated 7 years ago
- The missing datasets manager. Like Homebrew, but for datasets. A CLI tool to search for and discover datasets! ☆41 · May 29, 2017 · Updated 8 years ago
- This library facilitates creating OpenAPI (Swagger) document for Python projects. ☆12 · Jan 4, 2021 · Updated 5 years ago
- 🌩️ The Deep Learning framework based on Lightning ☆11 · Dec 11, 2025 · Updated 2 months ago
- openapi of all third-party ☆10 · Updated this week
- ☆10 · Jun 24, 2020 · Updated 5 years ago
- ICEG: Thematic Working Groups ☆11 · Feb 19, 2026 · Updated 2 weeks ago
- ☆12 · Sep 22, 2015 · Updated 10 years ago
- A CLI for benchmarking Scrapy. ☆32 · Jun 28, 2025 · Updated 8 months ago
- The Linked GTFS vocabulary ☆39 · Mar 20, 2022 · Updated 3 years ago
- WebAnnotator is a tool for annotating Web pages. WebAnnotator is implemented as a Firefox extension (https://addons.mozilla.org/en-US/fi… ☆48 · Dec 17, 2021 · Updated 4 years ago
- Structured Data Extractor. An application to extract structured data from web pages. It uses Data Extraction Based on Partial Tree Alignm… ☆49 · Jun 9, 2012 · Updated 13 years ago
- A starting Python-Flask web app template with accompanying guide ☆12 · Jan 18, 2025 · Updated last year
- ☆11 · Jul 20, 2021 · Updated 4 years ago
- Configuration system geared towards Python ML projects ☆11 · Apr 30, 2023 · Updated 2 years ago
- FFI-based byte buffers for Idris ☆10 · Jun 21, 2019 · Updated 6 years ago
- A sandbox for open-source demonstrations of GitHub ☆14 · Apr 13, 2016 · Updated 9 years ago
- Application for checking the performance of an elevator group system in a building using simulation. ☆12 · Nov 9, 2017 · Updated 8 years ago
- Faster replacement for Python's urlparse module ☆45 · Sep 30, 2018 · Updated 7 years ago
- Incredibly user-friendly seq2seq API and CLI app with beam search, bidirectional, attention, and bucketing in just one single file ☆12 · Sep 16, 2018 · Updated 7 years ago
- Data on Digital Media and Technology Expenditures in the United States Congress ☆10 · Jul 17, 2017 · Updated 8 years ago
- ☆14 · Aug 21, 2020 · Updated 5 years ago
- Enables one Django project to authenticate via a second Django project ***SEEKING CONTRIBUTORS*** ☆11 · May 25, 2022 · Updated 3 years ago
- #BEStartupManifesto website ☆10 · May 9, 2015 · Updated 10 years ago
- Web-based IDE for Python, Scheme, and SQL intended for students taking CS 61A. ☆11 · Dec 10, 2022 · Updated 3 years ago
- ☆12 · Nov 9, 2018 · Updated 7 years ago
- ☆13 · Feb 9, 2026 · Updated last month
- A micropub media endpoint written in Python using Flask and Flask-HashFS ☆11 · Jun 9, 2023 · Updated 2 years ago