A dataset of popular pages (taken from <dir.yahoo.com>) with manually marked up semantic blocks.
☆15Feb 9, 2014Updated 12 years ago
Alternatives and similar repositories for dataset-popular
Users that are interested in dataset-popular are comparing it to the libraries listed below.
- Tools for web page segmentation. In development☆17Nov 7, 2018Updated 7 years ago
- A fork of http://pydispatcher.sourceforge.net/ with PyPy support☆16Jul 3, 2017Updated 8 years ago
- A distributed in-memory fabric based on shared-memory blocks and datashape. Any language can operate on the data.☆13Feb 12, 2016Updated 10 years ago
- Failover AWS Spot Instances☆11Dec 8, 2017Updated 8 years ago
- Web page segmentation and noise removal☆55Feb 4, 2024Updated 2 years ago
- Classifies webpages into categories defined in DMOZ dataset☆39Dec 14, 2015Updated 10 years ago
- Agent fixing SWE bench issues☆19May 21, 2024Updated last year
- ⚙️ Generate Components.js component files from TypeScript☆14Apr 29, 2026Updated last week
- 🗿Stones: Persistent key-value containers, compatible with Python dict☆17Jul 15, 2024Updated last year
- Age classification from text using PAN16, blogs, Fisher Callhome, and Cancer Forum☆18Jul 1, 2022Updated 3 years ago
- Data science tools from Moz☆23Jan 11, 2017Updated 9 years ago
- A repository to work on the transmodel ontology that provides support to the NeTEx model☆11Feb 17, 2021Updated 5 years ago
- ☆13Feb 9, 2026Updated 3 months ago
- A CLI for benchmarking Scrapy.☆32Jun 28, 2025Updated 10 months ago
- A benchmark for Solid to simulate vaults with social network data.☆11Aug 29, 2025Updated 8 months ago
- A ternary search tree for Node.js☆11Feb 28, 2026Updated 2 months ago
- Ontology for GPX data in RDF☆15Mar 12, 2020Updated 6 years ago
- TwoFold (2✂︎f). Text files breathe fire.☆23Jan 28, 2026Updated 3 months ago
- ☆14Jun 27, 2019Updated 6 years ago
- Unofficial documentation of the NMBS/SNCB API☆13Apr 7, 2015Updated 11 years ago
- Deprecated! Use the rdf-connect/ldes-client instead☆14Mar 5, 2024Updated 2 years ago
- Scalable pattern search optimization with dask☆22Apr 12, 2017Updated 9 years ago
- mayktso: encounters at an endpoint☆20Jul 8, 2024Updated last year
- FFI-based byte buffers for Idris☆10Jun 21, 2019Updated 6 years ago
- PhD thesis: "Knowledge Graph Construction from Heterogeneous Data Sources exploiting Declarative Mapping Rules"☆14Mar 24, 2022Updated 4 years ago
- WhimApp TSP (Transport Service Provider) Open API☆17Apr 4, 2020Updated 6 years ago
- Implementation of Collins' perceptron for structured prediction☆16Mar 10, 2025Updated last year
- Kaggle competition results☆20Jan 4, 2019Updated 7 years ago
- 🔍 Code Search Tools & Experiments☆12Mar 1, 2026Updated 2 months ago
- ☆11Jul 20, 2021Updated 4 years ago
- A source for GTFS feed files available nowhere else. URLs are stable. To send in an updated archive or add a new feed, e-mail file to hel…☆29Mar 19, 2026Updated last month
- node.js client for nsq☆24Jan 9, 2017Updated 9 years ago
- ☆13Dec 2, 2021Updated 4 years ago
- ☆10Jun 24, 2020Updated 5 years ago
- Web-based synthesis of nifty NLP and entity extraction services☆13Oct 25, 2019Updated 6 years ago
- a series of trie testing things☆21Apr 9, 2017Updated 9 years ago
- Our solution of the Kaggle Abstraction and Reasoning Challenge☆23May 30, 2020Updated 5 years ago
- Information about the CodedotAI reading group sessions.☆12Aug 16, 2021Updated 4 years ago
- WebAnnotator is a tool for annotating Web pages. WebAnnotator is implemented as a Firefox extension (https://addons.mozilla.org/en-US/fi…☆48Dec 17, 2021Updated 4 years ago