Modularly extensible semantic metadata validator
☆85 Dec 10, 2015 Updated 10 years ago
Alternatives and similar repositories for schemato
Users who are interested in schemato are comparing it to the libraries listed below
- An online sentiment analyzer built with Flask and TextBlob ☆15 Sep 3, 2013 Updated 12 years ago
- Manage and load dataprotocols.org Data Packages ☆27 Sep 17, 2015 Updated 10 years ago
- Large RDF hierarchies as vector spaces ☆20 Jun 27, 2014 Updated 11 years ago
- Find which links on a web page are pagination links ☆29 Jan 12, 2017 Updated 9 years ago
- Python implementation of the Parsley language for extracting structured data from web pages ☆92 Oct 26, 2017 Updated 8 years ago
- Fast structured perceptron sequential labeler ☆15 Dec 8, 2015 Updated 10 years ago
- Django feeds provides an extensive database model for RSS feeds and a fault tolerant parser. ☆30 Jun 14, 2012 Updated 13 years ago
- The missing datasets manager. Like Homebrew but for datasets. CLI tool to search and discover datasets! ☆41 May 29, 2017 Updated 8 years ago
- Frontera backend to guide a crawl using PageRank, HITS or other ranking algorithms based on the link structure of the web graph, even whe… ☆55 May 21, 2024 Updated last year
- Learn how to construct graphs given representative examples ☆14 Jul 6, 2021 Updated 4 years ago
- Akara is an open-source (Apache2 license) Web framework specialized for RESTful data services, especially involving XML and other semi-st… ☆25 Dec 19, 2013 Updated 12 years ago
- A platform for tools that do stuff with data ☆56 Feb 14, 2019 Updated 7 years ago
- Detect and classify pagination links ☆15 Sep 9, 2020 Updated 5 years ago
- mltk - Moz Language Tool Kit ☆12 Mar 6, 2015 Updated 11 years ago
- OMGeocoder - A python geocoding abstraction layer ☆37 Apr 19, 2024 Updated last year
- Easy extraction of keywords and engines from search engine results pages (SERPs). ☆93 Oct 20, 2025 Updated 4 months ago
- Asynchronous HTTP client built on top of Crochet and Twisted ☆20 Mar 3, 2021 Updated 5 years ago
- Python module for extracting information from web pages ☆41 Jun 12, 2014 Updated 11 years ago
- Seki is middleware/a front-end for connecting to an independent SPARQL server using node.js ☆36 Jan 11, 2015 Updated 11 years ago
- Windfarm Operations & Maintenance cost-Benefit Analysis Tool ☆28 Feb 10, 2026 Updated 3 weeks ago
- Metric based in-memory circuit breaker for python ☆23 Feb 6, 2017 Updated 9 years ago
- Feed discovery to share :) ☆41 Oct 28, 2016 Updated 9 years ago
- Crochet-based blocking API for Scrapy. ☆46 Feb 24, 2017 Updated 9 years ago
- Seed acquisition tool to bootstrap focused crawlers ☆23 Apr 24, 2017 Updated 8 years ago
- Commit Log as a Service (zhCN) ☆20 Feb 3, 2026 Updated last month
- Intelligent RSS news aggregator. ☆33 Oct 20, 2023 Updated 2 years ago
- OpenBlock is a web application and RESTful service that allows users to browse and search their local area for "hyper-local news" ☆61 Jun 10, 2021 Updated 4 years ago
- feedparser but faster and worse ☆104 Sep 28, 2021 Updated 4 years ago
- Scrapy Eagle is a tool that allows us to run any Scrapy-based project in a distributed fashion and monitor how it is going and how many… ☆24 Sep 4, 2020 Updated 5 years ago
- Neddick: Open Source Information Discovery Platform ☆36 Mar 15, 2023 Updated 2 years ago
- Navigating around a grid of cells like XPath for spreadsheets; supports Python 3.5+ ☆48 Feb 1, 2023 Updated 3 years ago
- Jabba's headless webkit browser for scraping AJAX-powered webpages. ☆90 Oct 23, 2014 Updated 11 years ago
- A very naive classifier to figure out if a sentence contains dirty words ☆33 Jul 7, 2015 Updated 10 years ago
- Cloud Mining automatically builds exploratory faceted search systems. ☆52 Oct 15, 2013 Updated 12 years ago
- The goal of this experiment is to take articles and certain metadata and group them by topic. ☆11 Apr 14, 2016 Updated 9 years ago
- Probabilistic Data Structures in Python (originally presented at PyData 2013) ☆55 Jan 6, 2022 Updated 4 years ago
- Computation Graph framework implemented using only NumPy ☆10 Mar 31, 2024 Updated last year