edx / pa11ycrawler
Python crawler (using Scrapy) that uses Pa11y to check accessibility of pages as it crawls.
☆18 · Updated 5 years ago
Alternatives and similar repositories for pa11ycrawler:
Users who are interested in pa11ycrawler are comparing it to the libraries listed below.
- Small set of utilities to simplify writing Scrapy spiders. ☆49 · Updated 9 years ago
- Find which links on a web page are pagination links ☆29 · Updated 8 years ago
- Discussion Summarization is the process of condensing a text document which is a collection of discussion threads, using CBS (Cluster Bas… ☆12 · Updated 10 years ago
- A whoosh-based CLI indexer and searcher for your files. ☆16 · Updated 8 years ago
- An online sentiment analyzer built with Flask and TextBlob ☆15 · Updated 11 years ago
- GHRecommender - personalized recommendations for GitHub projects based on information about repositories starred by the user ☆26 · Updated 2 years ago
- Pipeline for distributed Natural Language Processing, made in Python ☆65 · Updated 8 years ago
- Programmatically find and read labels using Machine Learning ☆46 · Updated 6 years ago
- A project to demonstrate maximum entropy models for extracting quotes from news articles in Python. ☆25 · Updated 12 years ago
- two strange things to do with neural nets ☆16 · Updated 6 years ago
- This project has 3 goals: To find out the best machine learning pipeline for predicting ASD cases using genetic algorithms, via the TPOT … ☆15 · Updated 4 years ago
- Paginating the web ☆37 · Updated 11 years ago
- Webrecorder's DevTools Protocol Automation Library ☆17 · Updated 2 years ago
- Markdown -> IPython conversion tool ☆15 · Updated 10 years ago
- Automated NLP sentiment predictions- batteries included, or use your own data ☆18 · Updated 7 years ago
- A trend viewer written in Python/JavaScript ☆21 · Updated 4 months ago
- A scrapy extension to store requests and responses information in storage service ☆26 · Updated 3 years ago
- Scraper built with Scrapy. ☆15 · Updated 7 months ago
- Python package to detect and return RSS / Atom feeds for a given website. The tool supports major blogging platforms including WordPress, … ☆21 · Updated 3 years ago
- Search engine base (crawler, indexer and parser) using Python, Celery, RabbitMQ, CouchDB and Whoosh. ☆11 · Updated last year
- Use Pug.js within any python framework ☆16 · Updated 3 years ago
- Simple Web UI for Scrapy spider management via Scrapyd ☆51 · Updated 6 years ago
- Template for creating a scraper that saves to Google Sheets, fires Slack notifications, and is scheduled using AWS Lambda and CloudWatch ☆10 · Updated 6 years ago
- A tool to allow US addresses to be geocoded/georeferenced easily, without using Python or the command line or paid services or anything. ☆17 · Updated 2 years ago
- A platform for tools that do stuff with data ☆56 · Updated 6 years ago
- Site Hound (previously THH) is a Domain Discovery Tool ☆23 · Updated 3 years ago
- Dump of generated texts from GPT-2 trained on /r/legaladvice subreddit titles ☆23 · Updated 5 years ago
- Utilities for working with data. ☆20 · Updated 9 years ago
- Analyze topics and trends in news with NLP ☆49 · Updated 2 years ago
- Machine learning model to recommend related content ☆19 · Updated last year