edx / pa11ycrawler
Python crawler (using Scrapy) that uses Pa11y to check accessibility of pages as it crawls.
☆18Updated 6 years ago
Alternatives and similar repositories for pa11ycrawler
Users who are interested in pa11ycrawler are comparing it to the libraries listed below.
- Scrapy downloader middleware that stores response HTMLs to disk.☆18Updated 3 months ago
- Template for creating a scraper that saves to Google Sheets, fires Slack notifications, and is scheduled using AWS Lambda and CloudWatch☆10Updated 6 years ago
- A Python library to detect and extract listing data from HTML pages.☆108Updated 8 years ago
- API - extract a list of keywords from a text.☆18Updated 8 years ago
- legacy backend for Open States☆87Updated 5 years ago
- Word analysis, by domain, on the Common Crawl data set for the purpose of finding industry trends☆58Updated last year
- Scrapy project with spiders to extract article content from various German news sites☆21Updated 12 years ago
- extract difference between two html pages☆32Updated 7 years ago
- A queue-controlled browser automation tool for improving web crawl quality☆63Updated 2 months ago
- Scrape email-addresses from a user-provided domain☆20Updated 7 years ago
- Easy extraction of keywords and engines from search engine results pages (SERPs).☆92Updated 2 weeks ago
- Chrome extension to allow in-browser decryption of Ansible vaults☆11Updated 4 years ago
- Tools for tracking stories on news homepages☆48Updated 6 years ago
- A client-side transcription text editor for proofreading and correcting the text before realignment on the server.☆19Updated 7 years ago
- Social Feed Manager user interface application.☆156Updated last year
- framework for scraping legislative/government data☆88Updated last year
- Ideas for (tech) stuff to research, build or work on.☆50Updated 10 months ago
- Python/Django based webapps and web user interfaces for search, structure (meta data management like thesaurus, ontologies, annotations a…☆99Updated 3 years ago
- Automatic Item List Extraction☆87Updated 9 years ago
- Trough: Big data, small databases.☆40Updated last year
- "Old SFM" -- manage rules and streams from social data sources, starting with twitter.☆86Updated 2 years ago
- Paginating the web☆37Updated 11 years ago
- Primary LocalWiki backend server environment☆47Updated 7 years ago
- An online annotation platform for teaching and learning in the humanities.☆108Updated 2 months ago
- A classifier for detecting soft 404 pages☆56Updated last month
- Python bot that crawls your website looking for dead stuff☆43Updated 3 years ago
- The news homepage archive☆80Updated 4 years ago
- A library for extracting tables from PDF files☆89Updated 12 years ago
- Data validation as a service. Project retired; go to the current one at frictionsless/repository☆69Updated 2 years ago
- Solr Relevance Ranking Analysis and Visualization Tool☆15Updated 6 years ago