edx / pa11ycrawler
Python crawler built on Scrapy that uses Pa11y to check the accessibility of pages as it crawls.
☆17Updated 5 years ago
Alternatives and similar repositories for pa11ycrawler:
Users who are interested in pa11ycrawler are comparing it to the libraries listed below
- Find which links on a web page are pagination links☆29Updated 8 years ago
- Pipeline for distributed Natural Language Processing, made in Python☆65Updated 7 years ago
- This is a REST Server endpoint built using Flask and Python.☆24Updated 2 years ago
- A component that tries to avoid downloading duplicate content☆27Updated 6 years ago
- WebAnnotator is a tool for annotating Web pages. WebAnnotator is implemented as a Firefox extension (https://addons.mozilla.org/en-US/fi…☆48Updated 3 years ago
- Extract the difference between two HTML pages☆32Updated 6 years ago
- Common data interchange format for document processing pipelines that apply natural language processing tools to large streams of text☆34Updated 8 years ago
- Spell correct entire sentences using nltk freqdist and symspell☆19Updated 7 years ago
- Aho-Corasick string replacement utility☆24Updated 5 years ago
- A project to demonstrate maximum entropy models for extracting quotes from news articles in Python.☆25Updated 12 years ago
- Traptor -- A distributed Twitter feed☆26Updated 2 years ago
- Automated NLP sentiment predictions: batteries included, or use your own data☆18Updated 7 years ago
- Python binding for gumbo-parser using Cython☆14Updated 8 years ago
- ☆18Updated 8 years ago
- iCQA - Intelligent Community Question Answering Framework☆32Updated 8 years ago
- Scrapy pipeline which allows you to store Scrapy items in a Solr server.☆19Updated 8 years ago
- ☆16Updated 8 years ago
- A Simple API for RDF☆29Updated 15 years ago
- Small set of utilities to simplify writing Scrapy spiders.☆49Updated 9 years ago
- GHRecommender - personalized recommendations for GitHub projects based on information about repositories starred by the user☆26Updated 2 years ago
- Efficiently search the most similar strings against the query in Python.☆18Updated 6 years ago
- Demo of the Newspaper article extraction library.☆29Updated 10 years ago
- High Level Kafka Scanner☆19Updated 7 years ago
- NLP crowdsourcing platform for word-level annotations☆11Updated 5 years ago
- Site Hound (previously THH) is a Domain Discovery Tool☆23Updated 3 years ago
- Easy language identification of 380 languages☆18Updated 5 years ago
- Seamless HTML table extraction for Python☆20Updated 8 years ago
- A Scrapy extension to store request and response information in a storage service☆26Updated 2 years ago
- Deep learning certificate part 1☆10Updated 2 years ago
- Search engine base (crawler, indexer and parser) using Python, Celery, RabbitMQ, CouchDB and Whoosh.☆11Updated last year