edx / pa11ycrawler
Python crawler (using Scrapy) that uses Pa11y to check accessibility of pages as it crawls.
☆17 Updated 5 years ago
Related projects
Alternatives and complementary repositories for pa11ycrawler
- A component that tries to avoid downloading duplicate content ☆27 Updated 6 years ago
- Spell correct entire sentences using nltk freqdist and symspell ☆19 Updated 7 years ago
- Find which links on a web page are pagination links ☆29 Updated 7 years ago
- Automated NLP sentiment predictions- batteries included, or use your own data ☆18 Updated 6 years ago
- Create, edit and use hierarchical taxonomies in Plone! ☆19 Updated 2 weeks ago
- Aho-Corasick string replacement utility ☆23 Updated 4 years ago
- Akvo Really Simple Reporting ☆39 Updated last week
- Tools to manipulate and extract data from wikipedia dumps ☆45 Updated 11 years ago
- Markdown -> IPython conversion tool ☆15 Updated 9 years ago
- Dump of generated texts from GPT-2 trained on /r/legaladvice subreddit titles ☆23 Updated 5 years ago
- extract difference between two html pages ☆32 Updated 6 years ago
- Small set of utilities to simplify writing Scrapy spiders. ☆49 Updated 9 years ago
- Slides to learn a little natural language processing (NLP) with Python. Written in reST with S5/Docutils. ☆28 Updated 12 years ago
- Plots various graphs for a series of plaintext files in a directory ☆19 Updated 8 years ago
- Word analysis, by domain, on the Common Crawl data set for the purpose of finding industry trends ☆56 Updated 9 months ago
- Python package to detect and return RSS / Atom feeds for a given website. The tool supports major blogging platform including Wordpress, … ☆21 Updated 3 years ago
- Pipeline for distributed Natural Language Processing, made in Python ☆65 Updated 7 years ago
- Measure is scripts and conventions to build KPI dashboards for projects. ☆17 Updated 4 years ago
- Scrapy extension which writes crawled items to Kafka ☆30 Updated 6 years ago
- Integration between Reaction ECommerce and Accelerated Text to provide product descriptions for an e-shop. ☆9 Updated 3 years ago
- A search engine for Open Data ☆53 Updated last year
- 🌆 TouristFriend API lets you query Google Places, Yelp and Foursquare at the same time, with Bayesian rankings! ☆29 Updated 5 years ago
- An online sentiment analyzer built with Flask and TextBlob ☆15 Updated 11 years ago
- Techniques for Scraping the Web in Python ☆25 Updated 6 years ago
- Minimum Entropy is a DDL hosted question/answer site for beginners who need answers to Data Science questions. ☆16 Updated 8 years ago
- An attempt at creating a silver/gold standard dataset for backtesting yesterday & today's content-extractors ☆34 Updated 9 years ago
- Search engine base (crawler, indexer and parser) using Python, Celery, RabbitMQ, CouchDB and Whoosh. ☆11 Updated last year
- NLP crowdsourcing platform for word-level annotations ☆11 Updated 5 years ago