Simple Scrapy middleware to process non-well-formed HTML with BeautifulSoup
☆21 · Sep 26, 2016 · Updated 9 years ago
Alternatives and similar repositories for scrapy-beautifulsoup
Users that are interested in scrapy-beautifulsoup are comparing it to the libraries listed below.
- A simple ticket-tracking app that can easily be dropped into a new or existing Django site. ☆10 · May 2, 2017 · Updated 8 years ago
- Find which links on a web page are pagination links ☆29 · Jan 12, 2017 · Updated 9 years ago
- Scrapy middleware which allows to crawl only new content ☆79 · Apr 8, 2026 · Updated last week
- Tool to flatten stream of JSON-like objects, configured via schema ☆33 · Oct 19, 2019 · Updated 6 years ago
- Restrict crawl and scraping scope using matchers. ☆26 · Jun 8, 2016 · Updated 9 years ago
- Paginating the web ☆37 · Feb 11, 2014 · Updated 12 years ago
- A Scrapy pipeline to categorize items using MonkeyLearn ☆38 · Apr 28, 2017 · Updated 8 years ago
- A Scrapy extension to log items coverage when the spider shuts down ☆19 · Apr 11, 2020 · Updated 6 years ago
- helper module to export data from a relational database to a graph database (through CSV files) ☆48 · Dec 11, 2013 · Updated 12 years ago
- a tor socks proxy docker image ☆12 · Apr 8, 2026 · Updated last week
- Extract text from HTML ☆135 · Apr 8, 2026 · Updated last week
- A middleware layer for Scrapy that detects CAPTCHA tests and solves them ☆43 · Jul 6, 2023 · Updated 2 years ago
- Sample projects showcasing Scrapinghub tech ☆137 · Feb 14, 2024 · Updated 2 years ago
- Random User-Agent middleware based on fake-useragent ☆689 · Sep 18, 2023 · Updated 2 years ago
- Use pyppeteer from a Scrapy spider ☆59 · Feb 5, 2020 · Updated 6 years ago
- A CLI for dealing with the features of ScrapingHub ☆16 · Apr 20, 2021 · Updated 4 years ago
- bingx api doc ☆34 · Aug 5, 2023 · Updated 2 years ago
- 🎀 A Chrome extension written using Vue and Async/Await. Uses a popup display and changes badge counts. ☆14 · Oct 28, 2024 · Updated last year
- DEPRECATED Export Members of a Facebook Group to a CSV ☆13 · Jun 30, 2020 · Updated 5 years ago
- A rotating socks proxy using Tor, Delegate and Haproxy ☆14 · Apr 8, 2026 · Updated last week
- Database-driven way to put your Django site into maintenance mode. ☆42 · Oct 29, 2022 · Updated 3 years ago
- Simple library for storing Scrapy Items in sqlite database ☆12 · Jan 28, 2016 · Updated 10 years ago
- List of libraries, tools and APIs for web scraping and data processing. ☆13 · Sep 17, 2015 · Updated 10 years ago
- A linter for Scrapy projects. ☆22 · Feb 25, 2026 · Updated last month
- A Python implementation of SCHEMA - An Algorithm for Automated Product Taxonomy Mapping in E-commerce. ☆16 · Feb 3, 2015 · Updated 11 years ago
- General purpose pre-commit hooks used by BestDoctor for Python projects. ☆12 · Jan 18, 2022 · Updated 4 years ago
- Python library to work with proxy server items loaded from local file or network document. ☆18 · Dec 21, 2022 · Updated 3 years ago
- Chrome proxy extension ☆12 · Feb 13, 2018 · Updated 8 years ago
- A job scraper using the Scrapy framework ☆16 · Oct 20, 2017 · Updated 8 years ago
- A very simple mobile-friendly game that teaches CSS selectors. ☆29 · Dec 20, 2022 · Updated 3 years ago
- A RabbitMQ Scheduler for Scrapy ☆87 · Aug 9, 2022 · Updated 3 years ago
- An example project for configuring Djcelery with Flask application and dynamically changing tasks via REST API and through django admin ☆13 · Jan 6, 2022 · Updated 4 years ago
- A cinema website with the ability to leave comments on the site and on each film, watch trailers for each film, convenient purchasing/book… ☆11 · Dec 21, 2017 · Updated 8 years ago
- A GitHub Action that lints Python code with Flake8 then automatically creates pull request reviews if there are any violations. ☆27 · Apr 20, 2022 · Updated 3 years ago
- Convert Javascript code to an XML document ☆188 · Mar 14, 2022 · Updated 4 years ago
- 📮 A server that provides you an infinite number of mailboxes that you can check via a REST API ☆14 · Mar 25, 2022 · Updated 4 years ago
- Python clients for Zyte AutoExtract API ☆41 · Jan 17, 2022 · Updated 4 years ago
- Fast mass dns resolver ☆20 · Jul 23, 2018 · Updated 7 years ago
- Simple LinkedIn jobs crawler using Redis-based Scrapy ☆17 · Mar 18, 2021 · Updated 5 years ago