Simple Scrapy middleware to process non-well-formed HTML with BeautifulSoup
☆21 · Sep 26, 2016 · Updated 9 years ago
Alternatives and similar repositories for scrapy-beautifulsoup
Users that are interested in scrapy-beautifulsoup are comparing it to the libraries listed below.
- mongo docker with auth ☆12 · Jul 24, 2018 · Updated 7 years ago
- Restrict crawl and scraping scope using matchers. ☆26 · Jun 8, 2016 · Updated 9 years ago
- Paginating the web ☆37 · Feb 11, 2014 · Updated 12 years ago
- A library to make it easier to load input URLs to start scrapy processes ☆14 · Feb 21, 2021 · Updated 5 years ago
- A Scrapy extension to log items coverage when the spider shuts down ☆19 · Apr 11, 2020 · Updated 5 years ago
- helper module to export data from a relational database to a graph database (through CSV files) ☆48 · Dec 11, 2013 · Updated 12 years ago
- Extract text from HTML ☆135 · Feb 10, 2026 · Updated last month
- A middleware layer for Scrapy that detects CAPTCHA tests and solves them ☆44 · Jul 6, 2023 · Updated 2 years ago
- Sample projects showcasing Scrapinghub tech ☆137 · Feb 14, 2024 · Updated 2 years ago
- Random User-Agent middleware based on fake-useragent ☆689 · Sep 18, 2023 · Updated 2 years ago
- Use pyppeteer from a Scrapy spider ☆59 · Feb 5, 2020 · Updated 6 years ago
- ESLint rules for Protractor ☆51 · Dec 30, 2022 · Updated 3 years ago
- A CLI for dealing with the features of ScrapingHub ☆16 · Apr 20, 2021 · Updated 4 years ago
- ☆11 · Oct 5, 2017 · Updated 8 years ago
- 🎀 A Chrome extension written using Vue and Async/Await. Uses a popup display and changes badge counts. ☆14 · Oct 28, 2024 · Updated last year
- Scrapes a given Facebook user's feed for messages, tags, likes, and datetimes of submissions. ☆10 · Jul 3, 2013 · Updated 12 years ago
- A rotating socks proxy using Tor, Delegate and Haproxy ☆14 · Feb 10, 2026 · Updated last month
- Repo of student materials for the General Assembly Data Science Course ☆13 · Sep 30, 2016 · Updated 9 years ago
- Proxy-list management application for Django ☆23 · Mar 5, 2018 · Updated 8 years ago
- Database-driven way to put your Django site into maintenance mode. ☆42 · Oct 29, 2022 · Updated 3 years ago
- Sentry component for Scrapy ☆86 · Aug 21, 2023 · Updated 2 years ago
- List of libraries, tools and APIs for web scraping and data processing. ☆13 · Sep 17, 2015 · Updated 10 years ago
- Pricing European and American options with jump models using CUDA on the GPU ☆12 · Apr 12, 2016 · Updated 9 years ago
- A linter for Scrapy projects. ☆21 · Feb 25, 2026 · Updated last month
- A Python implementation of SCHEMA - An Algorithm for Automated Product Taxonomy Mapping in E-commerce. ☆16 · Feb 3, 2015 · Updated 11 years ago
- General purpose pre-commit hooks used by BestDoctor for Python projects. ☆12 · Jan 18, 2022 · Updated 4 years ago
- A library of nice to have things not found in the current mojo stdlib ☆14 · Feb 4, 2026 · Updated last month
- Easy `inlets` client execution. ☆12 · Jun 6, 2020 · Updated 5 years ago
- Python library to work with proxy server items loaded from local file or network document. ☆17 · Dec 21, 2022 · Updated 3 years ago
- Chrome proxy extension ☆12 · Feb 13, 2018 · Updated 8 years ago
- An example project for configuring Djcelery with Flask application and dynamically changing tasks via REST API and through django admin ☆13 · Jan 6, 2022 · Updated 4 years ago
- A cinema website with the ability to leave comments on the site and on each film, watch trailers for each film, and conveniently buy/book… ☆11 · Dec 21, 2017 · Updated 8 years ago
- Firmware for Nitrokey FIDO U2F ☆21 · Jan 18, 2021 · Updated 5 years ago
- A GitHub Action that lints Python code with Flake8 then automatically creates pull request reviews if there are any violations. ☆27 · Apr 20, 2022 · Updated 3 years ago
- Convert Javascript code to an XML document ☆187 · Mar 14, 2022 · Updated 4 years ago
- Attempting to create a program capable of combining stereo video input, with motors and other sensors on a PC running linux, the targe… ☆23 · Apr 23, 2018 · Updated 7 years ago
- 📮 A server that provides you an infinite number of mailboxes that you can check via a REST API ☆14 · Mar 25, 2022 · Updated 4 years ago
- Fast mass dns resolver ☆20 · Jul 23, 2018 · Updated 7 years ago
- Scrapy spider middleware to split an item into multiple items using a multi-valued key ☆21 · Feb 8, 2017 · Updated 9 years ago