🤖 Scrape data from HTML websites automatically by just providing examples
☆1,385Mar 17, 2024Updated 2 years ago
Alternatives and similar repositories for mlscraper
Users that are interested in mlscraper are comparing it to the libraries listed below.
- A Smart, Automatic, Fast and Lightweight Web Scraper for Python☆7,297Jun 9, 2025Updated last year
- 🚀🚀🚀 feapder is an easy-to-use, powerful Python crawler framework. It has four built-in spiders: AirSpider, Spider, TaskSpider, and BatchSpider…☆3,713Jun 1, 2026Updated 3 weeks ago
- Python & Command-line tool to gather text and metadata on the Web: Crawling, scraping, extraction, output as CSV, JSON, HTML, MD, TXT, XM…☆6,203Updated this week
- 🐸 Identify anything. pyWhat easily lets you identify emails, IP addresses, and more. Feed it a .pcap file or some text and it'll tell …☆7,241Oct 31, 2023Updated 2 years ago
- A fork of Dragnet that also extracts author, headline, date, and keywords from context, as well as built-in metadata extraction, all in one pac…☆298May 19, 2025Updated last year
- Python scraper based on AI☆27,473Jun 23, 2026Updated last week
- Crawlee—A web scraping and browser automation library for Node.js to build reliable crawlers. In JavaScript and TypeScript. Extract data …☆24,227Updated this week
- Distributed web crawler admin platform for managing spiders regardless of language or framework.☆12,227Feb 10, 2026Updated 4 months ago
- dude uncomplicated data extraction: A simple framework for writing web scrapers using Python decorators☆425Mar 16, 2025Updated last year
- Scrapyd on container infrastructure☆16May 29, 2026Updated last month
- Low-code platform that allows you to build business apps and quickly create internal tools such as dashboards, CRUD apps, admin pane…☆12,280May 27, 2026Updated last month
- AI-based web extractor☆12Feb 25, 2023Updated 3 years ago
- AI based web-wrapper for web-content-extraction☆102Feb 6, 2023Updated 3 years ago
- Magical Spider 🕷, a scraping solution that works for almost all websites☆349Aug 23, 2022Updated 3 years ago
- General-purpose extractor for the body text of news web pages, beta version.☆3,781Apr 21, 2026Updated 2 months ago
- App to easily query, script, and visualize data from every database, file, and API.☆2,955Nov 10, 2023Updated 2 years ago
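- Practical Python tutorials, including: Python basics, advanced Python features, object-oriented programming, multithreading, databases, data science, Flask, and web crawler development.☆2,444May 1, 2023Updated 3 years ago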
- 💡 All-in-one AI framework for semantic search, LLM orchestration and language model workflows☆12,683Jun 22, 2026Updated last week
- 👻 Experimental library for scraping websites using OpenAI's GPT API.☆1,444Jan 14, 2026Updated 5 months ago
- playwright stealth☆968Jul 29, 2024Updated last year
- List of libraries, tools and APIs for web scraping and data processing.☆267Mar 12, 2026Updated 3 months ago
- A collection of pipelines for Scrapy☆16Apr 27, 2026Updated 2 months ago
- 🗃 Open source self-hosted web archiving. Takes URLs/browser history/bookmarks/Pocket/Pinboard/etc., saves HTML, JS, PDFs, media, and mor…☆27,782Updated this week
- Python version of the Playwright testing and automation library.☆14,776Updated this week
- List of libraries, tools and APIs for web scraping and data processing.☆7,948May 28, 2026Updated last month
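- Regularly sharing high-quality, interesting, and practical open-source technical tutorials, developer tools, programming websites, and tech news from GitHub. A list of cool, interesting projects on GitHub.☆46,872Dec 31, 2025Updated 6 months ago
- Edit videos with a text editor☆7,747Oct 5, 2024Updated last year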
- Auto Extractor Module☆338Aug 19, 2024Updated last year
- 🔥 🔥 🔥 A Free & Self-hostable Airtable Alternative☆63,651Updated this week
- A browser extension for automating your browser by connecting blocks☆21,443Mar 2, 2026Updated 3 months ago
- Image inpainting tool powered by SOTA AI Model. Remove any unwanted object, defect, people from your pictures or erase and replace(powere…☆23,247Apr 29, 2025Updated last year
- GUI-based Python code generator for data science, extension to Jupyter Lab, Jupyter Notebook and Google Colab.☆917Jul 3, 2024Updated last year
- An intelligent web service to automatically detect web content and extract information from it.☆86Jul 13, 2023Updated 2 years ago
- 🕸️ Web apps in pure Python 🐍☆28,610Updated this week
- The web scraping open project repository aims to share knowledge and experiences about web scraping with Python☆1,725May 27, 2024Updated 2 years ago
- HTTP proxy with per-request uTLS fingerprint mimicry and upstream proxy tunneling. Currently WIP.☆54Jan 14, 2024Updated 2 years ago
- Obsei is a low code AI powered automation tool. It can be used in various business flows like social listening, AI based alerting, brand …☆1,412Feb 4, 2026Updated 4 months ago
- Turn (almost) any Python command line program into a full GUI application with one line☆21,897Mar 23, 2026Updated 3 months ago
- newspaper3k is a news, full-text, and article metadata extraction library in Python 3. Advanced docs:☆15,081May 13, 2026Updated last month