A curated list of promising web data extractor resources
☆30 · Dec 24, 2019 · Updated 6 years ago
Alternatives and similar repositories for awesome-web-data-extractor
Users who are interested in awesome-web-data-extractor are comparing it to the libraries listed below.
- ☆11 · Feb 26, 2024 · Updated 2 years ago
- European Parliament website Python scraper ☆12 · Oct 19, 2016 · Updated 9 years ago
- Highly concurrent and fast content processing for Mighty Inference Server ☆10 · Feb 6, 2023 · Updated 3 years ago
- Template matching ocr using scanlines and templates. Accuracy more than 80%. Need to improve accuracy and small character recognition mor… ☆18 · Oct 6, 2017 · Updated 8 years ago
- Scrapy spider to recursively crawl for TOR hidden services ☆11 · Oct 12, 2017 · Updated 8 years ago
- A clone of the pinba extension for php, in pure php ☆21 · Dec 13, 2022 · Updated 3 years ago
- Awesome list of the software tools related to opendata: data catalogs, ingestion tools, data prep tools and so on ☆36 · Oct 28, 2025 · Updated 5 months ago
- Collections of Tools, Bookmarks, and other guides created to aid in OSINT collection ☆18 · Aug 18, 2021 · Updated 4 years ago
- This API provides authentication and CRUD operations for data used by the Chronas application ☆13 · Mar 16, 2026 · Updated 2 weeks ago
- Template based form extractor OCR. Train your own character and alphabet OCR. ☆18 · Oct 22, 2018 · Updated 7 years ago
- Statamic v2 Addon to find locations using Google Maps autocomplete. ☆13 · Mar 29, 2020 · Updated 6 years ago
- Self-hosted online diff tool. Alternative to diffchecker.com ☆15 · Sep 24, 2024 · Updated last year
- A static site generator for TEI Publisher ☆13 · Mar 8, 2022 · Updated 4 years ago
- ☆14 · Mar 21, 2026 · Updated last week
- Analogs of Linguistic Structure in Deep Representations ☆19 · Jul 27, 2017 · Updated 8 years ago
- Datasette plugin for inserting and updating data ☆20 · Mar 29, 2024 · Updated 2 years ago
- The installer provides an Ansible playbook for setting up CollectionSpace on an Ubuntu server. ☆11 · Aug 26, 2023 · Updated 2 years ago
- Dataset for EMNLP'23 Paper "DocTrack: A Visually-Rich Document Dataset Really Aligned with Human Eye Movement for Machine Reading" ☆11 · Oct 25, 2023 · Updated 2 years ago
- Tools for European Parliament Data ☆12 · Mar 10, 2026 · Updated 2 weeks ago
- A Google Colab for DFDNet: Blind Face Restoration ☆12 · Aug 9, 2021 · Updated 4 years ago
- UI for extracting data from pdf files using watsonx prompts ☆12 · Sep 18, 2025 · Updated 6 months ago
- ☆16 · Oct 8, 2025 · Updated 5 months ago
- Render a map for any query with a geometry column ☆28 · Aug 10, 2024 · Updated last year
- Command Line Interface for running 🤗 Transformers Image Classification locally ☆19 · May 8, 2025 · Updated 10 months ago
- Make a searchable pdf via Google Cloud Vision OCR ☆14 · Jan 17, 2020 · Updated 6 years ago
- Literary Language Toolkit: code, models, corpora, and web tools ☆11 · Mar 28, 2024 · Updated 2 years ago
- Initiate the awesome keyword research with constant update with practical information gathered daily ☆29 · Dec 14, 2017 · Updated 8 years ago
- A framework for creating digital exhibits by loading collection metadata directly from a CSV (such as a published Google Sheet!). See the… ☆14 · Feb 20, 2026 · Updated last month
- Weaviate's own language vectorizer, which allows for semantic context-based searches in Weaviate ☆17 · Jan 17, 2024 · Updated 2 years ago
- ☆19 · Mar 12, 2025 · Updated last year
- Omeka S Module for storing media in one of several cloud storage services ☆13 · Jun 13, 2024 · Updated last year
- xhprof composer ☆12 · Apr 6, 2022 · Updated 3 years ago
- ☆19 · Apr 5, 2024 · Updated last year
- VIAF via Python ☆13 · Updated this week
- Umbrella repository that describes the collections contained in any given release of ELTeC ☆13 · Jan 26, 2022 · Updated 4 years ago
- Named Entity Disambiguation and Linking ☆16 · May 24, 2024 · Updated last year
- Adds a reconciliation API endpoint to Datasette, based on the Reconciliation Service API specification. ☆24 · Feb 2, 2024 · Updated 2 years ago
- Python package for downloading art and metadata of WikiArt and Google Arts & Culture ☆17 · Apr 15, 2024 · Updated last year
- ☆20 · Jul 22, 2021 · Updated 4 years ago