szelenka / prefect-webscraper-example
Quick and dirty example of using Prefect Core to scrape a website
☆24 Updated 5 years ago
Alternatives and similar repositories for prefect-webscraper-example
Users who are interested in prefect-webscraper-example are comparing it to the libraries listed below.
- A maximum-strength name parser for record linkage. ☆39 Updated 4 months ago
- This repository is part of an article "Prefect workflow automation with Azure DevOps and AKS" ☆30 Updated 4 years ago
- Curated list of awesome software and resources for Senzing, The First Real-Time AI for Entity Resolution. ☆65 Updated this week
- ☆16 Updated last year
- Repo demonstrating a Dagster pipeline to generate Neo4j Graph ☆22 Updated 4 years ago
- A browser user interface for manual labeling of record pairs. ☆48 Updated 2 years ago
- CLI for creating databases for Data Quality Dashboards. ☆19 Updated 6 years ago
- Python API for parsehub.com web scraping service ☆46 Updated 7 years ago
- 🏗️ Create APIs from CSV files within seconds, using fastapi ☆79 Updated 4 years ago
- How to use Python to understand data and transform the data into a tidy format ready to be used for modelling and visualisation. ☆36 Updated 6 years ago
- A small Python module containing quick utility functions for standard ETL processes. ☆37 Updated 3 weeks ago
- 💾 Script to import issues from a JIRA instance into a database. ☆57 Updated 3 years ago
- ☆27 Updated 3 weeks ago
- This is the code accompanying the blog article on makeitnew.io. It defines a Prefect flow which can be visualized, run locally or registe… ☆29 Updated 5 years ago
- Scraping Assisted by Learning ☆36 Updated 3 months ago
- A simple command line interface to the datamade/dedupe library. ☆43 Updated 3 years ago
- Now included in rigour ☆152 Updated last month
- GraphiPy: Universal Social Data Extractor ☆82 Updated 3 years ago
- Simple samples for writing ETL transform scripts in Python ☆24 Updated 2 weeks ago
- Postgres utility package for dbt (getdbt.com) ☆19 Updated 3 weeks ago
- Extract city and country mentions from text, like GeoText, but using FlashText (an Aho-Corasick implementation) instead of regex. ☆62 Updated this week
- Python package for performing deduplication using flexible text matching and cleaning in pandas dataframes ☆25 Updated 5 years ago
- Centralized whale instance using github actions, sourcing metadata from bigquery-public-data. ☆17 Updated last year
- A simple HTML table scraper made with Python and the amazing Streamlit! ☆20 Updated 2 years ago
- Python ELT Studio, an application for building ELT (and ETL) data flows. ☆58 Updated 4 years ago
- Docker template for basic data science packages to interface with Neo4j ☆14 Updated 4 years ago
- data wrangling simplicity, complete audit transparency, and at speed ☆35 Updated 3 months ago
- Tools for working with Singer Taps and Targets ☆61 Updated last year
- Building a Job Dataset ☆23 Updated 3 years ago
- Python wrapper for a C++ Double Metaphone ☆15 Updated last month