d6t / d6tstack
Quickly ingest messy CSV and XLS files. Export to clean pandas, SQL, parquet
☆196 Updated 2 years ago
Alternatives and similar repositories for d6tstack
Users who are interested in d6tstack are comparing it to the libraries listed below.
- A web frontend for scheduling Jupyter notebook reports ☆254 Updated last year
- python automatic data quality check toolkit ☆278 Updated 5 years ago
- Fuzzy joins for python pandas - easily join different datasets ☆59 Updated 5 years ago
- sidetable builds simple but useful summary tables of your data ☆392 Updated 3 years ago
- The goal of pandas-log is to provide feedback about basic pandas operations. It provides simple wrapper functions for the most common fun… ☆218 Updated 4 years ago
- Tough and flexible tools for data analysis, transformation, validation and movement. ☆140 Updated 2 years ago
- locopy: Loading/Unloading to Redshift and Snowflake using Python. ☆115 Updated this week
- 🐳📊🤓 Cookiecutter template to launch an awesome dockerized Data Science toolstack (incl. Jupyter, Superset, Postgres, Minio, AirFlow & … ☆215 Updated 2 years ago
- Utilities for creating ETL pipelines with mara ☆36 Updated 3 years ago
- SQL GUI for JupyterLab ☆430 Updated 3 years ago
- An example mini data warehouse for python project stats, template for new projects ☆178 Updated 5 years ago
- A validation library for Pandas data frames using user-friendly schemas ☆193 Updated 2 years ago
- 🐍💨 Airflow tutorial for PyCon 2019 ☆87 Updated 3 years ago
- SQL upsert using pandas DataFrames for PostgreSQL, SQLite and MySQL with extra features ☆234 Updated 2 years ago
- Push and pull data files like code ☆175 Updated 2 years ago
- Test-Driven Data Analysis Functions ☆303 Updated last week
- Extend pandas to_sql function to perform multi-threaded, concurrent "insert or update" command in memory ☆85 Updated last year
- Accelerate data science ☆118 Updated 4 years ago
- scaffold of Apache Airflow executing Docker containers ☆85 Updated 3 years ago
- dagster scikit-learn pipeline example. ☆46 Updated 2 years ago
- A flexible template for doing reproducible data science in Python. ☆111 Updated last year
- HandySpark - bringing pandas-like capabilities to Spark dataframes ☆197 Updated 6 years ago
- Bulwark is a package for convenient property-based testing of pandas dataframes. ☆226 Updated 5 years ago
- Apache Avro <-> pandas DataFrame ☆137 Updated 5 months ago
- A library for recording and reading data in notebooks. ☆294 Updated 3 years ago
- ☆113 Updated last year
- JupyterHub extension for ContainDS Dashboards ☆201 Updated last year
- Automatically export Jupyter notebooks to various file formats (.py, .html, and more) on save. ☆84 Updated 4 months ago
- High-level wrapper around BCP for high performance data transfers between pandas and SQL Server. No knowledge of BCP required!! ☆136 Updated last week
- Random dataframe and database table generator ☆310 Updated 4 years ago