Wittline / csv-schema-inference
A tool to automatically infer column data types in .csv files
☆37Updated 2 years ago
Alternatives and similar repositories for csv-schema-inference
Users who are interested in csv-schema-inference are comparing it to the libraries listed below
- Opinionated JSON to CSV/XLSX/SQLITE/PARQUET converter. Flattens JSON fast.☆204Updated 6 months ago
- dagster scikit-learn pipeline example.☆46Updated 2 years ago
- Set up a Cost-Effective Modern Data Stack for a Charity☆19Updated 9 months ago
- ☆80Updated 2 years ago
- A playground for running duckdb as a stateless query engine over a data lake☆217Updated last year
- Demo Project for Open Source MDS☆169Updated 4 months ago
- An example of a Dagster project with a possible folder structure to organize the assets, jobs, repositories, schedules, and ops. Also has…☆101Updated last year
- Repo for orienting dbt users to the Dagster asset framework☆56Updated 3 years ago
- Playground for using large language models in the Modern Data Stack for entity matching☆108Updated 2 years ago
- Possibly the fastest DataFrame-agnostic quality check library in town.☆233Updated 2 months ago
- Anomstack - Painless open source anomaly detection for your metrics 📈📉🚀☆107Updated last month
- Write python locally, execute SQL in your data warehouse☆269Updated 3 years ago
- Mapping of DWH database tables to business entities, attributes & metrics in Python, with automatic creation of flattened tables☆74Updated 2 years ago
- ☆82Updated 3 months ago
- A template repository with all the fundamentals needed to develop and deploy a Python data-processing routine for Prefect pipelines.☆20Updated 3 years ago
- Example Dagster Cloud code for the Hooli Data Engineering organization.☆20Updated 2 months ago
- Jupyter Cell / Line Magics for DuckDB☆55Updated 2 months ago
- DuckDB SQL Tools add DuckDB support to VSCode, and provide database schema and SQL query interfaces for the popular SQLTools extension, S…☆19Updated last year
- This repo contains information about DuckDB extensions found on GitHub. Refreshed daily☆107Updated this week
- A curated list of dagster code snippets for data engineers☆56Updated last year
- 📦 Serverless and local-first Open Data Platform☆304Updated 2 weeks ago
- dbd is a database prototyping tool that enables data analysts and engineers to quickly load and transform data in SQL databases.☆57Updated 3 years ago
- Cost Efficient Data Pipelines with DuckDB☆60Updated 7 months ago
- A simple and easy to use Data Quality (DQ) tool built with Python.☆51Updated 2 years ago
- Tutorials for Fugue - A unified interface for distributed computing. Fugue executes SQL, Python, and Pandas code on Spark and Dask withou…☆114Updated last month
- ☆158Updated last month
- A curated collection of helpful SQL queries and functions, maintained by Count.☆208Updated 4 years ago
- Data-aware orchestration with dagster, dbt, and airbyte☆31Updated 2 years ago
- Sample configuration to deploy a modern data platform.☆89Updated 4 years ago
- Flatten/Explode JSON objects☆21Updated 7 months ago