blackrock / xml_to_parquet
Convert one or more XML files into Apache Parquet format. Only requires an XSD and an XML file to get started.
☆33Updated last year
Related projects
Alternatives and complementary repositories for xml_to_parquet
- Personal finance project to automatically collect Swiss banking transactions into a DWH and visualise them☆26Updated 8 months ago
- Dremio Flight connector. Access Dremio using Arrow flight☆40Updated 3 years ago
- Data Catalog for Databases and Data Warehouses☆31Updated 10 months ago
- List of entity resolution software and resources.☆38Updated 8 months ago
- A Python package to create a database on the platform using our moj data warehousing framework☆21Updated 2 months ago
- A Python library bakeoff for medium-sized datasets☆24Updated last year
- Sord Data Fabric: A Vue 3 frontend with a Python WebSocket server, leveraging a distributed architecture with DeltaLake and DuckDB worker…☆18Updated 11 months ago
- A browser user interface for manual labeling of record pairs.☆41Updated last year
- ERPL is a DuckDB extension to integrate Enterprise Data in your Data Science and ML pipelines within minutes! ERPL connects DuckDB to SAP…☆30Updated 4 months ago
- Interactive notebooks containing demonstration code of the splink library☆37Updated 10 months ago
- Assessing whether data from database complies with reference information.☆42Updated this week
- A monorepo of many Rill example projects☆31Updated last week
- Playing with Python Bluesky SDK☆13Updated this week
- data wrangling simplicity, complete audit transparency, and at speed☆35Updated 2 months ago
- Graph Engine for Exploration and Search☆40Updated 9 months ago
- A small Python module containing quick utility functions for standard ETL processes.☆33Updated last week
- DuckDB for streaming data☆69Updated 7 months ago
- Modeling tool like dbt to use SQLAlchemy Core with a DataFrame interface like☆11Updated last year
- ☆42Updated 3 weeks ago
- A serverless DuckDB deployment on GCP☆35Updated 2 years ago
- An experimental Athena extension for DuckDB 🐤☆50Updated 9 months ago
- A template for an AWS Lambda function that triggers Prefect Flow Runs☆20Updated 3 years ago
- The sane way of building a data layer in Airflow☆24Updated 4 years ago
- Dask integration for Snowflake☆30Updated last week
- A maximum-strength name parser for record linkage.☆34Updated 3 months ago
- The JUpyter-GRemlin Interface☆35Updated 5 years ago
- Demo converting Streamlit Uber NYC rides to use DuckDB☆29Updated last year
- Apache Arrow PostgreSQL connector☆54Updated 9 months ago
- ☆80Updated last year