blackrock / xml_to_parquet
Convert one or more XML files into Apache Parquet format. Only requires an XSD and an XML file to get started.
☆37Updated 2 years ago
Alternatives and similar repositories for xml_to_parquet
Users who are interested in xml_to_parquet are comparing it to the libraries listed below.
- Data Catalog for Databases and Data Warehouses☆35Updated last year
- A monorepo of many Rill example projects☆43Updated this week
- A tool to read CSV files with CSVW metadata and transform them into other formats.☆33Updated 6 years ago
- ☆34Updated 2 years ago
- Convert a CSV to a parquet file.☆64Updated 2 years ago
- This repo contains information about DuckDB extensions found on GitHub. Refreshed daily☆102Updated this week
- A small Python module containing quick utility functions for standard ETL processes.☆36Updated last month
- Personal finance project to automatically collect Swiss banking transactions into a DWH and visualise them☆26Updated last year
- ☆80Updated 2 years ago
- 🚕 A spreadsheet-like data preparation web app that works over Optimus (Pandas, Dask, cuDF, Dask-cuDF, Spark and Vaex)☆141Updated 2 years ago
- Repo demonstrating a Dagster pipeline to generate a Neo4j graph☆22Updated 4 years ago
- Python+VueJS application to load, explore, combine, transform and deliver data☆97Updated 7 months ago
- ☆90Updated last year
- CLI to create an ER Diagram from DuckDB database files☆135Updated 7 months ago
- An experimental Athena extension for DuckDB 🐤☆57Updated 9 months ago
- A maximum-strength name parser for record linkage.☆38Updated last month
- A Python package to create a database on the platform using our moj data warehousing framework☆22Updated 3 months ago
- This project is a wrapper for Leilex, a legal entity identifier (LEI) API. Includes ISIN-LEI conversion. Search for an LEI number using a company name.☆24Updated last year
- Curated list of awesome software and resources for Senzing, The First Real-Time AI for Entity Resolution.☆63Updated last week
- A software engineering framework to jump start your machine learning projects☆37Updated last year
- Typed, annotated vectors for well-documented datasets☆11Updated 10 months ago
- ☆148Updated 6 months ago
- Generating Realistic Synthetic Data☆40Updated last year
- Notebooks which will provide a demo of Qgrid functionality☆20Updated 5 years ago
- Convenient pyarrow operations following the Pandas API☆45Updated 3 years ago
- A JupyterLab extension providing SQL formatter, auto-completion, syntax highlighting, Spark SQL and Trino☆91Updated this week
- Data pipelines from re-usable components☆107Updated 2 years ago
- PyPi module for Graphlet AI Knowledge Graph Factory☆30Updated 2 years ago
- The sane way of building a data layer in Airflow☆24Updated 5 years ago
- DuckDB Community Extension to prompt LLMs from SQL☆51Updated 3 weeks ago