blackrock / xml_to_parquet
Convert one or more XML files into Apache Parquet format. Only requires an XSD and an XML file to get started.
☆33 Updated 2 years ago
Alternatives and similar repositories for xml_to_parquet:
Users who are interested in xml_to_parquet are comparing it to the libraries listed below.
- A small Python module containing quick utility functions for standard ETL processes. ☆34 Updated last week
- List of entity resolution software and resources. ☆50 Updated 10 months ago
- A Python library to generate static data catalog sites. Carte scrapes metadata from your data assets and generates a fully searchable fro… ☆27 Updated 2 years ago
- Dagster scikit-learn pipeline example. ☆44 Updated last year
- @vega transforms with @ibis-project expressions ☆29 Updated 3 years ago
- A tool to read CSV files with CSVW metadata and transform them into other formats. ☆32 Updated 5 years ago
- Personal finance project to automatically collect Swiss banking transactions into a DWH and visualise them ☆26 Updated 10 months ago
- An experimental Athena extension for DuckDB 🐤 ☆51 Updated 3 weeks ago
- Repo demonstrating a Dagster pipeline to generate a Neo4j graph ☆21 Updated 3 years ago
- Modeling tool like DBT to use SQL Alchemy core with a DataFrame interface like ☆11 Updated last year
- API Framework heavily relying on the power of DuckDB and DuckDB extensions. Ready to build performant and cost-efficient APIs on top of B… ☆22 Updated this week
- A monorepo of many Rill example projects ☆32 Updated this week
- Notebooks which will provide a demo of Qgrid functionality ☆20 Updated 5 years ago
- A Python package to create a database on the platform using our moj data warehousing framework ☆21 Updated 4 months ago
- An interactive grid for sorting, filtering, and editing DataFrames in Jupyter notebooks ☆21 Updated 2 years ago
- A secure way of storing credentials within JupyterLab ☆21 Updated 4 years ago
- Data Catalog for Databases and Data Warehouses ☆31 Updated last year
- Ibis analytics, with Ibis (and more!) ☆20 Updated 3 months ago
- A browser user interface for manual labeling of record pairs. ☆42 Updated last year
- A Jupyter kernel for ClickHouse ☆24 Updated 4 years ago
- Dremio Flight connector. Access Dremio using Arrow Flight ☆40 Updated 4 years ago
- pyjstat is a python library for JSON-stat formatted data manipulation which allows reading and writing JSON-stat [1] format with python,u… ☆30 Updated last year
- CLI for creating databases for Data Quality Dashboards. ☆19 Updated 5 years ago
- Graph Engine for Exploration and Search ☆40 Updated 11 months ago
- ERPL is a DuckDB extension to integrate Enterprise Data in your Data Science and ML pipelines within minutes! ERPL connects DuckDB to SAP… ☆32 Updated 6 months ago
- Using the Parquet file format with Python ☆15 Updated last year
- Neural Solr = Solr 9 + Mighty Inference + Node ☆16 Updated 2 years ago
- Glue is an enterprise data model for the buy side, tailored for Wealth and Asset Managers and covering key entities such as Party, Busine… ☆21 Updated last year
- Benchmark study on KùzuDB, an embedded OLAP graph database, on an artificial social network dataset ☆32 Updated last month
- The open-source Useful SDK. One Python decorator in the Useful library allows for full observability of Python functions within an ETL. ☆20 Updated last year