blackrock / xml_to_parquet
Convert one or more XML files into Apache Parquet format. Only requires an XSD and an XML file to get started.
☆32 Updated 2 years ago
Alternatives and similar repositories for xml_to_parquet:
Users who are interested in xml_to_parquet are comparing it to the libraries listed below:
- A Python library to generate static data catalog sites. Carte scrapes metadata from your data assets and generates a fully searchable fro… ☆27 Updated 2 years ago
- Data Catalog for Databases and Data Warehouses ☆32 Updated last year
- Personal finance project to automatically collect Swiss banking transactions into a DWH and visualise them ☆26 Updated 11 months ago
- A proposed standard `NOCK` for a Parquet format that supports efficient distributed serialization of multiple kinds of graph technologies ☆19 Updated 2 years ago
- A browser user interface for manual labeling of record pairs. ☆44 Updated last year
- Sord Data Fabric: A Vue 3 frontend with a Python WebSocket server, leveraging a distributed architecture with DeltaLake and DuckDB worker… ☆18 Updated last year
- List of entity resolution software and resources. ☆56 Updated 11 months ago
- Build your feature store with macros right within your dbt repository ☆38 Updated 2 years ago
- A small Python module containing quick utility functions for standard ETL processes. ☆34 Updated this week
- Graph Engine for Exploration and Search ☆40 Updated last year
- Benchmark study on Kùzu, an embedded OLAP graph database, on an artificial social network dataset ☆31 Updated 2 months ago
- KnowledgeRepo + JupyterLab ☆48 Updated 3 months ago
- The sane way of building a data layer in Airflow ☆24 Updated 5 years ago
- This repo contains information about DuckDB extensions found on GitHub. Refreshed daily ☆92 Updated this week
- A Python library bakeoff for medium-sized datasets ☆24 Updated last year
- ☆44 Updated 6 months ago
- A monorepo of many Rill example projects ☆34 Updated this week
- Repo demonstrating a Dagster pipeline to generate a Neo4j graph ☆21 Updated 3 years ago
- Playground for Neo4j Graph Algorithms ☆30 Updated last year
- This connector is a dbt project that maps Medicare CCLF claims data to the Tuva Input Layer. ☆13 Updated 2 months ago
- quadipy is a Python package to help transform structured data into RDF graph format ☆19 Updated last year
- @vega transforms with @ibis-project expressions ☆29 Updated 3 years ago
- Using the Parquet file format with Python ☆15 Updated last year
- Ibis analytics, with Ibis (and more!) ☆20 Updated 4 months ago
- Dremio Flight connector. Access Dremio using Arrow Flight ☆40 Updated 4 years ago
- ☆46 Updated last month
- A tool to read CSV files with CSVW metadata and transform them into other formats. ☆32 Updated 5 years ago
- A collection of Python utility functions ☆11 Updated 7 months ago
- pyjstat is a Python library for JSON-stat formatted data manipulation which allows reading and writing JSON-stat [1] format with python,u… ☆29 Updated last year