VorTECHsa / refinery
Refinery is a tool to extract and transform semi-structured data from Excel spreadsheets of different layouts in a declarative way.
☆54Updated 5 months ago
Alternatives and similar repositories for refinery
Users who are interested in refinery are comparing it to the libraries listed below
- Infinitic is an open source orchestration framework for application teams to build durable and flexible backend processes.☆356Updated 3 months ago
- JSON Schema to Avro Mapper☆28Updated last year
- Data pipelines from re-usable components☆107Updated 2 months ago
- Apache Liminal's goal is to operationalise the machine learning process, allowing data scientists to quickly transition from a successful …☆144Updated last year
- A Python framework for data processing on GCP.☆120Updated 10 months ago
- sgr (command line client for Splitgraph) and the splitgraph Python library☆323Updated last year
- Rayvens makes it possible for data scientists to access hundreds of data services within Ray with little effort.☆50Updated 3 years ago
- DBND is an agile pipeline framework that helps data engineering teams track and orchestrate their data processes.☆268Updated 10 months ago
- Personal finance project to automatically collect Swiss banking transactions into a DWH and visualise them☆25Updated last year
- Delta reader for the Ray open-source toolkit for building ML applications☆45Updated 2 years ago
- ThirdEye is an integrated tool for realtime monitoring of time series and interactive root-cause analysis.☆110Updated 9 months ago
- Viewflow is an Airflow-based framework that allows data scientists to create data models without writing Airflow code.☆127Updated 4 years ago
- ThirdEye is an integrated tool for realtime monitoring of time series and interactive root-cause analysis. It enables anyone inside an or…☆94Updated 3 years ago
- ODD Specification is a universal open standard for collecting metadata.☆146Updated last year
- Deephaven Community Core☆337Updated this week
- The home of the Prefect 1 UI☆183Updated 5 months ago
- Official repo for the Materialize + Redpanda + dbt Hack Day 2022, including a sample project to get everyone started!☆60Updated 3 years ago
- Arc is an opinionated framework for defining data pipelines which are predictable, repeatable and manageable.☆171Updated 2 years ago
- DB API 2 interface for Flight SQL with SQLAlchemy extras.☆43Updated 4 months ago
- Build and deploy a serverless data pipeline on AWS with no effort.☆110Updated 3 years ago
- Aiven's S3 Sink Connector for Apache Kafka®☆71Updated last year
- Dremio Flight connector. Access Dremio using Arrow flight☆39Updated 5 years ago
- Flowman is an ETL framework powered by Apache Spark. With its declarative approach, Flowman simplifies the development of complex data pi…☆97Updated this week
- A curated list to help you manage temporal data across many modalities 🚀.☆118Updated 3 years ago
- Transporter for integrating OpenLineage with OpenMetadata☆15Updated 5 months ago
- Open source data observability platform☆329Updated 3 years ago
- Metadata tracking and UI service for Metaflow!☆218Updated last month
- Easily sync your Postgres database to a Snowflake, ClickHouse, or DuckDB warehouse.☆84Updated last year
- Amundsen Gremlin☆22Updated 3 years ago
- 🚕 A spreadsheet-like data preparation web app that works over Optimus (Pandas, Dask, cuDF, Dask-cuDF, Spark and Vaex)☆141Updated 2 years ago