theShadow89 / nifi-bigquery-bundle
BigQuery bundle for Apache NiFi
☆15 Updated 6 years ago
Alternatives and similar repositories for nifi-bigquery-bundle
Users who are interested in nifi-bigquery-bundle are comparing it to the libraries listed below.
- Circus Train is a dataset replication tool that copies Hive tables between clusters and clouds. ☆91 Updated last year
- ☆69 Updated 4 years ago
- SQL data model for working with Snowplow web data. Supports Redshift and Looker. Snowflake and BigQuery coming soon ☆60 Updated 5 years ago
- Data ingestion library for Amundsen to build graph and search index ☆204 Updated last year
- An example Apache Beam project. ☆111 Updated 8 years ago
- Spark package for checking data quality ☆222 Updated 5 years ago
- Loads Snowplow enriched events into Google BigQuery ☆23 Updated 2 weeks ago
- Kinesis Connector for Structured Streaming ☆138 Updated last year
- Snowflake Data Source for Apache Spark. ☆230 Updated 3 weeks ago
- Spark connector for SFTP ☆98 Updated 2 years ago
- Front-end service library for Amundsen ☆278 Updated this week
- Examples for High Performance Spark ☆16 Updated 3 months ago
- File compaction tool that runs on top of the Spark framework. ☆59 Updated 6 years ago
- Google BigQuery support for Spark, SQL, and DataFrames ☆156 Updated 6 years ago
- Hive SerDe for CSV ☆140 Updated 4 years ago
- DEPRECATED. PLEASE USE https://github.com/confluentinc/kafka-connect-bigquery. A Kafka Connect BigQuery sink connector ☆152 Updated last year
- How to manage Slowly Changing Dimensions with Apache Hive ☆55 Updated 6 years ago
- A rough prototype of a tool for discovering Apache Hive schemas from JSON documents. ☆42 Updated 2 years ago
- Multiple node presto cluster on docker container ☆126 Updated 3 years ago
- Collection of open-source Spark tools & frameworks that have made the data engineering and data science teams at Swoop highly productive ☆186 Updated 3 months ago
- Spark data source for Salesforce ☆81 Updated last year
- An Apache access log parser written in Scala ☆73 Updated 4 years ago
- Databricks Migration Tools ☆43 Updated 4 years ago
- HDF masterclass materials ☆29 Updated 9 years ago
- A simplified, lightweight ETL Framework based on Apache Spark ☆589 Updated 2 years ago
- A Spark WordCountJob example as a standalone SBT project with Specs2 tests, runnable on Amazon EMR ☆120 Updated 9 years ago
- PySpark for ETL jobs including lineage to Apache Atlas in one script via code inspection ☆18 Updated 9 years ago
- Herd is a managed data lake for the cloud. The Herd unified data catalog helps separate storage from compute in the cloud. Manage petabyt… ☆138 Updated 3 years ago
- Build configuration-driven ETL pipelines on Apache Spark ☆161 Updated 3 years ago
- JSON schema parser for Apache Spark ☆82 Updated 3 years ago