GoogleCloudDataproc / hive-bigquery-storage-handler
Hive Storage Handler for interoperability between BigQuery and Apache Hive
☆19 Updated 4 months ago
Alternatives and similar repositories for hive-bigquery-storage-handler
Users who are interested in hive-bigquery-storage-handler are comparing it to the libraries listed below.
- Cloud Spanner Connector for Apache Spark ☆17 Updated 5 months ago
- Example Spark applications that run on Kubernetes and access GCP products, e.g., GCS, BigQuery, and Cloud PubSub ☆37 Updated 7 years ago
- Sample code with integration between Data Catalog and Hive data source. ☆24 Updated 4 months ago
- ☆81 Updated last year
- Google BigQuery support for Spark, Structured Streaming, SQL, and DataFrames with easy Databricks integration. ☆70 Updated 2 years ago
- Stream Avro SpecificRecord objects in BigQuery using Cloud Dataflow ☆13 Updated 3 years ago
- Ephemeral Hadoop clusters using Google Compute Platform ☆136 Updated 3 years ago
- Spark stream from kafka(json) to s3(parquet) ☆15 Updated 6 years ago
- Spark pipelines that correspond to a series of Dataflow examples. ☆27 Updated 6 years ago
- Schema registry for CSV, TSV, JSON, AVRO and Parquet schema. Supports schema inference and GraphQL API. ☆112 Updated 5 years ago
- Scalable CDC Pattern Implemented using PySpark ☆18 Updated 5 years ago
- ☆31 Updated 6 years ago
- DBeam exports SQL tables into Avro files using JDBC and Apache Beam ☆195 Updated this week
- Spark cloud integration: tests, cloud committers and more ☆19 Updated 4 months ago
- A Giter8 template for scio ☆31 Updated 4 months ago
- ☆54 Updated 7 years ago
- Google BigQuery support for Spark, SQL, and DataFrames ☆155 Updated 5 years ago
- Multi Cloud Data Tokenization Solution By Using Dataflow and Cloud DLP ☆92 Updated 10 months ago
- GCS support for avro-tools, parquet-tools and protobuf ☆75 Updated last month
- type-class based data cleansing library for Apache Spark SQL ☆78 Updated 6 years ago
- hive_compared_bq compares/validates 2 (SQL like) tables, and graphically shows the rows/columns that are different. ☆28 Updated 7 years ago
- A Spark-based data comparison tool at scale which facilitates software development engineers to compare a plethora of pair combinations o… ☆51 Updated last week
- Schema Registry integration for Apache Spark ☆40 Updated 2 years ago
- Build a real-time website analytics dashboard on GCP using Dataflow, Cloud Memorystore (Redis) and Spring Boot ☆29 Updated 3 months ago
- Bulletproof Apache Spark jobs with fast root cause analysis of failures. ☆72 Updated 4 years ago
- ☆26 Updated 5 years ago
- [DEPRECATED] GAE python based app which regularly collects information about GCP resources and stores them in BigQuery ☆45 Updated last year
- A collection of Google Cloud Platform (GCP) plugins ☆47 Updated this week
- Wrangler Transform: A DMD system for transforming Big Data ☆105 Updated 2 weeks ago
- Lighthouse is a library for data lakes built on top of Apache Spark. It provides high-level APIs in Scala to streamline data pipelines an… ☆61 Updated 9 months ago