Google BigQuery support for Spark, SQL, and DataFrames
☆155 · Dec 14, 2019 · Updated 6 years ago
Alternatives and similar repositories for spark-bigquery
Users that are interested in spark-bigquery are comparing it to the libraries listed below.
- Google BigQuery support for Spark, Structured Streaming, SQL, and DataFrames with easy Databricks integration. ☆70 · May 8, 2023 · Updated 2 years ago
- An example of integrating Spark Streaming with Google Pub/Sub and Google Datastore ☆17 · Mar 22, 2017 · Updated 9 years ago
- A handy Scala wrapper of Google BigQuery API's Java Client Library. ☆34 · Sep 29, 2018 · Updated 7 years ago
- Hive Storage Handler for interoperability between BigQuery and Apache Hive ☆19 · Jan 29, 2025 · Updated last year
- Libraries and tools for interoperability between Hadoop-related open-source software and Google Cloud Platform. ☆289 · Updated this week
- A collection of Apache Parquet add-on modules ☆30 · Mar 3, 2026 · Updated 2 weeks ago
- Spark Extension: ML transformers, SQL aggregations, etc. that are missing in Apache Spark ☆146 · Jan 26, 2016 · Updated 10 years ago
- Ephemeral Hadoop clusters using Google Compute Platform ☆135 · Mar 31, 2022 · Updated 3 years ago
- A Scala API for Apache Beam and Google Cloud Dataflow. ☆2,620 · Updated this week
- Google BigQuery data source for Apache Spark ☆17 · Oct 1, 2024 · Updated last year
- Runs JVM closures in Docker containers on Kubernetes ☆36 · Mar 23, 2018 · Updated 8 years ago
- Machine learning evaluation database ☆24 · Feb 7, 2018 · Updated 8 years ago
- BigQuery data source for Apache Spark: Read data from BigQuery into DataFrames, write DataFrames into BigQuery tables. ☆422 · Mar 6, 2026 · Updated 2 weeks ago
- Base classes to use when writing tests with Spark ☆1,549 · Dec 22, 2025 · Updated 3 months ago
- Interactive tools and developer experiences for Big Data on Google Cloud Platform. ☆969 · Sep 2, 2022 · Updated 3 years ago
- ☆54 · Aug 3, 2017 · Updated 8 years ago
- Run in all nodes of your cluster before the cluster starts - lets you customize your cluster ☆600 · Mar 6, 2026 · Updated 2 weeks ago
- Fluent Scala DSL for Google's Cloud Dataflow SDK ☆56 · Aug 2, 2015 · Updated 10 years ago
- Metrics collection library for Google Dataflow ☆13 · Nov 7, 2018 · Updated 7 years ago
- A Scala feature transformation library for data science and machine learning ☆473 · Feb 7, 2025 · Updated last year
- DEPRECATED. PLEASE USE https://github.com/confluentinc/kafka-connect-bigquery. A Kafka Connect BigQuery sink connector ☆151 · Mar 4, 2024 · Updated 2 years ago
- AMQP data source for dstream (Spark Streaming) ☆26 · Mar 31, 2022 · Updated 3 years ago
- Compile-time tools for working with Avros in Scala ☆55 · Dec 10, 2017 · Updated 8 years ago
- A connector for SingleStore and Spark ☆162 · Sep 24, 2025 · Updated 5 months ago
- ☆14 · May 27, 2022 · Updated 3 years ago
- Scio IDEA plugin ☆30 · Oct 2, 2025 · Updated 5 months ago
- Simplifying robust end-to-end machine learning on Apache Spark. ☆475 · Apr 18, 2017 · Updated 8 years ago
- ☆85 · Jan 26, 2026 · Updated last month
- Open source tools for Google Cloud Storage and Databases. ☆63 · May 1, 2024 · Updated last year
- A tool for moving tables from Redshift to BigQuery ☆65 · Jan 20, 2019 · Updated 7 years ago
- Spark pipelines that correspond to a series of Dataflow examples. ☆27 · May 5, 2019 · Updated 6 years ago
- Interactive Audience Analytics with Spark and HyperLogLog ☆55 · Oct 14, 2015 · Updated 10 years ago
- GCS support for avro-tools, parquet-tools and protobuf ☆79 · May 5, 2025 · Updated 10 months ago
- [SUNSET] Async Google Pubsub Client ☆158 · Mar 18, 2023 · Updated 3 years ago
- Opinion Analysis of News, Threaded Conversations, and User Generated Content ☆108 · Sep 19, 2024 · Updated last year
- ☆14 · Oct 18, 2020 · Updated 5 years ago
- Scala bindings for Bokeh plotting library ☆138 · Oct 11, 2023 · Updated 2 years ago
- Luigi integration for Google BigQuery ☆15 · Nov 18, 2015 · Updated 10 years ago
- ☆10 · Feb 7, 2023 · Updated 3 years ago