Google BigQuery support for Spark, SQL, and DataFrames
☆156 Dec 14, 2019 Updated 6 years ago
Alternatives and similar repositories for spark-bigquery
Users that are interested in spark-bigquery are comparing it to the libraries listed below.
- Google BigQuery support for Spark, Structured Streaming, SQL, and DataFrames with easy Databricks integration. ☆70 May 8, 2023 Updated 2 years ago
- An example of integrating Spark Streaming with Google Pub/Sub and Google Datastore ☆16 Mar 22, 2017 Updated 9 years ago
- Hive Storage Handler for interoperability between BigQuery and Apache Hive ☆19 Jan 29, 2025 Updated last year
- Libraries and tools for interoperability between Hadoop-related open-source software and Google Cloud Platform. ☆289 Updated this week
- A collection of Apache Parquet add-on modules ☆30 Apr 15, 2026 Updated 2 weeks ago
- Spark Extension: ML transformers, SQL aggregations, etc. that are missing in Apache Spark ☆146 Jan 26, 2016 Updated 10 years ago
- Ephemeral Hadoop clusters using Google Compute Platform ☆136 Mar 31, 2022 Updated 4 years ago
- A Scala API for Apache Beam and Google Cloud Dataflow. ☆2,623 Apr 13, 2026 Updated 2 weeks ago
- A tool for data sampling, data generation, and data diffing ☆346 Mar 31, 2026 Updated last month
- Shaded version of Apache Hadoop 2.x for Presto ☆16 Sep 16, 2025 Updated 7 months ago
- Runs JVM closures in Docker containers on Kubernetes ☆36 Mar 23, 2018 Updated 8 years ago
- Machine learning evaluation database ☆24 Feb 7, 2018 Updated 8 years ago
- BigQuery data source for Apache Spark: Read data from BigQuery into DataFrames, write DataFrames into BigQuery tables. ☆422 Updated this week
- Base classes to use when writing tests with Spark ☆1,553 Apr 20, 2026 Updated last week
- Interactive tools and developer experiences for Big Data on Google Cloud Platform. ☆970 Sep 2, 2022 Updated 3 years ago
- ☆54 Aug 3, 2017 Updated 8 years ago
- Run in all nodes of your cluster before the cluster starts - lets you customize your cluster ☆597 Apr 24, 2026 Updated last week
- Repository with examples and smoke tests for the GCP Airflow operators and hooks ☆152 Jan 15, 2017 Updated 9 years ago
- Metrics collection library for Google Dataflow ☆13 Nov 7, 2018 Updated 7 years ago
- A Scala feature transformation library for data science and machine learning ☆474 Feb 7, 2025 Updated last year
- DEPRECATED. PLEASE USE https://github.com/confluentinc/kafka-connect-bigquery. A Kafka Connect BigQuery sink connector ☆152 Mar 4, 2024 Updated 2 years ago
- AMQP data source for dstream (Spark Streaming) ☆26 Mar 31, 2022 Updated 4 years ago
- A connector for SingleStore and Spark ☆162 Apr 17, 2026 Updated 2 weeks ago
- Scio IDEA plugin ☆30 Oct 2, 2025 Updated 7 months ago
- Simplifying robust end-to-end machine learning on Apache Spark. ☆475 Apr 18, 2017 Updated 9 years ago
- ☆84 Jan 26, 2026 Updated 3 months ago
- A tool for moving tables from Redshift to BigQuery ☆65 Jan 20, 2019 Updated 7 years ago
- Open source tools for Google Cloud Storage and Databases. ☆64 May 1, 2024 Updated 2 years ago
- Spark pipelines that correspond to a series of Dataflow examples. ☆27 May 5, 2019 Updated 6 years ago
- Interactive Audience Analytics with Spark and HyperLogLog ☆55 Oct 14, 2015 Updated 10 years ago
- [SUNSET] Async Google Pubsub Client ☆159 Mar 18, 2023 Updated 3 years ago
- The code for the in memory data pipeline that was presented at Berlin Buzzwords 2015. ☆10 Jun 1, 2015 Updated 10 years ago
- ☆14 Oct 18, 2020 Updated 5 years ago
- Scala bindings for Bokeh plotting library ☆138 Oct 11, 2023 Updated 2 years ago
- Luigi integration for Google BigQuery ☆15 Nov 18, 2015 Updated 10 years ago
- ☆10 Feb 7, 2023 Updated 3 years ago
- functionstest ☆33 Oct 25, 2016 Updated 9 years ago
- A Scala DSL for Dataflow ☆11 Dec 31, 2014 Updated 11 years ago
- Tool to convert & load data from edX platform into BigQuery ☆29 Dec 1, 2023 Updated 2 years ago