Google BigQuery support for Spark, SQL, and DataFrames
☆156Dec 14, 2019Updated 6 years ago
Alternatives and similar repositories for spark-bigquery
Users who are interested in spark-bigquery are comparing it to the libraries listed below.
- Google BigQuery support for Spark, Structured Streaming, SQL, and DataFrames with easy Databricks integration.☆70May 8, 2023Updated 3 years ago
- An example of integrating Spark Streaming with Google Pub/Sub and Google Datastore☆16Mar 22, 2017Updated 9 years ago
- A handy Scala wrapper of Google BigQuery API's Java Client Library.☆34Sep 29, 2018Updated 7 years ago
- Hive Storage Handler for interoperability between BigQuery and Apache Hive☆19Jan 29, 2025Updated last year
- Libraries and tools for interoperability between Hadoop-related open-source software and Google Cloud Platform.☆292Jun 24, 2026Updated last week
- A collection of Apache Parquet add-on modules☆30Jun 14, 2026Updated 2 weeks ago
- Spark Extension: ML transformers, SQL aggregations, etc. that are missing in Apache Spark☆145Jan 26, 2016Updated 10 years ago
- Ephemeral Hadoop clusters using Google Compute Platform☆136Mar 31, 2022Updated 4 years ago
- A Scala API for Apache Beam and Google Cloud Dataflow.☆2,627Jun 25, 2026Updated last week
- Google BigQuery data source for Apache Spark☆17Oct 1, 2024Updated last year
- A tool for data sampling, data generation, and data diffing☆349Mar 31, 2026Updated 3 months ago
- Shaded version of Apache Hadoop 2.x for Presto☆16Sep 16, 2025Updated 9 months ago
- Runs JVM closures in Docker containers on Kubernetes☆38Mar 23, 2018Updated 8 years ago
- Machine learning evaluation database☆24Feb 7, 2018Updated 8 years ago
- BigQuery data source for Apache Spark: Read data from BigQuery into DataFrames, write DataFrames into BigQuery tables.☆423Jun 25, 2026Updated last week
- Base classes to use when writing tests with Spark☆1,551Apr 20, 2026Updated 2 months ago
- Interactive tools and developer experiences for Big Data on Google Cloud Platform.☆973Sep 2, 2022Updated 3 years ago
- ☆54Aug 3, 2017Updated 8 years ago
- Run in all nodes of your cluster before the cluster starts - lets you customize your cluster☆598Jun 17, 2026Updated 2 weeks ago
- Repository with examples and smoke tests for the GCP Airflow operators and hooks☆152Jan 15, 2017Updated 9 years ago
- Fluent Scala DSL for Google's Cloud Dataflow SDK☆56Aug 2, 2015Updated 10 years ago
- Metrics collection library for Google Dataflow☆13Nov 7, 2018Updated 7 years ago
- DEPRECATED. PLEASE USE https://github.com/confluentinc/kafka-connect-bigquery. A Kafka Connect BigQuery sink connector☆151Mar 4, 2024Updated 2 years ago
- A Scala feature transformation library for data science and machine learning☆475Feb 7, 2025Updated last year
- AMQP data source for dstream (Spark Streaming)☆26Mar 31, 2022Updated 4 years ago
- Compile-time tools for working with Avros in Scala☆55Dec 10, 2017Updated 8 years ago
- A connector for SingleStore and Spark☆164Jun 4, 2026Updated last month
- ☆14May 27, 2022Updated 4 years ago
- Scio IDEA plugin☆30Oct 2, 2025Updated 9 months ago
- Simplifying robust end-to-end machine learning on Apache Spark.☆473Apr 18, 2017Updated 9 years ago
- ☆84Jan 26, 2026Updated 5 months ago
- A tool for moving tables from Redshift to BigQuery☆65Jan 20, 2019Updated 7 years ago
- Open source tools for Google Cloud Storage and Databases.☆65May 1, 2024Updated 2 years ago
- Spark pipelines that correspond to a series of Dataflow examples.☆27May 5, 2019Updated 7 years ago
- Interactive Audience Analytics with Spark and HyperLogLog☆55Oct 14, 2015Updated 10 years ago
- GCS support for avro-tools, parquet-tools and protobuf☆79May 5, 2025Updated last year
- ☆14Oct 18, 2020Updated 5 years ago
- The code for the in memory data pipeline that was presented at Berlin Buzzwords 2015.☆10Jun 1, 2015Updated 11 years ago
- Opinion Analysis of News, Threaded Conversations, and User Generated Content☆110Sep 19, 2024Updated last year