spotify / scio
A Scala API for Apache Beam and Google Cloud Dataflow.
☆2,611 · Updated 2 weeks ago
Alternatives and similar repositories for scio
Users who are interested in scio are comparing it to the libraries listed below.
- A Scala API for Cascading ☆3,523 · Updated 2 years ago
- Abstract Algebra for Scala ☆2,297 · Updated 2 months ago
- Base classes to use when writing tests with Spark ☆1,545 · Updated last month
- Streaming MapReduce with Scalding and Storm ☆2,129 · Updated 3 years ago
- A Scala feature transformation library for data science and machine learning ☆469 · Updated 8 months ago
- Apache Spark to Apache Cassandra connector ☆1,944 · Updated 5 months ago
- KillrWeather is a reference application (work in progress) showing how to easily integrate streaming and batch data processing with Apach… ☆1,182 · Updated 8 years ago
- A tool for data sampling, data generation, and data diffing ☆344 · Updated 5 months ago
- Fast, testable, Scala services built on TwitterServer and Finagle ☆2,270 · Updated 2 months ago
- Expressive types for Spark. ☆889 · Updated last week
- A Scala kernel for Jupyter ☆1,620 · Updated last week
- Protocol buffer compiler for Scala. ☆1,328 · Updated this week
- Distributed Prometheus time series database ☆1,456 · Updated last week
- Apache Spark testing helpers (dependency free & works with Scalatest, uTest, and MUnit) ☆452 · Updated 2 months ago
- Essential Spark extensions and helper methods ✨😲 ☆764 · Updated last month
- Yet another JSON library for Scala ☆2,521 · Updated this week
- A free tutorial for Apache Spark. ☆992 · Updated 4 years ago
- JSON library ☆1,487 · Updated this week
- Alpakka Kafka connector - Alpakka is a Reactive Enterprise Integration library for Java and Scala, based on Reactive Streams and Akka. ☆1,419 · Updated this week
- Code snippets for solving common big data problems in various platforms. Inspired by Rosetta Code ☆296 · Updated 8 months ago
- Command line options parsing for Scala ☆1,445 · Updated last month
- Run in all nodes of your cluster before the cluster starts - lets you customize your cluster ☆598 · Updated last week
- Breeze is/was a numerical processing library for Scala. ☆3,455 · Updated 2 weeks ago
- "The path to execution", Styx is a service that schedules batch data processing jobs in Docker containers on Kubernetes. ☆268 · Updated 2 years ago
- The Internals of Apache Spark ☆1,521 · Updated 3 months ago
- 🔍 Elasticsearch Scala Client - Reactive, Non Blocking, Type Safe, HTTP Client ☆1,637 · Updated this week
- Alpakka is a Reactive Enterprise Integration library for Java and Scala, based on Reactive Streams and Akka. ☆1,268 · Updated this week
- Wonderful reusable code from Twitter ☆2,721 · Updated last month
- MLeap: Deploy ML Pipelines to Production ☆1,527 · Updated 10 months ago
- Avro schema generation and serialization / deserialization for Scala ☆727 · Updated this week