tresata/spark-scalding

Use Cascading Taps and Scalding DSL with Spark

☆49

Alternatives and similar repositories for spark-scalding

Users that are interested in spark-scalding are comparing it to the libraries listed below.

Verizon / mutatis
Kafka consumer & producer for scalaz-stream
☆12 · Dec 15, 2017 · Updated 8 years ago
bmc / argot
A command-line parser for Scala
☆65 · Nov 15, 2019 · Updated 6 years ago
Cascading / cascading-hive
Integration for Cascading and Apache Hive
☆25 · Oct 31, 2017 · Updated 8 years ago
calrissian / spark-jetty-server
Recipes and examples for Apache Spark
☆13 · Jan 21, 2015 · Updated 11 years ago
memsql / streamliner-starter
Starter project for building MemSQL Streamliner Pipelines
☆32 · Apr 18, 2017 · Updated 9 years ago
ParallelAI / SpyGlass
Cascading and Scalding wrapper for HBase with advanced read features
☆54 · Updated this week
scalding-io / social-media-analytics
Social Media Data Mining and Analytics - HyperLogLog, BloomFilter and CountMinSketch with Scalding & Algebird
☆27 · Oct 6, 2018 · Updated 7 years ago
anilmuppalla / hpdc-scalding-spark
Code for Springer Book: High Performance Distributed Computing: Case Studies with Hadoop, Scalding and Spark
☆15 · Oct 6, 2017 · Updated 8 years ago
tresata / spark-columnar
☆15 · Mar 4, 2015 · Updated 11 years ago
scalding-io / ProgrammingWithScalding
Programming MapReduce with Scalding
☆82 · Dec 5, 2015 · Updated 10 years ago
dataArtisans / cascading-flink
Cascading on Apache Flink®
☆54 · Feb 5, 2024 · Updated 2 years ago
davideanastasia / twitter-realtime-sentiment
Spark/Cassandra/Akka combo to visualize a cloud of words using d3.js
☆11 · Dec 6, 2015 · Updated 10 years ago
collectivemedia / modelmatrix
Sparse feature extraction with Spark
☆30 · Jul 25, 2018 · Updated 8 years ago
Cascading / fluid
A Fluent Java API for Cascading
☆22 · Jun 14, 2017 · Updated 9 years ago
fdietze / sabuni
Light Colorscheme for IntelliJ IDEA
☆14 · Jul 28, 2023 · Updated 3 years ago
karlhigley / lexrank-summarizer
A Spark-based LexRank extractive summarizer for text documents
☆19 · Dec 23, 2015 · Updated 10 years ago
elodina / zipkin-mesos-framework
Zipkin Mesos Framework
☆31 · Feb 24, 2016 · Updated 10 years ago
googlegenomics / spark-examples
Apache Spark jobs such as Principal Coordinate Analysis.
☆77 · Jan 30, 2017 · Updated 9 years ago
adobe-research / spindle
Next-generation web analytics processing with Scala, Spark, and Parquet.
☆330 · Mar 28, 2015 · Updated 11 years ago
tresata / ganitha
scalding powered machine learning
☆109 · Nov 18, 2014 · Updated 11 years ago
iheartradio / asobu
Asobu (遊ぶ) Library for building distributed REST APIs for microservices based on akka cluster and play
☆12 · Oct 20, 2016 · Updated 9 years ago
skrusche63 / spark-fsm
This project provides sequential pattern mining for Apache Spark. The algorithms are based on the work of Philippe Fournier-Viger and co…
☆29 · Mar 12, 2015 · Updated 11 years ago
film42 / forecast-io-scala
Forecast IO v2 api wrapper for Scala
☆12 · Nov 23, 2015 · Updated 10 years ago
geotrellis / geotrellis-spray-tutorial-deprecated
OLD VERSION OF GEOTRELLIS: A sample GIS service built using GeoTrellis and Spray
☆15 · Sep 30, 2016 · Updated 9 years ago
harvard-library / librarycloud
Harvard University Library Cloud API
☆11 · Feb 25, 2022 · Updated 4 years ago
ThinkBigAnalytics / scalding-workshop
A half-day workshop on Scalding, the Scala API for Cascading
☆48 · Mar 21, 2016 · Updated 10 years ago
massie / spark-parquet-example
Example project to show how to use Spark to read and write Avro/Parquet files
☆50 · Aug 21, 2013 · Updated 12 years ago
hammerlab / spark-util
low-level helpers for Apache Spark libraries and tests
☆16 · Dec 29, 2018 · Updated 7 years ago
ThoughtWorksInc / tryt.scala
Monad transformers for exception handling
☆17 · Aug 19, 2024 · Updated last year
openankus / ankus
Numeric / Nominal Statistics, Certainty Factor, Normalize, ETL, TF-IDF, Discretization on Hadoop MapReduce
☆11 · Jun 28, 2016 · Updated 10 years ago
pico-works / pico-event
Tiny publish subscribe library
☆15 · Mar 27, 2017 · Updated 9 years ago
tresata / spark-kafka
Low level integration of Spark and Kafka
☆129 · Mar 15, 2018 · Updated 8 years ago
cloudify / scalazon
Idiomatic, opinionated Scala library for AWS
☆30 · May 13, 2017 · Updated 9 years ago
rtyley / scala-git
small Scala veneer over JGit
☆21 · Sep 17, 2025 · Updated 10 months ago
VeritoneAlpha / jaws-spark-sql-rest
☆91 · Apr 17, 2017 · Updated 9 years ago
joshlemer / MultiIndex
A Scala Collection for Multiple Access Patterns
☆12 · Oct 22, 2016 · Updated 9 years ago
Cascading / lingual
Stand-alone ANSI SQL for Cascading on Apache Hadoop
☆48 · Jan 25, 2018 · Updated 8 years ago
cloudera-labs / SparkOnHBase
SparkOnHBase
☆278 · Mar 30, 2021 · Updated 5 years ago
big-data-research / in-memory-data-pipeline
The code for the in memory data pipeline that was presented at Berlin Buzzwords 2015.
☆10 · Jun 1, 2015 · Updated 11 years ago