Sqooba / snorkel
Snorkel - Bootstrap your Data Science
☆23Updated 6 years ago
Related projects:
- Utilities and examples to assist in working with PySpark and Cassandra.☆36Updated 9 years ago
- machine learning playground☆12Updated 7 years ago
- Schedoscope is a scheduling framework for painfree agile development, testing, (re)loading, and monitoring of your datahub, lake, or what…☆96Updated 4 years ago
- functionstest☆33Updated 7 years ago
- Sample custom NiFi processor to process tcpdump☆18Updated 8 years ago
- Machine Learning Pipeline Stages for Spark (exposed in Scala/Java + Python)☆74Updated 10 months ago
- Open source analytics platform powered by Apache Cassandra, Spark, and Kafka☆34Updated 9 years ago
- Apache Spark under Docker☆9Updated 8 years ago
- ☆13Updated last year
- Automates Spark standalone cluster tasks with Puppet and Fabric.☆43Updated 10 years ago
- Groovy client library for Apache Ambari's REST API☆20Updated 3 years ago
- Sandbox for Apache NiFi☆24Updated 2 years ago
- Apache NiFi Custom Processor for working with Stanford CoreNLP for Sentiment Analysis in Java 8☆11Updated 6 years ago
- Apache NiFi NLP Processor☆18Updated 10 months ago
- Schema registry for CSV, TSV, JSON, AVRO and Parquet schema. Supports schema inference and GraphQL API.☆111Updated 4 years ago
- Lighthouse is a library for data lakes built on top of Apache Spark. It provides high-level APIs in Scala to streamline data pipelines an…☆60Updated 2 weeks ago
- A Cascading Workflow Visualizer☆83Updated last year
- NiFi provenance reporting tasks☆14Updated last year
- An Exploration into Graph Databases☆28Updated 8 years ago
- Docker Compose files to compose a NiFi cluster on Docker.☆35Updated 7 years ago
- Complete Pipeline Training at Big Data Scala By the Bay☆71Updated 8 years ago
- Data Science Research Architecture, Data Center OS☆21Updated 8 years ago
- Reproducing Distributed Systems and Experiments on Cloud☆39Updated last year
- Docker image for apache zeppelin☆38Updated 7 years ago
- Apache NiFi Custom Processor Extracting Text From Files with Apache Tika☆34Updated last year
- An analysis of adverse drug event data using Hadoop, R, and Gephi☆44Updated 8 years ago
- something to help you spark☆65Updated 5 years ago
- This project contains the code to translate between Apache Spark and SFrame.☆21Updated 8 years ago
- Telco traffic simulator built with Scala, Akka and Play☆14Updated last year