Sqooba / snorkel
Snorkel - Bootstrap your Data Science
☆24 Updated 7 years ago
Alternatives and similar repositories for snorkel:
Users who are interested in snorkel are comparing it to the libraries listed below.
- Schedoscope is a scheduling framework for painfree agile development, testing, (re)loading, and monitoring of your datahub, lake, or what… ☆95 Updated 5 years ago
- Examples of user defined functions for Apache Drill ☆18 Updated 7 years ago
- Sandbox for Apache nifi ☆24 Updated 3 years ago
- Apache Spark under Docker ☆9 Updated 8 years ago
- Machine Learning Pipeline Stages for Spark (exposed in Scala/Java + Python) ☆74 Updated last year
- Data pipeline automation tool ☆26 Updated last year
- InsightEdge Core ☆20 Updated last year
- Sample custom Nifi processor to process tcpdump ☆18 Updated 9 years ago
- ☆13 Updated last year
- Avro Schema Shredder is a REST API that enables storage of Avro Schemas in Apache Atlas. This API enables an organization to use Apache A… ☆13 Updated 8 years ago
- Scala port of the word2vec toolkit. ☆11 Updated 8 years ago
- A collection of datasets and databases ☆24 Updated 6 years ago
- phData Pulse application log aggregation and monitoring ☆13 Updated 4 years ago
- Utilities and examples to assist in working with PySpark and Cassandra. ☆36 Updated 10 years ago
- Vagrant, Apache Spark and Apache Zeppelin VM for teaching ☆44 Updated 7 years ago
- Automates Spark standalone cluster tasks with Puppet and Fabric. ☆43 Updated 10 years ago
- Common components used across the datamountaineer kafka connect connectors ☆21 Updated 4 years ago
- Quickly analyze and explore email with advanced analytics and visualization. ☆56 Updated 3 years ago
- Provides a Pythonic interface for reading and writing Avro schemas ☆27 Updated 2 years ago
- A single docker image that combines Neo4j Mazerunner and Apache Spark GraphX into a powerful all-in-one graph processing engine ☆46 Updated 5 years ago
- Burglary prediction for mortals ☆10 Updated 10 months ago
- Docker image for apache zeppelin ☆38 Updated 7 years ago
- Provided Guidance on Creating End to End Solutions for Common SILK Use Cases ☆13 Updated 9 years ago
- A library to store metadata of relational databases including the schema, statistics, and integrity constraints. ☆25 Updated 6 years ago
- ☆15 Updated 7 years ago
- Library for building reproducible data pipelines to support experimentation ☆20 Updated 9 years ago
- functionstest ☆33 Updated 8 years ago
- Complete Pipeline Training at Big Data Scala By the Bay ☆71 Updated 9 years ago
- Starter project for building MemSQL Streamliner Pipelines ☆32 Updated 7 years ago
- Apache NiFi Custom Processor for working with Stanford CoreNLP for Sentiment Analysis in Java 8 ☆11 Updated 6 years ago