Sqooba / snorkel
Snorkel - Bootstrap your Data Science
☆24 · Updated 7 years ago
Alternatives and similar repositories for snorkel
Users that are interested in snorkel are comparing it to the libraries listed below
- machine learning playground ☆12 · Updated 8 years ago
- Scala port of the word2vec toolkit. ☆11 · Updated 8 years ago
- Data pipeline automation tool ☆26 · Updated last year
- A Java library for stored queries ☆16 · Updated last year
- ☆13 · Updated last year
- Common components used across the datamountaineer Kafka Connect connectors ☆21 · Updated 4 years ago
- Sample custom NiFi processor to process tcpdump ☆18 · Updated 9 years ago
- Examples for Fast Data Processing with Spark ☆59 · Updated 11 years ago
- A collection of datasets and databases ☆24 · Updated 7 years ago
- Examples of user defined functions for Apache Drill ☆18 · Updated 7 years ago
- Fusion demo app searching open-source project data from the Apache Software Foundation ☆42 · Updated 6 years ago
- Simple way to copy data from relational databases into Kafka. ☆20 · Updated 7 years ago
- A Python tool to manage developing and testing with lots of microservices ☆59 · Updated last year
- Apache Spark under Docker ☆9 · Updated 9 years ago
- A Cascading Workflow Visualizer ☆83 · Updated 2 years ago
- Schema registry for CSV, TSV, JSON, AVRO and Parquet schema. Supports schema inference and GraphQL API. ☆111 · Updated 5 years ago
- Apache NiFi Custom Processor for working with Stanford CoreNLP for Sentiment Analysis in Java 8 ☆11 · Updated 6 years ago
- Schedoscope is a scheduling framework for painfree agile development, testing, (re)loading, and monitoring of your datahub, lake, or what… ☆96 · Updated 5 years ago
- An analysis of adverse drug event data using Hadoop, R, and Gephi ☆44 · Updated 9 years ago
- Twitter Streaming API Example with Kafka Streams in Scala ☆49 · Updated 8 years ago
- ☆23 · Updated 7 years ago
- Groovy client library for Apache Ambari's REST API ☆20 · Updated 3 years ago
- Dependency and data pipeline management framework for Spark and Scala ☆15 · Updated 8 years ago
- Sandbox for Apache NiFi ☆24 · Updated 3 years ago
- InsightEdge Core ☆20 · Updated last year
- Automates Spark standalone cluster tasks with Puppet and Fabric. ☆43 · Updated 10 years ago
- functionstest ☆33 · Updated 8 years ago
- A project that implements statistical methods for identifying anomalous files ☆22 · Updated 10 years ago
- Scriptable scheduler for periodical Hadoop workflows ☆22 · Updated 7 years ago
- Preliminary Solr DQ / Data Quality experiments and prototype, and SolrJ wrapper utilities ☆26 · Updated 3 months ago