intuit / thrive
Thrive is an ETL framework that runs single-row transformations on HDFS data and makes the data available in relational databases (Hive and Vertica).
☆10 Updated 8 years ago
Alternatives and similar repositories for thrive
Users who are interested in thrive are comparing it to the libraries listed below.
- A guide on how to set up Jupyter with Pyspark painlessly on AWS EC2 clusters, with S3 I/O support ☆260 Updated 8 years ago
- VM based deployment for prototyping Big Data tools on Amazon Web Services ☆129 Updated 5 years ago
- ☆61 Updated 7 years ago
- Observations from Ian on successfully delivering data science products ☆543 Updated 4 years ago
- A Python implementation of Douglas Hofstadter formal systems, from his book "Gödel, Escher, Bach" ☆626 Updated 4 years ago
- BayesDB on SQLite. A Bayesian database table for querying the probable implications of data as easily as SQL databases query the data its… ☆939 Updated 2 years ago
- This is a repo documenting the best practices in PySpark. ☆464 Updated 3 years ago
- Growing the code out of your notebooks - the right way. ☆529 Updated 3 years ago
- Content for architecting a data science platform for products using Luigi, Spark & Flask. ☆163 Updated 6 years ago
- Standard evaluations for binary classifiers so you don't have to ☆315 Updated 7 years ago
- Qualify sales leads with machine learning ☆650 Updated 8 years ago
- MacroBase: A Search Engine for Fast Data ☆671 Updated 3 years ago
- ☆263 Updated 6 years ago
- systems is a set of tools for describing, running and visualizing systems diagrams. ☆400 Updated 8 months ago
- Code snippets for solving common big data problems in various platforms. Inspired by Rosetta Code ☆296 Updated last year
- Data Science in 30 minutes with Jess from Quantopian ☆39 Updated 9 years ago
- Helping students assess course difficulty and workload. ☆36 Updated 8 years ago
- A tool for data sampling, data generation, and data diffing ☆344 Updated last month
- A web-based application for quick, scalable, and automated hyperparameter tuning and stacked ensembling in Python. ☆1,266 Updated 7 years ago
- Jari's collection of interesting papers. ☆496 Updated 3 weeks ago
- Generates more or less realistic log data for testing simple aggregation queries. ☆263 Updated 2 years ago
- Information about the topics for each week of the course - published week by week. ☆69 Updated 9 years ago
- Event data simulator. Generates a stream of pseudo-random events from a set of users, designed to simulate web traffic. ☆28 Updated 8 years ago
- Distributed decision tree ensemble learning in Scala ☆390 Updated 7 years ago
- A-Paper-A-Week ☆955 Updated last year
- Random implementation notes ☆33 Updated 12 years ago
- Ephemeral Hadoop clusters using Google Compute Platform ☆134 Updated 3 years ago
- Search service library for Amundsen ☆54 Updated 2 weeks ago
- Bloomberg Beta's Investment Documents for Series Seed, SAFEs, and Notes ☆328 Updated 2 months ago
- ☆116 Updated 9 months ago