mark-hoffmann / fastteradata
Tools for faster and optimized interaction with Teradata and large datasets.
☆17 Updated 6 years ago
Alternatives and similar repositories for fastteradata:
Users who are interested in fastteradata are comparing it to the libraries listed below.
- A simple introduction to using spark ml pipelines ☆26 Updated 6 years ago
- Deprecated, please use https://github.com/jcrist/skein or https://github.com/dask/dask-yarn instead ☆52 Updated 6 years ago
- Asynchronous actions for PySpark ☆47 Updated 3 years ago
- ☕⛵ WIP PySpark dependency management ☆22 Updated 6 years ago
- PySpark phonetic and string matching algorithms ☆39 Updated last year
- Helpers & syntactic sugar for PySpark. ☆61 Updated last year
- Simple Spark example of generating table stats for use in data quality checks ☆28 Updated 7 years ago
- Spark ML Lib serving library ☆48 Updated 6 years ago
- An example PySpark project with pytest ☆17 Updated 7 years ago
- Machine Learning Pipeline Stages for Spark (exposed in Scala/Java + Python) ☆74 Updated last year
- Functional Airflow DAG definitions. ☆38 Updated 7 years ago
- A tool and library for easily deploying applications on Apache YARN ☆143 Updated last year
- A toolset to streamline running spark python on EMR ☆20 Updated 8 years ago
- Apache (Py)Spark type annotations (stub files). ☆116 Updated 2 years ago
- Data validation library for PySpark 3.0.0 ☆33 Updated 2 years ago
- [ARCHIVED] Moved to github.com/NVIDIA/spark-xgboost-examples ☆70 Updated 4 years ago
- Scala SDK for working with Snowplow enriched events in Spark, AWS Lambda, Flink et al. ☆20 Updated 4 months ago
- Set of IPython and Jupyter extensions to improve user experience ☆50 Updated 5 years ago
- An extension for Jupyter Lab & Jupyter Notebook to monitor Apache Spark (pyspark) from notebooks ☆50 Updated last year
- A set of widgets for Python's Orange Machine Learning to work with Apache Spark ML ☆15 Updated 8 years ago
- A K8s-based infrastructure for analytics ☆24 Updated 5 years ago
- Utilities to work with Scala/Java code with py4j ☆40 Updated last year
- ☆106 Updated 2 years ago
- Comparisons of big data technologies for cleaning, manipulating, and generally wrangling data for analysis and machine learning. ☆65 Updated 4 years ago
- Python - Java/Scala API for the Hopsworks feature store ☆54 Updated this week
- Feature selection methods as Spark MLlib Pipelines ☆30 Updated 6 years ago
- ☆37 Updated 5 years ago
- An integrated and collaborative cloud environment for building and running Spark applications on PKS/Kubernetes ☆82 Updated 5 years ago
- Deploy dask on YARN clusters ☆69 Updated 7 months ago
- Conversion utility from Zeppelin notes to Jupyter notebooks. ☆44 Updated 5 years ago