maxmelnick / spark-graph-er
☆15 Updated 5 years ago
Alternatives and similar repositories for spark-graph-er:
Users who are interested in spark-graph-er are comparing it to the libraries listed below.
- PySpark phonetic and string matching algorithms ☆39 Updated 11 months ago
- High-performance data retrieval from Neo4j with Apache Arrow 🏹 ☆31 Updated 2 years ago
- [ARCHIVED] Moved to github.com/NVIDIA/spark-xgboost-examples ☆70 Updated 4 years ago
- Record matching and entity resolution at scale in Spark ☆34 Updated last year
- ☆33 Updated 5 years ago
- Notebooks for the ML Link Prediction Course ☆14 Updated 4 years ago
- Data validation library for PySpark 3.0.0 ☆34 Updated 2 years ago
- A collection of “cookbook-style” scripts for simplifying data engineering and machine learning in Apache Spark. ☆13 Updated 3 years ago
- Spark functions to run popular phonetic and string matching algorithms ☆60 Updated 2 years ago
- ☆15 Updated 2 years ago
- Delta Lake helper methods. No Spark dependency. ☆22 Updated 5 months ago
- Demos for Nessie. Nessie provides Git-like capabilities for your Data Lake. ☆28 Updated this week
- A library that brings useful functions from various modern database management systems to Apache Spark ☆58 Updated last year
- Example for simple Apache Arrow Flight service with Apache Spark and TensorFlow clients ☆36 Updated 3 years ago
- ☆54 Updated last year
- Code that was used as an example during the Data+AI Summit 2020 ☆15 Updated 3 years ago
- Instant search for and access to many datasets in Pyspark. ☆34 Updated 2 years ago
- Binding the GDELT universe in a Spark environment ☆23 Updated last year
- Spark and Delta Lake Workshop ☆22 Updated 2 years ago
- Quickstart PySpark with Anaconda on AWS/EMR using Terraform ☆47 Updated last month
- Nested Data (JSON/AVRO/XML) Parsing and Flattening in Spark ☆16 Updated last year
- Jupyter notebooks showing how to use Neo4j Graph Algorithms ☆52 Updated 4 years ago
- How to evaluate the Quality of your Data with Great Expectations and Spark. ☆29 Updated last year
- Pandas helper functions ☆30 Updated last year
- Code examples for the Introduction to Kubeflow course ☆14 Updated 4 years ago
- Source code for the MC technical blog post "Data Observability in Practice Using SQL" ☆36 Updated 6 months ago
- The iterative broadcast join example code. ☆69 Updated 7 years ago
- Simple machine learning in Python/Tensorflow with model saving ☆14 Updated 7 years ago
- Delta lake and filesystem helper methods ☆50 Updated 11 months ago
- Fully unit tested utility functions for data engineering. Python 3 only. ☆15 Updated 5 months ago