yu-iskw/spark-dataframe-introduction

This is an introduction to Apache Spark DataFrames.

☆41

Alternatives and similar repositories for spark-dataframe-introduction

Users that are interested in spark-dataframe-introduction are comparing it to the libraries listed below.

mengxr / spark-als
Another, hopefully better, implementation of ALS on Spark
☆14 • May 20, 2015 • Updated 11 years ago
massie / spark-parquet-example
Example project to show how to use Spark to read and write Avro/Parquet files
☆50 • Aug 21, 2013 • Updated 12 years ago
sigmoidanalytics / spark_gce
Spark GCE script that helps you deploy a Spark cluster on Google Cloud.
☆43 • May 30, 2015 • Updated 11 years ago
caroljmcdonald / sparkdataframeexample
☆21 • Oct 1, 2015 • Updated 10 years ago
ArchitectingHBase / examples
Will come later...
☆20 • Jul 1, 2022 • Updated 4 years ago
lightning-viz / lightning-scala
Scala client for the Lightning data visualization server (WIP)
☆47 • Jun 25, 2019 • Updated 7 years ago
bigdatagenomics / bdg-formats
Open source formats for scalable genomic processing systems using Avro. Apache 2 licensed.
☆42 • Feb 13, 2026 • Updated 5 months ago
MrPowers / gill
An example PySpark project with pytest
☆18 • Oct 13, 2017 • Updated 8 years ago
mkrcah / scala-kafka-twitter
Example integration of Kafka, Avro & Spark-Streaming on live Twitter feed
☆22 • Jan 23, 2015 • Updated 11 years ago
webblearning / Neural-Attention-Model-For-Abstractive-Sentence-Summarization
TensorFlow implementation of a Neural Attention Model for Abstractive Summarization.
☆10 • Jul 20, 2020 • Updated 6 years ago
freeman-lab / spark-ml-streaming
Visualize streaming machine learning in Spark
☆176 • Jun 29, 2017 • Updated 9 years ago
yodle / griddle
A Gradle plugin that handles .thrift IDL files and generates code from them with Thrift or Scrooge
☆13 • Jan 31, 2020 • Updated 6 years ago
potix2 / spark-google-spreadsheets
Google Spreadsheets datasource for SparkSQL and DataFrames
☆58 • Jul 24, 2023 • Updated 3 years ago
google / cpython-pt
Fork from python/cpython
☆12 • Dec 5, 2018 • Updated 7 years ago
sourcegraph / vcsstore
vcsstore stores VCS repositories and makes them accessible via HTTP
☆19 • Jan 27, 2016 • Updated 10 years ago
lyveng / pandas-hbase
Pandas Helper Library for reading and writing DataFrames from and to HBase.
☆10 • Mar 8, 2018 • Updated 8 years ago
gage-russell / pandas-lineage
☆13 • Sep 19, 2022 • Updated 3 years ago
AliciaSchep / RecommendR
Shiny App with R Package Recommendation System
☆11 • Sep 2, 2018 • Updated 7 years ago
frankscholten / mahout
Mirror of Apache Mahout
☆15 • Mar 24, 2015 • Updated 11 years ago
Gschiavon / Kafka-SparkStreaming-HDFS
☆14 • Nov 3, 2016 • Updated 9 years ago
mbonaci / spark-archetype-scala
Maven archetype used to bootstrap a Spark Scala project
☆26 • Sep 1, 2015 • Updated 10 years ago
alchemyst / Segmentation
Timeseries segmentation library
☆12 • Mar 8, 2023 • Updated 3 years ago
jlopezmalla / Flights
Scala and Spark examples project
☆14 • Feb 19, 2018 • Updated 8 years ago
evojam / simple-nlp-search-dataset-generator
Simple NLP Search - Dataset Generator
☆17 • Apr 29, 2016 • Updated 10 years ago
networm / progit2-zh
☆10 • Jun 7, 2020 • Updated 6 years ago
RetailRocket / SparkMultiTool
Tools for Spark that we use on a daily basis
☆65 • Jul 2, 2020 • Updated 6 years ago
AtlasPilotPuppy / SparkAlgorithms
Additional useful algorithms that can be used with Spark.
☆24 • Dec 24, 2014 • Updated 11 years ago
chimpler / blog-scala-javacv
☆13 • Nov 18, 2014 • Updated 11 years ago
picotrading / ansible-ulimit
Role that helps manage ulimit configuration
☆11 • Apr 27, 2015 • Updated 11 years ago
skrusche63 / spark-connect
A subproject of Predictiveworks that provides common access to Cassandra, Elasticsearch, HBase, MongoDB, Parquet, JDBC database and other…
☆13 • Feb 23, 2015 • Updated 11 years ago
alexander-n-thomas / pydata-vocab-analysis
This project is for the notebooks, code, and data for the "Vocabulary Analysis of Job Descriptions" tutorial at PyData 2017 Seattle
☆20 • Jul 12, 2017 • Updated 9 years ago
hardin47 / prediction2016
Predictions for the 2016 ASA Election Prediction Contest
☆10 • Aug 25, 2016 • Updated 9 years ago
cloneofsimo / inversion_edits
☆21 • Feb 9, 2023 • Updated 3 years ago
blachlylab / mucor
☆12 • Feb 19, 2017 • Updated 9 years ago
tensor-programming / kotlin_api
☆10 • Feb 3, 2018 • Updated 8 years ago
abajwa-hw / hdp-datascience-demo
HDP Data Science/Machine Learning demo
☆37 • Aug 29, 2015 • Updated 10 years ago
xunzhang / paracel
Distributed optimization framework with parameter server
☆23 • Jun 14, 2015 • Updated 11 years ago
randerzander / r-service
Ambari Service definition for deploying R & RHadoop libraries
☆18 • Aug 3, 2015 • Updated 10 years ago
atbaker / intro-to-docker
Links to all the source code and solutions I reference in my O'Reilly Introduction to Docker video tutorial
☆11 • Dec 10, 2014 • Updated 11 years ago