A collection of fundamental statistics implemented on Apache Spark
☆31 · Feb 11, 2016 · Updated 10 years ago
Alternatives and similar repositories for StatisticsOnSpark
Users who are interested in StatisticsOnSpark are comparing it to the libraries listed below.
- Spark MLlib code optimized to efficiently support sparse data ☆51 · Dec 22, 2016 · Updated 9 years ago
- Topic Modeling on Apache Spark ☆94 · Mar 1, 2019 · Updated 7 years ago
- Yelp Restaurant Photo Classification - Kaggle competition ☆11 · Apr 19, 2019 · Updated 7 years ago
- Winter Break Collaboratory DS Boot Camp during the academic year of 2017-2018 ☆14 · Feb 12, 2018 · Updated 8 years ago
- Parallel ML System - Bosen Java implementation ☆27 · Jan 23, 2017 · Updated 9 years ago
- A set of widgets for Python's Orange Machine Learning to work with Apache Spark ML ☆15 · Dec 24, 2016 · Updated 9 years ago
- ☆21 · Dec 9, 2015 · Updated 10 years ago
- Step-by-step Deep Learning Tutorials on Apache Spark using BigDL ☆210 · Jan 3, 2023 · Updated 3 years ago
- Gradient Boosting Enhanced with Step-Wise Feature Augmentation ☆17 · Jan 13, 2021 · Updated 5 years ago
- Implementation of End-To-End Memory Networks with Tensorflow for bAbI Dataset ☆11 · Aug 17, 2017 · Updated 8 years ago
- ☆14 · Aug 24, 2021 · Updated 4 years ago
- Another, hopefully better, implementation of ALS on Spark ☆14 · May 20, 2015 · Updated 11 years ago
- Tensorflow implementation of a Neural Attention Model for Abstractive Summarization. ☆10 · Jul 20, 2020 · Updated 5 years ago
- Visualizes the Random Forest debug string from MLlib in Spark using D3.js ☆37 · Sep 8, 2022 · Updated 3 years ago
- Bachelor thesis ☆10 · Sep 11, 2015 · Updated 10 years ago
- ☆10 · Nov 15, 2015 · Updated 10 years ago
- Word2Vec - Google's word2vec in Scala using UMASS factorie library for better hacking and research. ☆16 · Apr 7, 2014 · Updated 12 years ago
- Glint: High performance Scala parameter server ☆170 · Jul 20, 2018 · Updated 7 years ago
- Dropwizard Metrics reporter for Apache Spark ☆28 · Dec 22, 2014 · Updated 11 years ago
- ☆62 · Jul 11, 2019 · Updated 6 years ago
- A Spark custom window function example that generates session IDs ☆19 · Oct 26, 2017 · Updated 8 years ago
- Yahoo!'s topic modelling framework using Latent Dirichlet Allocation ☆98 · Sep 21, 2011 · Updated 14 years ago
- Automatically exported from code.google.com/p/jbirch ☆12 · Sep 6, 2022 · Updated 3 years ago
- JVM related exercises ☆11 · Jul 16, 2017 · Updated 8 years ago
- Parallel ML System - STRADS scheduler ☆30 · Oct 4, 2018 · Updated 7 years ago
- ☆20 · Nov 16, 2014 · Updated 11 years ago
- ExtJS component for drawing trees (actually Directed Acyclic Graphs) ☆21 · Aug 16, 2012 · Updated 13 years ago
- Memory consumption estimator for Scala/Java ☆27 · Nov 24, 2014 · Updated 11 years ago
- MaRe leverages the power of Docker and Spark to run and scale your serial tools in MapReduce fashion. ☆14 · Apr 12, 2022 · Updated 4 years ago
- A focused web crawler based on Playwright, RMQ, Kafka and Flink. ☆14 · Feb 4, 2021 · Updated 5 years ago
- A project with examples of using a few commonly used data manipulation/processing/transformation APIs in Apache Spark 2.0.0 ☆26 · Aug 5, 2021 · Updated 4 years ago
- Redis Command Quick Reference ☆14 · Nov 6, 2017 · Updated 8 years ago
- Data storytelling. See link for detailed documentation: http://lab41.github.io/gestalt. ☆20 · Oct 16, 2016 · Updated 9 years ago
- Ansible playbooks to help deploy Apache Hadoop, Spark, Storm, Zookeeper, Elasticsearch, Azkaban, Flume, Hbase, Kafka, Kibana, Logstash ☆10 · Mar 21, 2017 · Updated 9 years ago
- Simple UDF to split JSON arrays into Hive arrays ☆10 · Jun 24, 2016 · Updated 10 years ago
- All presentations from Data Fest Kyiv 2017 http://datafest.in.ua ☆13 · Apr 24, 2017 · Updated 9 years ago
- An introduction to Hockey Visualization with D3.js ☆15 · Mar 27, 2018 · Updated 8 years ago
- This project is for the notebooks, code, and data for the "Vocabulary Analysis of Job Descriptions" tutorial at PyData 2017 Seattle ☆20 · Jul 12, 2017 · Updated 8 years ago
- ☆24 · Jul 2, 2015 · Updated 11 years ago