infochimps-labs / data_science_fun_pack
Meta-repository of big data tools -- source and essential plugins for hadoop, pig, wukong, storm, kafka etc.
☆29Updated 11 years ago
Alternatives and similar repositories for data_science_fun_pack
Users who are interested in data_science_fun_pack compare it to the repositories listed below
- Convert an XML input to a JSON output, using xml-mapping☆161Updated 9 years ago
- Code examples supporting the "Introduction to Apache Spark" video published by O'Reilly Media☆37Updated 3 years ago
- A complete environment for busy polyglot data scientists☆473Updated 4 years ago
- DEPRECATED A/B experiments service☆34Updated 2 months ago
- Code and setup information for Introduction to Machine Learning with Spark☆12Updated 10 years ago
- A (comprehensive) collection of open source tools used by the data community.☆52Updated 10 years ago
- Source, data and tutorials of the blog post video series of Hue, the Web UI for Hadoop.☆235Updated 9 years ago
- PythonForDataScience☆156Updated 9 years ago
- My data is bigger than your data!☆39Updated last month
- Automatically exported from code.google.com/p/crush-tools☆150Updated 9 years ago
- Practical examples of using Apache Spark in several different use cases☆102Updated 9 years ago
- This page is a summary to keep the track of Hadoop related projects, and relevant projects around Big Data scene focused on the open sour…☆690Updated 4 years ago
- Data and example code for Programming Pig, by Alan F. Gates☆187Updated 9 years ago
- Machine learning and natural language processing with Apache Pig☆53Updated 12 years ago
- File format conversion tools☆292Updated 6 months ago
- https://www.kaylinpavlik.com/text-mining-south-park/☆173Updated 9 years ago
- Introduction to Big Data☆396Updated last year
- Coding exercises for Apache Spark☆104Updated 10 years ago
- Data-Intensive Text Processing with MapReduce☆627Updated 4 years ago
- A javascript shell for elasticsearch☆106Updated 10 years ago
- A Seriously Fun guide to Big Data Analytics in Practice☆169Updated 10 years ago
- Solutions to programming challenges and algorithmic problems☆55Updated 12 years ago
- Loan-level analysis of Fannie Mae and Freddie Mac data☆219Updated 5 years ago
- Code + Jupyter notebook for analyzing and visualizing Reddit Data quickly and easily☆111Updated 10 years ago
- Gallery of Apache Zeppelin notebooks☆216Updated 6 years ago
- Code for Tutorial on designing clickstream analytics application using Hadoop☆55Updated 10 years ago
- Mirror of Apache Blur☆33Updated 7 years ago
- A Cascading Workflow Visualizer☆83Updated 2 years ago
- ☆14Updated 4 years ago
- Tool for visual exploration of complex data.☆194Updated 7 years ago