☆24Dec 21, 2020Updated 5 years ago
Alternatives and similar repositories for data-analysis-with-python-and-pyspark
Users that are interested in data-analysis-with-python-and-pyspark are comparing it to the libraries listed below
- Improving the development of Spark applications deployed as jobs on AWS services like Glue and EMR☆11Jul 26, 2023Updated 2 years ago
- Code repository for the "PySpark in Action" book☆214Jun 11, 2025Updated 8 months ago
- Optimization solvers in pure Python: LP, MILP, SAT, constraint programming, graph and metaheuristics. No dependencies. Solvor all your op…☆25Feb 1, 2026Updated last month
- Chromax is a breeding simulator based on JAX.☆10Jun 6, 2025Updated 8 months ago
- ☆11Updated this week
- Python library & CLI to create, view and edit PFB files☆12Feb 19, 2026Updated last week
- Using near-infrared spectroscopy (NIRS) and machine learning to determine oleic acid content from peanut raw grains☆12Apr 6, 2022Updated 3 years ago
- Files to Build a Docker Image for Facebook Prophet☆13Feb 7, 2019Updated 7 years ago
- GEFormer is a genome-wide prediction model for genotype-environment interactions based on a deep learning approach designed to predict ma…☆14Jan 15, 2026Updated last month
- ☆12Jan 25, 2018Updated 8 years ago
- ☆11Jun 15, 2019Updated 6 years ago
- Transform natural language into beautiful, interactive data visualizations using the Model Context Protocol (MCP) with Claude Desktop int…☆16Jun 27, 2025Updated 8 months ago
- This repository contains example patterns for storing large objects with DynamoDB.☆13Jun 19, 2024Updated last year
- Integrative protein sequence design with evolutionary multiobjective optimization.☆12Jul 16, 2024Updated last year
- SQL☆21Jul 15, 2017Updated 8 years ago
- IBGE - 2010 Census - Location and corresponding Census Tract Code☆10Apr 3, 2021Updated 4 years ago
- An example CI/CD pipeline using GitHub Actions for doing continuous deployment of AWS Glue jobs built on PySpark and Jupyter Notebooks.☆13Oct 15, 2020Updated 5 years ago
- The Genomics Tertiary Analysis and Machine Learning Using Amazon SageMaker solution creates a scalable environment in AWS to develop mach…☆11Jul 7, 2023Updated 2 years ago
- Class content for cohort 6 of How's data engineering bootcamp☆12Sep 16, 2021Updated 4 years ago
- ☆12Jun 27, 2024Updated last year
- This construct builds some elements for you to quickly launch an EMR Serverless application. After submitting the Emr Serverless job, you…☆11Nov 18, 2025Updated 3 months ago
- ☆14Feb 20, 2023Updated 3 years ago
- This Guidance helps customers set up an ecommerce website on WordPress.☆11Oct 19, 2024Updated last year
- ☆15Mar 24, 2025Updated 11 months ago
- A tool for phasing and imputing haplotypes in 10k+ low coverage sequencing samples☆10Nov 20, 2020Updated 5 years ago
- ☆11Mar 4, 2025Updated 11 months ago
- DeepVariant-on-Spark is a germline short variant calling pipeline that runs Google DeepVariant on Apache Spark at scale.☆12May 4, 2022Updated 3 years ago
- This is the pipeline of our new article "Enzyme Co-Scientist: Harnessing Large Language Models for Enzyme Kinetic Data Extraction from Li…☆16May 23, 2025Updated 9 months ago
- Explore integration between Watson Studio and Cognos Analytics☆13Jul 22, 2020Updated 5 years ago
- Code necessary to reproduce experiments in "FloraBERT: cross-species transfer learning with attention-based neural networks for gene expr…☆13Jul 6, 2022Updated 3 years ago
- Accessibility-ready business WordPress theme.☆15Sep 3, 2025Updated 5 months ago
- LUMIN: Your data analysis companion that turns natural language questions into powerful insights through AI-driven visualizations and cle…☆15Nov 11, 2024Updated last year
- Lambda serverless workshop☆13Aug 23, 2018Updated 7 years ago
- Example project for consuming an AWS Kinesis stream and saving the data to Amazon Redshift using Apache Spark☆11May 22, 2018Updated 7 years ago
- Google Data Studio connector example code☆11Nov 26, 2018Updated 7 years ago
- It is important that credit card companies are able to recognize fraudulent credit card transactions so that customers are not charged fo…☆13Jun 12, 2021Updated 4 years ago
- A simple Python SDK for the PubMed API.☆21Jan 1, 2025Updated last year
- Course covering topics such as Data Preprocessing, Exploratory Data Analysis (EDA), Statistical Data Analysis (SDA), Data Collection an…☆12Aug 8, 2022Updated 3 years ago
- Ascertained Sequentially Markovian Coalescent☆16Oct 22, 2025Updated 4 months ago