PySpark phonetic and string matching algorithms
☆41Feb 19, 2024Updated 2 years ago
Alternatives and similar repositories for ceja
Users who are interested in ceja compare it to the libraries listed below
- Spark functions to run popular phonetic and string matching algorithms☆59Feb 22, 2022Updated 4 years ago
- Pandas helper functions☆31Feb 19, 2023Updated 3 years ago
- Dockerfile for Apache Zeppelin☆17Dec 9, 2015Updated 10 years ago
- spark structured streaming via HTTP communication☆18Jul 7, 2022Updated 3 years ago
- Delta lake and filesystem helper methods☆50Feb 29, 2024Updated 2 years ago
- pyspark methods to enhance developer productivity 📣 👯 🎉☆683Mar 6, 2025Updated 11 months ago
- An open relation extraction system☆47Nov 23, 2021Updated 4 years ago
- Basic framework utilities to quickly start writing production ready Apache Spark applications☆36Dec 15, 2024Updated last year
- Scripts for Azure Synapse SQL Pools (Provisioned) and Query-on-Demand (Serverless)☆11Nov 2, 2021Updated 4 years ago
- Using the Draw.io tool to create diagrams of Terraform deployments.☆10Dec 18, 2025Updated 2 months ago
- Automated Continuous Data Quality Measurement☆12Nov 15, 2023Updated 2 years ago
- Delta reader for the Ray open-source toolkit for building ML applications☆45Jan 27, 2024Updated 2 years ago
- Workshop "Agile Mindset in the Design of Information and Production Systems" (32 hrs)☆13Nov 3, 2023Updated 2 years ago
- This repository contains NiFi processors for interacting with Snowflake Cloud Data Platform.☆12Dec 13, 2024Updated last year
- simulation/RL - multi-agent car parking using reinforcement learning☆12Aug 4, 2024Updated last year
- Repo for the coursera Getting and Cleaning Data Course Project☆11Sep 27, 2015Updated 10 years ago
- Crime correlation analysis☆10Aug 8, 2018Updated 7 years ago
- Codespace with Airflow and the Astro CLI☆11May 23, 2023Updated 2 years ago
- Tracebacks for Humans (in Jupyter notebooks)☆12Dec 30, 2025Updated 2 months ago
- A deep-learning-based application intended to help visually impaired people. The application automatically generates the tex…☆12Oct 2, 2020Updated 5 years ago
- Unofficial ontologies for Official Registers of Russian Federal Tax Service☆10Apr 7, 2018Updated 7 years ago
- Shed light on your data layout in order to monitor the health of your Lakehouse tables and identify when data maintenance operations shou…☆10Jul 31, 2023Updated 2 years ago
- RemindMe is a reminder and task-management app designed to help you stay organised and on top of your to-do list.☆16Apr 5, 2024Updated last year
- Asynchronous actions for PySpark☆48Dec 2, 2021Updated 4 years ago
- A Delta Lake reader for Dask☆53Jul 29, 2025Updated 7 months ago
- Comparing HATEOAS implementations with Jersey, Spring and VRaptor☆19Oct 17, 2014Updated 11 years ago
- Python Script For Packet Sniffing☆11Aug 19, 2020Updated 5 years ago
- Python script to get a subtitle file from OpenSubtitles new REST API☆10Nov 25, 2020Updated 5 years ago
- An example CI/CD pipeline using GitHub Actions for doing continuous deployment of AWS Glue jobs built on PySpark and Jupyter Notebooks.☆13Oct 15, 2020Updated 5 years ago
- HLL Algorithm and Web Scraping sample☆10Sep 29, 2015Updated 10 years ago
- MTProto [de]serialization for Rust☆12May 20, 2019Updated 6 years ago
- Proxy to run security checks against packages from npm☆12Sep 4, 2016Updated 9 years ago
- Bringing up Docker Compose environments for system, integration and performance testing, with support for ScalaTest and Gatling☆11Jul 29, 2021Updated 4 years ago
- Pre-trained Online Contrastive Learning for Insurance Fraud Detection☆12Jul 12, 2024Updated last year