noahgift / rdedupe
A Rust-based deduplication tool
☆34 Updated 3 weeks ago
Alternatives and similar repositories for rdedupe
Users who are interested in rdedupe are comparing it to the libraries listed below
- Code for a Duke Coursera Rust-based data engineering course ☆159 Updated 6 months ago
- tutorial for Rust for Enterprise MLOps book by O'Reilly ☆40 Updated 2 years ago
- Rust PyTorch GPU configuration ☆47 Updated last year
- A work in progress to build out solutions in Rust for MLOps ☆154 Updated 6 months ago
- Data pipeline example written in Rust with Polars and DataFusion DataFrame package ☆41 Updated 2 years ago
- Journeys between the two worlds of Python 🐍 and Rust 🦀 ☆40 Updated last week
- CSV and flat-file sniffer built in Rust. ☆42 Updated last year
- Demos using Rust Candle ☆76 Updated 6 months ago
- rust-for-data ☆45 Updated 2 years ago
- Introduction to Command-line tools with Python and Rust ☆29 Updated last year
- Cookbook to build Rust Candle models ☆81 Updated last year
- A good starting point for a new Rust project ☆54 Updated 6 months ago
- MLOps Deploy Solutions with Rust ☆36 Updated last year
- Sample project that uses Dagster, dbt, DuckDB and Dash to visualize the Spanish car and motorcycle market ☆58 Updated 2 years ago
- A simple and easy to use Data Quality (DQ) tool built with Python. ☆50 Updated last year
- ☆99 Updated 2 weeks ago
- 🚕 Self-contained demo using Redpanda, Materialize, River, Redis, and Streamlit to predict taxi trip durations ☆45 Updated 2 years ago
- Practice ETL with Rust and Polars ☆29 Updated last year
- Swiple enables you to easily observe, understand, validate and improve the quality of your data ☆84 Updated this week
- Personal Finance Project to automatically collect Swiss banking transactions into a DWH and visualise them ☆26 Updated last year
- Delta reader for the Ray open-source toolkit for building ML applications ☆46 Updated last year
- Contribute to dlt verified sources 🔥 ☆87 Updated 3 weeks ago
- Serverless for data practitioners. The fastest ⚡️ way to run your code in the cloud. Effortlessly run scripts, functions, and Jupyter not… ☆39 Updated last year
- Official Python client SDK for Iggy.rs message streaming. ☆27 Updated 3 weeks ago
- Deploy a distroless Rust API to Azure ☆15 Updated 2 years ago
- Self-contained demo using Kafka, Materialize and Metabase to check what's streaming on Twitch. All you need is Docker and Twitch access t… ☆25 Updated 3 years ago
- A declarative PySpark framework for row- and aggregate-level data quality validation. ☆50 Updated this week
- Tutorials for Fugue - A unified interface for distributed computing. Fugue executes SQL, Python, and Pandas code on Spark and Dask withou… ☆113 Updated last year
- Analytics engineering with dbt - projects and developer environment ☆19 Updated 9 months ago
- ☆30 Updated last month