One ETL tool to rule them all
☆87Mar 18, 2026Updated this week
Alternatives and similar repositories for onetl
Users that are interested in onetl are comparing it to the libraries listed below.
- Python client for MLflow REST API☆35Mar 11, 2024Updated 2 years ago
- Convert MUSE from TensorFlow to PyTorch and ONNX☆11May 22, 2024Updated last year
- ETL processing toolset with SQL-like language and GIS capabilities, built on core Spark. Extensible and modular. REPL included☆16Jan 26, 2026Updated last month
- RecTools - library to build Recommendation Systems easier and faster than ever before☆430Mar 12, 2026Updated last week
- Incan: a modern, Pythonic language that compiles to Rust! Type-safe, async-friendly, with fixtures, testing, and web/inter-op built in.☆16Mar 15, 2026Updated last week
- Various data stream/batch processing demos with Apache Spark in Scala 🚀☆12Feb 28, 2020Updated 6 years ago
- ☆21Mar 26, 2023Updated 2 years ago
- InSales e-commerce platform API bindings☆14Jul 13, 2024Updated last year
- Toolkit for Agile-driven data modeling and data loading using highly Normalized hybrid Model☆23Dec 24, 2024Updated last year
- Our style guide for writing readable and maintainable PySpark code.☆17Dec 21, 2021Updated 4 years ago
- Simple Python script that converts all Excel files (xls, xlsx, xlsm, csv) in a directory into xlsb files.☆10Mar 13, 2023Updated 3 years ago
- ☆10Jun 27, 2023Updated 2 years ago
- ☆29Jan 18, 2023Updated 3 years ago
- A flake8 plugin that detects usage of withColumn in a loop or inside reduce☆28Jun 20, 2025Updated 9 months ago
- Make GNN easy to start with☆134Mar 10, 2026Updated 2 weeks ago
- ☆29Feb 21, 2022Updated 4 years ago
- CalData infrastructure☆24Mar 16, 2026Updated last week
- Multi-stage, config driven, SQL based ETL framework using PySpark☆26Sep 16, 2019Updated 6 years ago
- Telegram Mini Apps — LLM Developer Guide (llms.txt)☆34Jan 30, 2026Updated last month
- ☆16Nov 29, 2024Updated last year
- A Python Library to support running data quality rules while the spark job is running⚡☆201Updated this week
- Greengage DB is an open source MPP database platform based on Greenplum® database software.☆74Updated this week
- Analytics Engineer Course☆20May 17, 2023Updated 2 years ago
- Try out Apache Cloudberry (Incubating) via the Docker-based Sandbox☆20Nov 25, 2025Updated 3 months ago
- Header-only C++/Python library for fast approximate nearest neighbors☆18Feb 9, 2020Updated 6 years ago
- Toy Hadoop cluster combining various SQL-on-Hadoop variants☆13Nov 16, 2017Updated 8 years ago
- ☆12Jul 27, 2021Updated 4 years ago
- ☆19Feb 27, 2025Updated last year
- This is an official repository of the Kion Movies Recommendation Dataset.☆12Sep 2, 2022Updated 3 years ago
- Data Engineer RoadMap☆34Jun 3, 2022Updated 3 years ago
- Repository for the "Modern Storages and Data Warehousing" course, ПИ program, HSE University (НИУ ВШЭ), 2024☆14Apr 13, 2025Updated 11 months ago
- Simple desktop application for Apache Kafka☆45Mar 28, 2023Updated 2 years ago
- PySpark schema generator☆44Feb 23, 2023Updated 3 years ago
- Demonstration of using Apache Spark to build robust ETL pipelines while taking advantage of open source, general purpose cluster computin…☆25Aug 11, 2023Updated 2 years ago
- Python wrapper for lsm1 extension for sqlite4☆15Feb 27, 2025Updated last year
- ☆22Nov 30, 2022Updated 3 years ago
- Simple audio AE☆13Nov 10, 2024Updated last year
- Simulate screen resolution for Shiny apps☆15Apr 16, 2021Updated 4 years ago
- Tiny R kafka client with rscala☆10Mar 26, 2017Updated 8 years ago