Quickly ingest messy CSV and XLS files. Export to clean pandas, SQL, parquet
☆196Jun 9, 2023Updated 3 years ago
Alternatives and similar repositories for d6tstack
Users that are interested in d6tstack are comparing it to the libraries listed below.
- Fuzzy joins for python pandas - easily join different datasets☆59Aug 11, 2020Updated 5 years ago
- Plugin for Intake to read from SQL servers☆15May 29, 2023Updated 3 years ago
- Push and pull data files like code☆175Jul 20, 2023Updated 2 years ago
- python package for performing deduplication using flexible text matching and cleaning in pandas dataframe☆24Nov 30, 2020Updated 5 years ago
- Python library for building highly effective data science workflows☆948Jul 20, 2023Updated 2 years ago
- SnapLoc is a product that does automatic image classification and spatio-temporal analysis in order to recommend the places of interest i…☆15Mar 21, 2018Updated 8 years ago
- A Flink application that demonstrates reading and writing to/from Apache Kafka with Apache Flink☆20Jul 23, 2023Updated 2 years ago
- Prevent downstream data quality issues by integrating the Soda Library into your CI/CD pipeline.☆17Jan 29, 2026Updated 4 months ago
- Set-oriented Operations in Pandas☆24May 27, 2020Updated 6 years ago
- Simple tool to pull posts and users from Gab☆16May 19, 2026Updated 3 weeks ago
- Mines data from Twitter and classifies users based on their locations and preferences to target them through marketing campaigns.☆13Dec 27, 2020Updated 5 years ago
- A package which efficiently applies any function to a pandas dataframe or series in the fastest available manner☆2,640Mar 20, 2024Updated 2 years ago
- python library to perform Locality-Sensitive Hashing for faster nearest neighbors search in high dimensional data☆19Aug 15, 2024Updated last year
- sqldf for pandas☆1,349Jul 24, 2024Updated last year
- Intake is a lightweight package for finding, investigating, loading and disseminating data.☆1,080May 25, 2026Updated 2 weeks ago
- ERPL is a DuckDB extension to connect to API based ecosystems via standard interfaces like OData, GraphQL and REST. This works e.g. for S…☆27Jun 3, 2026Updated last week
- End to end mlflow with feast example☆18May 18, 2021Updated 5 years ago
- Handle project folder, template and file templates in JupyterLab☆15Nov 14, 2022Updated 3 years ago
- Component for displaying KPI widgets on a Streamlit dashboard☆18Aug 25, 2021Updated 4 years ago
- Intake examples☆34Jun 2, 2023Updated 3 years ago
- Analysis pipeline for quick ML analyses.☆11Oct 4, 2018Updated 7 years ago
- Jupyter Notebooks and other code for Altair-based Interactive UpSet Plots☆31Dec 1, 2021Updated 4 years ago
- Collection of code snippets and utilities for streamlit apps☆22Apr 2, 2020Updated 6 years ago
- Clean APIs for data cleaning. Python implementation of R package Janitor☆1,496Updated this week
- Typed python equivalent for R pipes.☆14Oct 16, 2022Updated 3 years ago
- Raspberry Pi 4 as Plex Media Server with rclone + PlexDrive. A Complete How-To guide☆10Mar 27, 2021Updated 5 years ago
- cookiecutter template for pure Python libraries. As simple as possible. No magic.☆23Apr 1, 2023Updated 3 years ago
- Ensemble Learning Techniques Tutorial with Credit Card Fraud☆10Oct 22, 2017Updated 8 years ago
- Examples for Teiid (http://teiid.org)☆16Jan 14, 2019Updated 7 years ago
- Cloud Dataflow Tutorial for Beginners☆26Mar 11, 2022Updated 4 years ago
- Data Migration for the Blaze Project☆1,006Jul 15, 2022Updated 3 years ago
- Jupyter Cookbook, published by Packt☆14Jan 30, 2023Updated 3 years ago
- Spark implementation of Slowly Changing Dimension type 2☆11Jan 8, 2019Updated 7 years ago
- Create Interactive Dashboards with Streamlit and Python Coursera☆10Jun 19, 2020Updated 5 years ago
- ☆10Nov 12, 2022Updated 3 years ago
- Tries to shrink your Pandas column dtypes with no data loss so you have more spare RAM☆86Jan 12, 2024Updated 2 years ago
- Crestle version of fast.ai courses☆14Nov 22, 2017Updated 8 years ago
- Implements Google Partial Response dictionary pruning in Python☆15Aug 20, 2022Updated 3 years ago
- Partitioning eddy covariance ET using optimal approaches☆10May 11, 2019Updated 7 years ago