awesome-mlops / awesome-data-management
A curated list of awesome open source tools and commercial products to catalog, version, and manage data
☆30 Updated 2 years ago
Alternatives and similar repositories for awesome-data-management:
Users who are interested in awesome-data-management are comparing it to the libraries listed below.
- Build a directory full of files into a SQLite database ☆12 Updated last year
- This repository auto-configures an Apache Pinot and Superset cluster for analyzing IRA tweets from FiveThirtyEight. ☆11 Updated 4 years ago
- Batteries included toolkit for data engineering. ☆33 Updated last month
- Curated list of awesome software and resources for Senzing, The First Real-Time AI for Entity Resolution. ☆55 Updated 2 months ago
- Awesome Orchest projects, both official and submitted by the community. ☆25 Updated last year
- Functional composable pipelines allowing clean separation of the business logic and its implementation ☆11 Updated 8 months ago
- Data exchange and persistence based on human-readable files ☆22 Updated 2 months ago
- A set of tools to accelerate work in Jupyter notebooks. ☆11 Updated 4 years ago
- Orchest quickstart pipeline ☆18 Updated 2 years ago
- Cookiecutter for community-maintained Jupyter Docker images ☆14 Updated 2 weeks ago
- dbd is a database prototyping tool that enables data analysts and engineers to quickly load and transform data in SQL databases. ☆57 Updated 3 years ago
- Awesome list of dataops products, open source and resources ☆24 Updated 2 years ago
- A Python library to generate static data catalog sites. Carte scrapes metadata from your data assets and generates a fully searchable fro… ☆27 Updated 2 years ago
- Provide an easy way with Python to protect your data sources by searching its metadata. 🛡️ ☆16 Updated this week
- pglineage is a tool to create data flow diagrams for PostgreSQL by analyzing SQL ☆15 Updated 10 months ago
- Create embeddings for LLM using the Nomic API ☆22 Updated 2 months ago
- Repository to allow collaboration between Cycle Labs Cloud community in support of the community. ☆9 Updated 3 years ago
- Demonstration of how to perform continuous model monitoring on CML using Model Metrics and Evidently.ai dashboards ☆12 Updated 2 months ago
- Scrape various open data directories to create an index of what's available out there ☆36 Updated this week
- Tools for building SQLite databases from files and directories ☆12 Updated last year
- Glue is an enterprise data model for the buy side, tailored for Wealth and Asset Managers and covering key entities such as Party, Busine… ☆21 Updated last year
- ☆10 Updated 3 years ago
- a graph definition and execution library for python ☆16 Updated last year
- My dot files in one place - extensively edited over time. Your mileage may vary ☆2 Updated 8 years ago
- bamboolib - template for creating your own binder notebook ☆21 Updated 3 years ago
- Apache Spark based framework for analysis A/B experiments ☆13 Updated 3 months ago
- pip installable duckdb extensions published to pypi ☆16 Updated last week
- Git scrapers for scraping the fediverse ☆14 Updated this week
- BoilingData JS client (NodeJS and Browsers) ☆19 Updated 4 months ago