awesome-mlops / awesome-data-management
A curated list of awesome open source tools and commercial products to catalog, version, and manage data
☆29 · Updated 2 years ago
Alternatives and similar repositories for awesome-data-management:
Users who are interested in awesome-data-management are comparing it to the repositories listed below.
- This repository auto-configures an Apache Pinot and Superset cluster for analyzing IRA tweets from FiveThirtyEight. ☆11 · Updated 4 years ago
- Batteries included toolkit for data engineering. ☆33 · Updated 2 weeks ago
- Astronomer Vendor Images ☆12 · Updated this week
- Curated list of awesome software and resources for Senzing, The First Real-Time AI for Entity Resolution. ☆54 · Updated last month
- Common Paper Service Level Agreement ☆13 · Updated 9 months ago
- Data exchange and persistence based on human-readable files ☆22 · Updated last month
- Functional composable pipelines allowing clean separation of the business logic and its implementation ☆11 · Updated 7 months ago
- Awesome Orchest projects, both official and submitted by the community. ☆25 · Updated last year
- Python API, Dynamic source, Dynamic target, N targets, Prometheus exporter, realtime transformation for Singer ETL ☆10 · Updated 4 years ago
- Awesome list of dataops products, open source and resources ☆24 · Updated 2 years ago
- Demonstration of how to perform continuous model monitoring on CML using Model Metrics and Evidently.ai dashboards ☆12 · Updated last month
- 💻 CLI for reporting events to Faros platform ☆14 · Updated 2 months ago
- This is a demo project to compare two web scraping frameworks, Playwright and Selenium, using the new pipelining tool Dagster ☆13 · Updated 3 years ago
- Generate Hive CREATE TABLE statements from JSON data ☆10 · Updated 7 years ago
- Taking normal text as input and generating SQL commands using OpenAI's GPT-3 ☆15 · Updated 4 years ago
- Apache Spark based framework for analysis of A/B experiments ☆13 · Updated 2 months ago
- ☆10 · Updated 3 years ago
- Python context manager to communicate with a subprocess using iterables: for when data is too big to fit in memory and has to be streamed ☆7 · Updated 3 months ago
- Async bulk data ingestion and querying in various document, graph and vector databases via their Python clients ☆36 · Updated last year
- dbt adapter for connecting to MindsDB ☆17 · Updated 8 months ago
- convert natural language into technical diagrams ☆12 · Updated last month
- Plugin for Intake to read from SQL servers ☆15 · Updated last year
- A server code for serving BERT-based models for text classification. It is designed by SerpApi for heavy-load prototyping and production … ☆13 · Updated 9 months ago
- a toy duckdb based timeseries database ☆15 · Updated 4 years ago
- Build a directory full of files into a SQLite database ☆12 · Updated last year
- Clone of chatgpt built with Bytewax, Streamlit and NATS ☆15 · Updated last year
- This web scraper is intended to extract data from The Home Depot Website, it could be run locally or in the Apify platform, the latter is… ☆7 · Updated 2 years ago
- A set of tools to accelerate work in Jupyter notebooks. ☆11 · Updated 4 years ago
- Orchest quickstart pipeline ☆18 · Updated 2 years ago