awesome-mlops / awesome-data-management
A curated list of awesome open source tools and commercial products to catalog, version, and manage data
☆32 · Updated 2 years ago
Alternatives and similar repositories for awesome-data-management:
Users who are interested in awesome-data-management are comparing it to the repositories listed below.
- Awesome Orchest projects, both official and submitted by the community. ☆25 · Updated last year
- Functional composable pipelines allowing clean separation of the business logic and its implementation ☆11 · Updated 10 months ago
- Apache Spark based framework for analyzing A/B experiments ☆13 · Updated 5 months ago
- This repository auto-configures an Apache Pinot and Superset cluster for analyzing IRA tweets from FiveThirtyEight. ☆11 · Updated 4 years ago
- Curated list of awesome software and resources for Senzing, The First Real-Time AI for Entity Resolution. ☆57 · Updated 3 months ago
- ☆9 · Updated 2 months ago
- Build a directory full of files into a SQLite database ☆12 · Updated last year
- pysh-db - The Data Science Toolkit (DSK) ☆13 · Updated 6 years ago
- Taking normal text as input and generating SQL commands using OpenAI's GPT-3 ☆15 · Updated 4 years ago
- Async bulk data ingestion and querying in various document, graph and vector databases via their Python clients ☆36 · Updated last year
- A Python library to generate static data catalog sites. Carte scrapes metadata from your data assets and generates a fully searchable fro… ☆27 · Updated 2 years ago
- NetworkX-like Python experience for Postgres, SQLite, MongoDB, and Neo4J ☆21 · Updated last month
- Datasette plugin for authenticating access using API tokens ☆11 · Updated 7 months ago
- a toy duckdb based timeseries database ☆15 · Updated 4 years ago
- Cookiecutter for community-maintained Jupyter Docker images ☆15 · Updated this week
- Glue is an enterprise data model for the buy side, tailored for Wealth and Asset Managers and covering key entities such as Party, Busine… ☆22 · Updated last year
- A conda-smithy repository for python-duckdb. ☆13 · Updated 3 weeks ago
- dbd is a database prototyping tool that enables data analysts and engineers to quickly load and transform data in SQL databases. ☆57 · Updated 3 years ago
- Awesome list of dataops products, open source and resources ☆24 · Updated 2 years ago
- Batteries included toolkit for data engineering. ☆34 · Updated 3 months ago
- Scrape various open data directories to create an index of what's available out there ☆36 · Updated 2 months ago
- Git scrapers for scraping the fediverse ☆16 · Updated this week
- Documentation and resources for deploying JupyterHub on Hadoop ☆18 · Updated 5 years ago
- Create embeddings for LLM using the Nomic API ☆23 · Updated 4 months ago
- Examples of vector DB indexing and query with various vector databases. ☆12 · Updated 2 months ago
- Tools for building SQLite databases from files and directories ☆12 · Updated last year
- Datasette enrichment for analyzing row data using OpenAI's GPT models ☆19 · Updated 10 months ago
- A Pythonic API for Amazon's States Language for defining AWS Step Functions ☆8 · Updated 2 years ago
- A few end to end examples that use data-describe ☆16 · Updated last year
- A markdown wiki and dashboarding system for Datasette ☆21 · Updated 3 years ago