godatadriven / build-your-own-search-engine
This repository contains code to build an MVP search engine with a Google-like interface.
☆15 Updated last month
Alternatives and similar repositories for build-your-own-search-engine
Users who are interested in build-your-own-search-engine are comparing it to the repositories listed below.
- A few end to end examples that use data-describe ☆16 Updated 2 years ago
- This repository auto-configures an Apache Pinot and Superset cluster for analyzing IRA tweets from FiveThirtyEight. ☆11 Updated 4 years ago
- Events about the open source data stack ☆13 Updated 3 years ago
- 💻 CLI for reporting events to Faros platform ☆14 Updated 2 months ago
- A curated list of awesome open source tools and commercial products to catalog, version, and manage data 🚀 ☆33 Updated 3 years ago
- Supported datasources for MindsDB ☆16 Updated 2 months ago
- Async bulk data ingestion and querying in various document, graph and vector databases via their Python clients ☆36 Updated last year
- This project is created to promote and advocate the use of FOSS machine learning. ☆46 Updated 2 months ago
- Using the Parquet file format with Python ☆15 Updated last year
- ☆10 Updated 4 years ago
- 🔍Your Data Quality Detector / Gain insight into your data and get it ready for use before you start working with it 💡📊🛠💎 ☆16 Updated 2 years ago
- Library of Prefect tasks and utilities. ☆9 Updated 9 months ago
- Data exchange and persistence based on human-readable files ☆22 Updated 7 months ago
- Repository to allow collaboration between Cycle Labs Cloud community in support of the community. ☆9 Updated 3 years ago
- Curated list of awesome software and resources for Senzing, The First Real-Time AI for Entity Resolution. ☆59 Updated last month
- Robotic Process Automation Projects, published by Packt ☆35 Updated 2 years ago
- A set of tools to accelerate work in Jupyter notebooks. ☆11 Updated 5 years ago
- Python implementation of Age-Partitioned Bloom Filter with S3 periodic backup support. ☆11 Updated 5 months ago
- DataHub on AWS demonstration resources ☆10 Updated 2 years ago
- dbd is a database prototyping tool that enables data analysts and engineers to quickly load and transform data in SQL databases. ☆57 Updated 3 years ago
- A Data Mesh demo repository ☆13 Updated 9 months ago
- Asynchronous tasks on the cloud ☆21 Updated last year
- ☆29 Updated last year
- Apache Spark based framework for analysis A/B experiments ☆15 Updated 8 months ago
- This is a real-life, high throughput streaming ELT data pipeline for ecommerce ☆13 Updated 2 years ago
- How to do data science with Optimus, Spark and Python. ☆19 Updated 5 years ago
- Documentation and resources for deploying JupyterHub on Hadoop ☆19 Updated 6 years ago
- Awesome Orchest projects, both official and submitted by the community. ☆25 Updated last year
- This repo contains the LookML for the model and dashboards used with the FHIR healthcare dataset to showcase how Looker can add value to … ☆12 Updated 2 years ago
- The sane way of building a data layer in Airflow ☆24 Updated 5 years ago