godatadriven / build-your-own-search-engine
This repository contains code to build an MVP search engine with a Google-like interface.
☆15Updated 4 years ago
Alternatives and similar repositories for build-your-own-search-engine
Users who are interested in build-your-own-search-engine are comparing it to the repositories listed below.
- A few end to end examples that use data-describe☆16Updated 2 years ago
- This repository auto-configures an Apache Pinot and Superset cluster for analyzing IRA tweets from FiveThirtyEight.☆11Updated 4 years ago
- A curated list of awesome open source tools and commercial products to catalog, version, and manage data 🚀☆33Updated 3 years ago
- ☆10Updated 4 years ago
- Cookiecutter for community-maintained Jupyter Docker images☆15Updated this week
- Python implementation of Age-Partitioned Bloom Filter with S3 periodic backup support.☆11Updated 4 months ago
- Events about the open source data stack☆13Updated 3 years ago
- Documentation and resources for deploying JupyterHub on Hadoop☆18Updated 5 years ago
- 💻 CLI for reporting events to Faros platform☆14Updated 3 weeks ago
- Repository to allow collaboration between Cycle Labs Cloud community in support of the community.☆9Updated 3 years ago
- Flask based UI for displaying & segmenting a single database table☆15Updated 3 years ago
- This is a real-life, high throughput streaming ELT data pipeline for ecommerce☆13Updated 2 years ago
- Datamallet is a Python library which contains several helper functions and modules for the common tasks in a typical data science workflow…☆11Updated 3 years ago
- A set of tools to accelerate work in Jupyter notebooks.☆11Updated 5 years ago
- The IBM DB2 adapter plugin for dbt (data build tool)☆11Updated last year
- 🔍Your Data Quality Detector / Gain insight into your data and get it ready for use before you start working with it 💡📊🛠💎☆16Updated 2 years ago
- Async bulk data ingestion and querying in various document, graph and vector databases via their Python clients☆36Updated last year
- Git scrapers for scraping the fediverse☆17Updated this week
- Learn how to build NLP Cognitive Chatbots☆24Updated 5 years ago
- Supported datasources for MindsDB☆16Updated 3 weeks ago
- Multi-docker container data science / engineering playground (w/ Kafka, Airflow, MLFlow, Tensorflow-Keras / SKLearn) for simulating a mic…☆11Updated 2 years ago
- Machine Learning with BigQuery ML, published by Packt☆31Updated 2 years ago
- Material for Talk Python Training course on Getting Started with Dask.☆28Updated 2 years ago
- MirrorDataGenerator is a python tool that generates synthetic data based on user-specified causal relations among features in the data. I…☆23Updated 2 years ago
- This repository contains all resources (code, notebooks,etc) used for my Medium blog page.☆16Updated 4 months ago
- Prefect integrations for working with OpenAI.☆34Updated last year
- Orchest quickstart pipeline☆18Updated 2 years ago
- NiFi Processor for Apache Pulsar☆10Updated 6 months ago
- Ssebowa is a free and open-source library in Python that provides generative AI models.☆14Updated last year
- DataHub on AWS demonstration resources☆10Updated 2 years ago