A complete search engine experience built on top of a 75 GB Wikipedia corpus, with sub-second search latency. Results are wiki pages ordered by TF-IDF relevance to the given search word(s). From optimized code to the k-way merge sort algorithm, this project addresses latency, indexing, and big data challenges.
☆19Oct 16, 2019Updated 6 years ago
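The description names two techniques, a k-way merge and TF-IDF ranking, without showing either. The sketch below is illustrative only and is not taken from the repository: the function names `merge_runs` and `search`, the `(term, doc_id, tf)` tuple layout, and the toy data are all assumptions. It merges several sorted partial index runs into one postings map, then scores documents by summed `tf * idf`.

```python
import heapq
import math
from collections import defaultdict


def merge_runs(runs):
    """K-way merge of sorted (term, doc_id, tf) runs into one postings map.

    Hypothetical helper: each run stands in for a sorted partial index
    that a real indexer would have written to disk.
    """
    index = defaultdict(list)
    # heapq.merge lazily merges already-sorted iterables with a heap of size k.
    for term, doc_id, tf in heapq.merge(*runs):
        index[term].append((doc_id, tf))
    return dict(index)


def search(index, query_terms, num_docs, top_k=10):
    """Rank documents by the sum of tf * idf over the query terms."""
    scores = defaultdict(float)
    for term in query_terms:
        postings = index.get(term, [])
        if not postings:
            continue
        # Plain idf = log(N / df); real engines often smooth this.
        idf = math.log(num_docs / len(postings))
        for doc_id, tf in postings:
            scores[doc_id] += tf * idf
    # Highest score first; ties broken by document id.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))[:top_k]


# Toy data: three sorted runs covering three documents.
runs = [
    [("merge", 1, 2), ("sort", 1, 1)],
    [("merge", 2, 1), ("wiki", 2, 3)],
    [("sort", 3, 4), ("wiki", 3, 1)],
]
index = merge_runs(runs)
results = search(index, ["sort", "wiki"], num_docs=3)
```

Because `heapq.merge` holds only one pending item per run, the same pattern extends to runs streamed from files, which is what makes a k-way merge practical when the index does not fit in memory.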
Alternatives and similar repositories for Wikipedia-Search-Engine
Users that are interested in Wikipedia-Search-Engine are comparing it to the libraries listed below.
- Playground for PySpark (RDDs, DStreams) and Apache Airflow. Based on the example of parsing (including incorrectly formatted strings) web …☆18Feb 21, 2022Updated 4 years ago
- Data warehouse implementation for an e-commerce website “Infibeam” that sells digital and consumer electronics.☆23Jan 28, 2018Updated 8 years ago
- ☆12Jul 22, 2025Updated 10 months ago
- An application that helps a traveler visiting a city explore and navigate places of their choice in a very simple way without ty…☆13Dec 26, 2021Updated 4 years ago
- Training a sign language detection model☆11May 10, 2026Updated last month
- An end-to-end ETL pipeline that extracts weather data, transforms it, and loads it into a PostgreSQL database.☆14Sep 6, 2024Updated last year
- Python Essentials for AWS Cloud Developers, published by Packt.☆12Apr 27, 2023Updated 3 years ago
- A banking system that facilitates the transfer of money from one user to another. Created using HTML, CSS, JavaScript and PHP along with mySQ…☆10Jan 25, 2024Updated 2 years ago
- ☆15May 8, 2025Updated last year
- Streamlit Dashboard over Superstore Data stored in Postgres Docker container. With SQLAlchemy + Plotly Express☆12Oct 16, 2024Updated last year
- End-to-end data engineering pipeline with various technologies to ingest real time data.☆28Nov 3, 2023Updated 2 years ago
- 📚🧪 Traffic Sentinel is a learning-focused POC that explores a scalable IoT architecture using Fog nodes and Apache Flink to process 📷 …☆28Dec 29, 2025Updated 5 months ago
- Multi-Account Zalo Management — Real-time chat, CRM, appointments, API & webhooks☆127Apr 4, 2026Updated 2 months ago
- ☆31Mar 7, 2018Updated 8 years ago
- Case Studies and Projects in Machine Learning/EDA/DL☆24Jun 18, 2024Updated last year
- weighted category-balanced dataset builder for LLM fine-tuning☆16Feb 21, 2026Updated 3 months ago
- ☆24Jan 6, 2024Updated 2 years ago
- This repo is for the LinkedIn Learning course: Complete Guide to SQL for Data Engineering: from Beginner to Advanced☆48Mar 20, 2025Updated last year
- ☆82Aug 19, 2025Updated 9 months ago
- my zsh configuration☆13Jun 26, 2025Updated 11 months ago
- Accident detection system for traffic footage. Uses computer vision and ML to detect and analyze accidents in CCTV footage in real-tim…☆23Jul 14, 2023Updated 2 years ago
- iTASK - Intelligent Traffic Analysis Software Kit☆30Dec 8, 2022Updated 3 years ago
- Data visualisations in Power BI☆31Nov 14, 2021Updated 4 years ago
- RealTime StockStream is a streamlined simulation system for processing live stock market data. It uses Apache Kafka for data input, Apac…☆31Feb 18, 2025Updated last year
- Implementation of a system capable of encryption and decryption of multimedia data (Text, Images, Videos, Audio etc.) using a hybrid mode…☆22Feb 7, 2024Updated 2 years ago
- MCP server for managing and serving analysis prompt templates☆24Dec 13, 2024Updated last year
- A program written in C++ that emulates a bogus CPU☆22May 6, 2024Updated 2 years ago
- This repo is for the LinkedIn Learning course: End-to-End Data Engineering Project☆34Nov 9, 2023Updated 2 years ago
- This dataset contains hotel booking information. We performed exploratory data analysis in Python to gain insight from the data.☆13Apr 12, 2020Updated 6 years ago
- Video surveillance units are usually the first element of a security system. While they are the most intuitive to understand and can be p…☆35Oct 18, 2014Updated 11 years ago
- ☆16Jul 31, 2022Updated 3 years ago
- The AI-powered CLI Assistant☆30May 24, 2024Updated 2 years ago
- Tesla/Nasdaq USD Prediction with Artificial Intelligence RNN Neural Network☆13Apr 11, 2022Updated 4 years ago
- An ETL pipeline that extracts weather and air quality data from public APIs, transforms the data into a clean, analyzable format, and loa…☆46Sep 21, 2024Updated last year
- Snowflake Data Engineering in Action☆41Oct 18, 2024Updated last year
- This is a messenger app made with React, styled with Material UI, and deployed with Firebase. 💭🖥️☆19Apr 10, 2022Updated 4 years ago
- MachineHack is an online platform for Machine Learning competitions. We host toughest business problems that can now find solutions in Ma…☆19Oct 25, 2023Updated 2 years ago
- ☆33Mar 2, 2026Updated 3 months ago
- This repository contains an Apache Flink application for real-time sales analytics built using Docker Compose to orchestrate the necessar…☆51Dec 4, 2023Updated 2 years ago