A complete search engine built on a 75 GB Wikipedia corpus, with sub-second search latency. Results are wiki pages ranked by TF-IDF relevance to the given search terms. From optimized code to a K-way merge sort algorithm, this project addresses latency, indexing, and big data challenges.
☆19 · Oct 16, 2019 · Updated 6 years ago
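The description mentions two techniques, a K-way merge and TF-IDF ranking. The following is a minimal sketch of how they might fit together, not the repository's actual code. It assumes partial index runs are already sorted lists of `(term, doc_id, term_frequency)` tuples; the function names `merge_runs` and `rank` and the plain `tf * log(N / df)` weighting are hypothetical choices made for illustration.

```python
import heapq
import math
from itertools import groupby
from operator import itemgetter


def merge_runs(runs):
    """K-way merge of sorted (term, doc_id, tf) runs into term -> postings.

    Each run is assumed to be sorted already; heapq.merge then yields
    all tuples in globally sorted order without loading every run at once.
    """
    merged = heapq.merge(*runs)
    index = {}
    # Consecutive tuples with the same term form that term's posting list.
    for term, group in groupby(merged, key=itemgetter(0)):
        index[term] = [(doc_id, tf) for _, doc_id, tf in group]
    return index


def rank(index, query_terms, num_docs):
    """Return doc ids ordered by summed tf * idf over the query terms."""
    scores = {}
    for term in query_terms:
        postings = index.get(term, [])
        if not postings:
            continue  # term does not occur in the corpus
        idf = math.log(num_docs / len(postings))
        for doc_id, tf in postings:
            scores[doc_id] = scores.get(doc_id, 0.0) + tf * idf
    # Highest score first; ties broken by doc id for a stable order.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Three small, individually sorted runs standing in for on-disk index files.
runs = [
    [("apple", 1, 2), ("wiki", 1, 1)],
    [("apple", 2, 1), ("search", 2, 3)],
    [("search", 3, 1), ("wiki", 3, 4)],
]
index = merge_runs(runs)
ranking = rank(index, ["apple", "search"], num_docs=3)
```

In a real 75 GB setting the runs would presumably be file iterators rather than in-memory lists, and the merged index would be written back to disk; `heapq.merge` accepts any sorted iterables, so the same shape applies.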
Alternatives and similar repositories for Wikipedia-Search-Engine
Users that are interested in Wikipedia-Search-Engine are comparing it to the libraries listed below.
- Playground for PySpark (RDDs, DStreams) and Apache Airflow. Based on the example of parsing (including incorrectly formatted strings) web … ☆18 · Feb 21, 2022 · Updated 4 years ago
- Data warehouse implementation for an e-commerce website “Infibeam” that sells digital and consumer electronics. ☆22 · Jan 28, 2018 · Updated 8 years ago
- Big Data webapp using Chicago street congestion, crashes, red light violations, and speed camera violations ☆44 · Jan 9, 2021 · Updated 5 years ago
- An application that helps a traveler visiting a city explore and navigate places of their choice in a very simple way without ty… ☆13 · Dec 26, 2021 · Updated 4 years ago
- A BigQuery adapter for Harlequin, a SQL IDE for the terminal. ☆10 · Jan 19, 2025 · Updated last year
- Python Essentials for AWS Cloud Developers, published by Packt. ☆11 · Apr 27, 2023 · Updated 2 years ago
- This is the LinkedIn Learning repository for Level Up: Python Data Acquisitions, Prep, & EDA. ☆15 · Mar 4, 2025 · Updated last year
- A banking system that facilitates transfer of money from one user to another. Created using HTML, CSS, JavaScript and PHP along with mySQ… ☆10 · Jan 25, 2024 · Updated 2 years ago
- Project submission for data engineering zoomcamp 2023 - https://github.com/DataTalksClub/data-engineering-zoomcamp ☆10 · Apr 27, 2023 · Updated 2 years ago
- ☆15 · May 8, 2025 · Updated 11 months ago
- ☆17 · Nov 22, 2022 · Updated 3 years ago
- End-to-end data engineering pipeline with various technologies to ingest real time data. ☆25 · Nov 3, 2023 · Updated 2 years ago
- Streamlit Dashboard over Superstore Data stored in Postgres Docker container. With SQLAlchemy + Plotly Express ☆12 · Oct 16, 2024 · Updated last year
- This is a guided certification project, as part of the Data Science for Social Good initiative ☆18 · Mar 9, 2020 · Updated 6 years ago
- This repository contains code to build an MVP search engine with a Google-like interface. ☆17 · Mar 25, 2026 · Updated 3 weeks ago
- This repo is for the LinkedIn Learning course: Testing Python Data Science Code ☆20 · Sep 26, 2025 · Updated 6 months ago
- ☆32 · Mar 7, 2018 · Updated 8 years ago
- Case Studies and Projects in Machine Learning/EDA/DL ☆24 · Jun 18, 2024 · Updated last year
- ☆55 · Aug 19, 2025 · Updated 7 months ago
- ☆25 · Mar 4, 2025 · Updated last year
- Accident detection system for traffic footage. Using computer vision and ML to detect and analyze accidents in CCTV footage in real-tim… ☆20 · Jul 14, 2023 · Updated 2 years ago
- ☆40 · Jan 4, 2026 · Updated 3 months ago
- my zsh configuration ☆13 · Jun 26, 2025 · Updated 9 months ago
- This repository contains data analysis on Black Friday sales data using various regression ML algorithms ☆21 · Apr 8, 2025 · Updated last year
- ☆17 · Feb 11, 2022 · Updated 4 years ago
- RealTime StockStream is a streamlined simulation system for processing live stock market data. It uses Apache Kafka for data input, Apac… ☆31 · Feb 18, 2025 · Updated last year
- ☆30 · Jan 17, 2023 · Updated 3 years ago
- Data visualisations in Power BI ☆31 · Nov 14, 2021 · Updated 4 years ago
- This repository contains the source code for MongoDB Crash Course created by CodeWithHarry ☆77 · Nov 5, 2025 · Updated 5 months ago
- A simple Dash and Plotly dashboard to review and compare federal economic data ☆13 · Feb 1, 2022 · Updated 4 years ago
- Implementation of a system capable of encryption and decryption of multimedia data (Text, Images, Videos, Audio etc.) using a hybrid mode… ☆22 · Feb 7, 2024 · Updated 2 years ago
- MCP server for managing and serving analysis prompt templates ☆21 · Dec 13, 2024 · Updated last year
- A project to detect accidents and send notifications to hospitals whenever an accident happens. ☆20 · Mar 22, 2023 · Updated 3 years ago
- ☆15 · Jul 31, 2022 · Updated 3 years ago
- A program written in C++ that emulates a bogus CPU ☆22 · May 6, 2024 · Updated last year
- This repo is for the LinkedIn Learning course: End-to-End Data Engineering Project ☆31 · Nov 9, 2023 · Updated 2 years ago
- Sign language translation model for the app Look & Tell https://github.com/khooinguyeen/LookandTell-OfficialApp ☆26 · Apr 16, 2023 · Updated 3 years ago
- Mastering Windows Presentation Foundation, Second Edition, published by Packt ☆38 · Jan 18, 2023 · Updated 3 years ago
- ☆25 · Apr 23, 2022 · Updated 3 years ago