rushitjasani / Wikipedia-Search-Engine
A complete search engine built on top of a 75 GB Wikipedia corpus, with subsecond search latency. Results are Wikipedia pages ranked by TF-IDF relevance to the given search terms. From optimized code to the K-way merge sort algorithm, this project addresses latency, indexing, and big data challenges.
☆19Updated 5 years ago
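The description above mentions two techniques: a K-way merge and TF-IDF ranking. The repository's own code is not shown on this page, so the following is only a minimal sketch of how the two ideas might fit together. The function names (`merge_partial_indexes`, `rank`), the in-memory data layout, and the log-scaled term frequency weighting are all assumptions made for illustration, not the project's actual implementation.

```python
import heapq
import math
from itertools import groupby
from operator import itemgetter


def merge_partial_indexes(*partials):
    """K-way merge of partial inverted indexes (hypothetical layout).

    Each partial index is a list of (term, postings) pairs sorted by term,
    where postings maps doc_id -> term frequency.
    """
    merged = {}
    # heapq.merge lazily merges k already-sorted inputs using a small heap.
    stream = heapq.merge(*partials, key=itemgetter(0))
    # Equal terms from different partials arrive adjacent, so group them.
    for term, group in groupby(stream, key=itemgetter(0)):
        postings = {}
        for _, partial_postings in group:
            for doc_id, tf in partial_postings.items():
                postings[doc_id] = postings.get(doc_id, 0) + tf
        merged[term] = postings
    return merged


def rank(query_terms, index, total_docs):
    """Return doc ids ordered by summed TF-IDF score (assumed weighting)."""
    scores = {}
    for term in query_terms:
        postings = index.get(term, {})
        if not postings:
            continue
        idf = math.log(total_docs / len(postings))
        for doc_id, tf in postings.items():
            # Log-scaled term frequency times inverse document frequency.
            scores[doc_id] = scores.get(doc_id, 0.0) + (1 + math.log(tf)) * idf
    # Highest score first; ties broken by doc id for a stable order.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Toy partial indexes standing in for sorted on-disk index chunks.
chunk_a = [("engine", {1: 2}), ("search", {1: 1, 2: 3})]
chunk_b = [("engine", {3: 1}), ("wiki", {3: 4})]
chunk_c = [("search", {4: 1}), ("wiki", {4: 2})]

index = merge_partial_indexes(chunk_a, chunk_b, chunk_c)
results = rank(["search", "engine"], index, total_docs=4)
print(results)
```

In a corpus of this size the partial indexes would presumably live in sorted files on disk rather than in lists; `heapq.merge` accepts any sorted iterables, so the same pattern can be applied to file readers that yield one entry at a time.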
Alternatives and similar repositories for Wikipedia-Search-Engine
Users that are interested in Wikipedia-Search-Engine are comparing it to the libraries listed below
- This project's aim was to implement various Recommendation Models on Hadoop Framework and to compare their performance.☆25Updated 7 years ago
- Data warehouse implementation for an e-commerce website “Infibeam” that sells digital and consumer electronics.☆19Updated 7 years ago
- Playground for pyspark (RDDs, DStreams) and Apache Airflow. Based on the example of parsing (including incorrectly formatted strings) web …☆18Updated 3 years ago
- Big Data webapp using Chicago street congestion, crashes, red light violations, and speed camera violations☆40Updated 4 years ago
- Cyber Security for Big Data and IoT using Machine Learning☆14Updated 6 years ago
- ☆56Updated last year
- This repository contains Data Analysis on Black Friday Sales Data using various Regression ML algorithms☆17Updated 2 months ago
- Credit Card Fraud Detection App built with Streamlit, FastAPI and Docker.☆44Updated 2 years ago
- 4 different Big Datasets joined to get a single table for final data analysis. Fraud Detection by taking into consideration different key feat…☆46Updated 4 years ago
- Multi-class classification model for predicting the types of crimes in Toronto☆14Updated last year
- ☆44Updated last year
- ☆33Updated last year
- Project - Data Processing and Analysis in Python Course☆41Updated 6 years ago
- A content based movie recommender system using cosine similarity☆174Updated 11 months ago
- ☆22Updated last year
- A Big Data project leveraging AWS services and Apache frameworks to identify and visualize fraudulent credit card transaction patterns, p…☆16Updated last year
- This repo contains Data Science code snippets☆82Updated 8 months ago
- Hi Everyone Glad to see your interest in this repo and welcome, we will be working on end to end data science project which is "Loan Pred…☆43Updated 2 years ago
- End to End Machine Learning Projects☆17Updated last year
- ☆12Updated 2 years ago
- Bank customers churn dashboard with predictions from several machine learning models.☆53Updated last year
- ☆32Updated 8 months ago
- Big data projects implemented by Maniram yadav☆51Updated 7 years ago
- The notebook files contain tutorials for web scraping☆24Updated last year
- Machine Learning Web App Built Using Flask Deployed on Heroku☆9Updated 3 years ago
- This documentation is like a quick snapshot of my project in the data field, showing off my skills and know-how in this area.☆53Updated last year
- ☆13Updated 2 years ago
- ☆93Updated last year
- Main Objective: The main agenda of this project is: Perform extensive Exploratory Data Analys…☆33Updated 3 years ago
- A PowerBI dashboard to analyze raw sales data from a multinational pharmaceutical manufacturing company and get insights into the perform…☆8Updated last year