yashwordlife / SportsDataAnalysis
A Hadoop MapReduce application that retrieves sports-related data and articles from sources such as the NY Times, Common Crawl, and Twitter, and creates a word cloud of the most frequently occurring words. Python scripts gather the data and process it on a Hadoop MR infrastructure. Angular with D3.js is used to create an interactive web app …
☆12 · Updated 5 years ago
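The word-frequency step described above can be illustrated with a minimal Python sketch of the map and reduce phases. This is not the repository's code: the function names (`mapper`, `reducer`, `top_words`), the tokenizing regex, and the `STOPWORDS` set are all assumptions made for illustration, and the sort that Hadoop's shuffle phase would perform is simulated locally with `sorted`.

```python
import re
from collections import Counter
from itertools import groupby

# Hypothetical stopword list; a real pipeline would likely use a larger one.
STOPWORDS = {"the", "a", "an", "and", "of", "to", "in", "is", "for", "on"}


def mapper(lines):
    """Map phase: emit a (word, 1) pair for every non-stopword token."""
    for line in lines:
        for word in re.findall(r"[a-z']+", line.lower()):
            if word not in STOPWORDS:
                yield word, 1


def reducer(pairs):
    """Reduce phase: sum the counts for each word.

    The input must be sorted by word, which is what the shuffle
    phase provides between map and reduce.
    """
    for word, group in groupby(pairs, key=lambda kv: kv[0]):
        yield word, sum(count for _, count in group)


def top_words(lines, n=10):
    """Run map, a simulated shuffle (sorted), and reduce; return the n most frequent words."""
    counts = dict(reducer(sorted(mapper(lines))))
    return Counter(counts).most_common(n)
```

Calling `top_words(["The team won the game", "Game on: the team scores"], 2)` returns the two most frequent words with their counts, which is the kind of input a word cloud renderer such as D3.js would need.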
Alternatives and similar repositories for SportsDataAnalysis:
Users who are interested in SportsDataAnalysis are comparing it to the libraries listed below.
- Run a Streamlit web application, test it, and deploy it to a cloud service (GCP, AWS, Heroku) ☆14 · Updated 2 years ago
- Build a small, 3-domain internet using GitHub Pages and Wikipedia, and construct a crawler to crawl, render, and index it. ☆73 · Updated 2 years ago
- ☆16 · Updated 4 years ago
- A Python package that provides a custom Streamlit connection to query data from Weaviate, the AI-native vector database ☆54 · Updated 7 months ago
- Streamlit application to keep GPT-3 experimentation sane ☆23 · Updated 3 years ago
- LinkRun - Data engineering project done in 3 weeks during the Insight fellowship ☆38 · Updated 4 years ago
- A series of notebooks demonstrating how to build simple NLP web apps with Gradio and Hugging Face transformers ☆45 · Updated 3 years ago
- [WIP] Behold, semantic-search, built over sentence-transformers to make it easy for search engineers to evaluate, optimise and deploy mod… ☆15 · Updated last year
- Text summarization algorithm for the Capstone Project at Springboard code bootcamp ☆54 · Updated 2 years ago
- A sample repo using the Apriori and FP-Growth algorithms to produce categories for queries, and BERT for PoP change visualization. ☆39 · Updated 2 years ago
- Production machine learning pipeline for text classification with fastText ☆32 · Updated 3 years ago
- Expose a Top2Vec model with a REST API. ☆89 · Updated 2 years ago
- ☆17 · Updated 3 years ago
- App store search example, using Jina as backend and Streamlit as frontend ☆21 · Updated 2 years ago
- Python- and Selenium-based (mobile) Facebook groups scraper, independent of obfuscated CSS selectors. ☆11 · Updated 4 years ago
- ☆11 · Updated 3 years ago
- Extract social media links and account names from websites. ☆38 · Updated 4 years ago
- Source real estate prices from the Common Crawl. ☆27 · Updated 6 years ago
- URL article text summarizer using web crawling and NLP (written in Python) ☆47 · Updated 4 years ago
- Companion repo for the book The Applied ML Field Manual, Prithiviraj Damodaran ☆12 · Updated 2 years ago
- Package that returns a company embedding given a company name ☆45 · Updated 4 years ago
- ☆119 · Updated 2 years ago
- Streamlit-based web app for AI text generation based on GPT-2 models from the Hugging Face Model Hub, using the Python library aitextgen ☆27 · Updated 4 years ago
- Exploring Common Crawl using Python and DynamoDB ☆33 · Updated 7 years ago
- Various Jupyter notebooks about Common Crawl data ☆51 · Updated last month
- Supervised instruction finetuning for LLMs with HF Trainer and DeepSpeed ☆34 · Updated last year
- TensorFlow Serving + Streamlit! ☆22 · Updated 3 years ago
- Index Common Crawl archives in tabular format ☆112 · Updated 2 weeks ago
- MirrorDataGenerator is a Python tool that generates synthetic data based on user-specified causal relations among features in the data. I… ☆21 · Updated 2 years ago
- ☆22 · Updated 3 years ago