kohjiaxuan / Wikipedia-Article-Scraper
A complete Python text analytics package that lets users search for a Wikipedia article, scrape it, run basic text analytics, and integrate it into a data pipeline without writing excessive code.
☆19 · Updated last year
Related projects
Alternatives and complementary repositories for Wikipedia-Article-Scraper
- A collection of YouTube video transcripts: podcasts (Joe Rogan Experience, Tim Ferriss, Jocko Podcast, ...), lectures (YaleCourses, MIT l… ☆75 · Updated this week
- Data sourcing and pre-processing for raplyrics.eu, a rap music lyrics generation project ☆61 · Updated 4 months ago
- Lyric generation using AI ☆12 · Updated 5 years ago
- A TextTiling-based algorithm for text segmentation (aka topic segmentation) that uses neural sentence encoders, as well as extractive sum… ☆42 · Updated last year
- Semantically distinct key phrase extraction using Hilbert hashes. ☆48 · Updated 2 years ago
- Downloads and parses the subtitle dataset from opensubtitles.org ☆15 · Updated 7 months ago
- Detects rhyme schemes in poetry or lyrics using LSTMs. ☆37 · Updated last year
- Conditional lyrics generator: a pre-trained GPT2 model fine-tuned on a lyrics-with-features dataset. ☆40 · Updated 4 years ago
- Ethical, legal, and effortless extraction of Reddit data into your database ☆56 · Updated last month
- Matrix-based news aggregation to explore media bias ☆20 · Updated 6 years ago
- Training and implementation of chatbots leveraging a GPT-like architecture with the aitextgen package to enable dynamic conversations. ☆46 · Updated 2 years ago
- Reproducing the "Writing with Transformer" demo, using aitextgen/FastAPI in the backend and Quill/React in the frontend ☆28 · Updated 3 years ago
- Dolores is a Python library designed to improve the developer experience when working with pretrained language models. Dolores provides p… ☆34 · Updated 4 years ago
- Download YouTube video descriptions and video comments without using the YouTube API. ☆151 · Updated 6 months ago
- This Python module can be used to obtain antonyms, synonyms, hypernyms, hyponyms, homophones and definitions. ☆121 · Updated 5 months ago
- A collection of preprocessed datasets and pretrained models for generating paraphrases. ☆29 · Updated 3 years ago
- Download subreddit comments ☆91 · Updated 2 years ago
- Neural network language model that generates text based on Lord of the Rings. Built with PyTorch. ☆37 · Updated 3 weeks ago
- This Python package can be used to systematically extract multiple data elements (e.g., title, keywords, text) from news sources around t… ☆32 · Updated last year
- Clusters news and extracts trending news stories ☆12 · Updated 3 years ago
- Unlock AI power with AudioInsightsGenerator! From audio to summaries, emotion analysis, idea generation, narratives, and content filterin… ☆19 · Updated last year
- A project about learning how to synchronize subtitles in movies using machine learning. ☆9 · Updated last year
- ☆27 · Updated 3 years ago
- Screenplay summarization using latent narrative structure ☆35 · Updated 2 years ago
- Application of OpenAI tools such as Whisper, DALL-E, and ChatGPT to generate album covers from audio ☆12 · Updated last year
- Many Natural Language Processing tasks rely on sentence boundary detection (SBD). Although amazing libraries like spaCy provide state of … ☆61 · Updated 4 years ago
- Rhyme with AI ☆41 · Updated 4 years ago
- A module to compute textual lexical richness (aka lexical diversity). ☆92 · Updated last year
- Leverage the power of the Google Natural Language API NLP to retrieve entity relationships from Wikipedia URLs or topics! Get interactive… ☆14 · Updated 3 years ago
- This Python script parses HTML movie scripts, such as the ones found on imsdb.com. ☆37 · Updated 2 years ago