kohjiaxuan / Wikipedia-Article-Scraper
A complete Python text analytics package that allows users to search for a Wikipedia article, scrape it, conduct basic text analytics and integrate it into a data pipeline without writing excessive code.
☆18 Updated last year
Related projects:
- Semantically distinct key phrase extraction using Hilbert hashes. ☆46 Updated 2 years ago
- Tools for scraping YouTube video metadata (mostly for training AI on video titles) ☆38 Updated 3 years ago
- A bidirectional recurrent neural network model with attention mechanism for restoring missing punctuation in unsegmented text ☆35 Updated 4 years ago
- This Python module can be used to obtain antonyms, synonyms, hypernyms, hyponyms, homophones and definitions. ☆115 Updated 3 months ago
- TTS Client for Coqui TTS server ☆13 Updated last year
- Asent is a Python library for performing efficient and transparent sentiment analysis using spaCy. ☆114 Updated 5 months ago
- Convert epub file to txt ☆21 Updated last year
- ☆33 Updated this week
- An ongoing series of notebooks aimed at helping fellow NLP enthusiasts think about applying new tools and techniques to practical tasks. ☆18 Updated 3 years ago
- A collection of YouTube video transcripts: podcasts (Joe Rogan Experience, Tim Ferris, Jocko podcast, ..), lectures (YaleCourses, MIT l… ☆74 Updated last week
- A TextTiling-based algorithm for text segmentation (aka topic segmentation) that uses neural sentence encoders, as well as extractive sum… ☆41 Updated last year
- The objective of this project is to scrape a corpus of news articles from a set of web pages, pre-process the corpus, and then to apply u… ☆50 Updated 6 years ago
- Data sourcing and pre-processing for raplyrics.eu - A rap music lyrics generation project ☆62 Updated 2 months ago
- Lyric Generation using AI ☆12 Updated 5 years ago
- A repo with scripts to test and play around with Facebook's recent llama models! 🤗 ☆29 Updated last year
- Conditional lyrics generator -> pre-trained GPT2 model fine-tuned on lyrics with features dataset. ☆40 Updated 4 years ago
- RaKUn 2.0 - A fast keyword detection algorithm ☆61 Updated last month
- Python library for downloading closed captions (subtitles) from YouTube ☆57 Updated last year
- Ethical, legal, and effortless extraction of Reddit data in your database ☆46 Updated 3 months ago
- Coqui STT Model Manager - install, manage and try out Coqui STT models from the Model Zoo ☆24 Updated last year
- Automatic Text Summarization and Title Generation. ☆25 Updated 3 years ago
- Unreliable News Index (for Columbia Journalism Review) ☆55 Updated 2 years ago
- Code accompanying the submission "Structural Text Segmentation of Legal Documents" by Aumiller et al. ☆96 Updated last year
- Minute Meeting Bot ☆18 Updated last year
- Download subreddit comments ☆90 Updated 2 years ago
- Many Natural Language Processing tasks rely on sentence boundary detection (SBD). Although amazing libraries like spaCy provide state of … ☆61 Updated 4 years ago
- This script will dump YouTube video comments to a CSV from YouTube video links. Video links can be placed inside a variable or list or CS… ☆36 Updated 2 years ago
- ☆55 Updated last year
- Rhyme with AI ☆39 Updated 4 years ago
- Training & Implementation of chatbots leveraging GPT-like architecture with the aitextgen package to enable dynamic conversations. ☆46 Updated 2 years ago