kohjiaxuan/Wikipedia-Article-Scraper

A complete Python text analytics package that lets users search for a Wikipedia article, scrape it, run basic text analytics, and integrate it into a data pipeline without writing excessive code.

☆21

Alternatives and similar repositories for Wikipedia-Article-Scraper

Users who are interested in Wikipedia-Article-Scraper are comparing it to the libraries listed below.

Yustira / crowd-counting
☆20 · Jun 22, 2020 · Updated 6 years ago
aparnadutta / code-mixed-lid
Word-level language identification for Bangla-English code-mixed social media data, using a BiLSTM with subword embeddings.
☆10 · Aug 13, 2023 · Updated 2 years ago
Gilbertly / cf-next-hono
Nextjs starter template with Honojs, deployed to Cloudflare.
☆10 · Apr 10, 2024 · Updated 2 years ago
QuwsarOhi / BanglaWriting
BanglaWriting: A multi-purpose offline Bangla handwriting dataset
☆14 · Nov 18, 2020 · Updated 5 years ago
jeremeamia / sunshinephp-guzzle-examples
Example code for my SunshinePHP Guzzle Tutorial
☆10 · Feb 5, 2015 · Updated 11 years ago
alperencantez / GoogleMapsScraper
⛏️ Scrapes data from Google Maps businesses using Playwright by simulating user search. A free alternative to Google's map API.
☆15 · Jan 17, 2024 · Updated 2 years ago
doldsimo / solid-quiz
Simple quiz component for solidjs and solid-start.
☆11 · Mar 13, 2026 · Updated 4 months ago
haroldo-ok / BlocklyVN8bit
This is a mashup between BlocklyVN32X and 8Bit-Unity. It allows you to make Visual Novels for classic 8bit computers and consoles, using …
☆16 · Sep 24, 2024 · Updated last year
ncesar / puppeteer-recaptcha-whisper
Solving recaptcha using Puppeteer and OpenAI Whisper Model
☆13 · Jan 8, 2025 · Updated last year
NorseByte / opensource-tracker
Open-source scraper for the analysis of social networks. Creates nodes with edges for you to visualize in editors like Gephi.
☆11 · Dec 2, 2025 · Updated 7 months ago
tekkamanendless / umactually
This repo contains the stats for the College Humor show "Um, Actually..." as well as a simple HTML page to view those stats. The page is…
☆11 · Jul 17, 2026 · Updated last week
Xonshiz / SolidFiles-Downloader
This little Python script downloads the content from solidfiles. The reason I came up with this is 'SolidFiles using too much Pop Ups'. J…
☆14 · Dec 20, 2018 · Updated 7 years ago
piEsposito / transformers-low-code-experiments
Low-code pre-built pipelines for experiments with huggingface/transformers for Data Scientists in a rush.
☆16 · Oct 14, 2020 · Updated 5 years ago
gantrol / MarkdownCanDo
☆11 · Mar 26, 2026 · Updated 4 months ago
mrWh1te / Botmation
A simple TypeScript framework for declaratively composing bots with Puppeteer
☆19 · May 26, 2026 · Updated 2 months ago
globalwordnet / gwadoc
Documentation for things like relations and parts of speech used by wordnets
☆15 · Jun 18, 2024 · Updated 2 years ago
piyushmakhija5 / hinglishNorm
A Hindi-English Dataset for Text Normalization
☆18 · Jan 3, 2022 · Updated 4 years ago
Autoparallel / learner
Making learning sh*t less annoying
☆43 · Feb 1, 2025 · Updated last year
unsplash / tinplate
⭐️ TinEye API wrapper
☆14 · May 15, 2025 · Updated last year
TinyStuff / TinyHttpClientPool
An HttpClient manager that allows cool stuff to happen
☆11 · Jan 2, 2018 · Updated 8 years ago
yoheinakajima / captainaction
Collections of Actions for Custom GPTs (some created by Captain Action)
☆11 · Jan 7, 2024 · Updated 2 years ago
twardoch / totw-fonts
TOTW (Top of the WOFFs): a collection of open-source OpenType fonts curated by Adam Twardoch
☆12 · Jun 29, 2025 · Updated last year
jenstornell / tinyDrawer.js
Really small mobile menu navigation sliding in from the left
☆15 · Jun 22, 2019 · Updated 7 years ago
Divoolej / picobu
The missing PICO-8 source code bundler. PicoBu(ild)! 🦀
☆12 · Aug 3, 2019 · Updated 6 years ago
sr1jan / ytGREP
A simple Chrome extension to search for words or sentences used in YouTube videos.
☆13 · Feb 14, 2022 · Updated 4 years ago
swyxio / ai-engineer
AI Engineer website
☆10 · Jun 22, 2023 · Updated 3 years ago
arijitx / BanglaNLP
Resources and tools for Bangla language computation
☆14 · Feb 20, 2026 · Updated 5 months ago
Correia-jpv / fucking-awesome-ios-ui
A curated list of awesome iOS UI/UX libraries. With repository stars ⭐ and forks 🍴
☆12 · Updated this week
Wojtab / minigpt-4-pipeline
☆16 · Jun 6, 2023 · Updated 3 years ago
SilverCrow2323 / Renpy-Vita-Portings
List of possible Ren'Py games (mostly VN) to be ported with renpy-vita.
☆13 · Apr 8, 2022 · Updated 4 years ago
jonocairns / aubri
Plex for audiobooks
☆11 · Dec 5, 2022 · Updated 3 years ago
NoxMoon / inside_beauty
Dig into the ingredients in beauty products; Springboard capstone project 1
☆21 · Feb 20, 2019 · Updated 7 years ago
christophschuhmann / 4MC-4M-Image-Text-Pairs-with-CLIP-embeddings
I have created a dataset of Image-Text-Pairs by using the cosine similarity of the CLIP embeddings of the image & its caption derived f…
☆17 · Apr 22, 2021 · Updated 5 years ago
Mouez-Yazidi / WhisperMesh
WhisperMesh is an advanced chatbot that integrates voice and text interactions, delivering personalized responses through LLM models and …
☆16 · Apr 23, 2025 · Updated last year
aniket-work / AI_Powered_Dev_Search_Engine
AI_Powered_Dev_Search_Engine
☆12 · Mar 10, 2024 · Updated 2 years ago
LYNXware / LYNXapp__version_3
☆11 · Apr 28, 2024 · Updated 2 years ago
imranulashrafi / banner
PyTorch implementation for paper 'BANNER: A Cost-Sensitive Contextualized Model for Bangla Named Entity Recognition'
☆13 · Apr 15, 2020 · Updated 6 years ago
clouedoc / puppeteer-extra-plugin-session
Session persistence plugin for puppeteer-extra
☆21 · Sep 24, 2022 · Updated 3 years ago
Curly-Mo / swingify
Make any song swing
☆10 · Feb 22, 2021 · Updated 5 years ago