Newspaper4k is a fork of the beloved Newspaper3k. It extracts articles, titles, and metadata from news websites.
★1,031 · Mar 1, 2026 · Updated this week
Alternatives and similar repositories for newspaper4k
Users who are interested in newspaper4k compare it to the libraries listed below.
- newspaper3k is a news, full-text, and article metadata extraction in Python 3. Advanced docs: · ★14,996 · Dec 6, 2025 · Updated 2 months ago
- This repository provides usage examples for the Python module Newspaper3k. · ★151 · Jan 2, 2024 · Updated 2 years ago
- news-please - an integrated web crawler and information extractor for news that just works · ★2,391 · Sep 21, 2025 · Updated 5 months ago
- A Happy and lightweight Python Package that Provides an API to search for articles on Google News and returns a JSON response. · ★942 · Jan 16, 2026 · Updated last month
- Python & Command-line tool to gather text and metadata on the Web: Crawling, scraping, extraction, output as CSV, JSON, HTML, MD, TXT, XM… · ★5,402 · Sep 12, 2025 · Updated 5 months ago
- Converts all website content into a text file for uploading to a custom GPT · ★38 · Jan 18, 2025 · Updated last year
- Text Behind Video. Enjoy, it is completely free. · ★31 · Feb 15, 2025 · Updated last year
- A CLI tool that bundles source code files into a single context for LLM prompts · ★21 · Jan 9, 2025 · Updated last year
- Brofile is a utility app that gives you better link-handling abilities (works on my machine) · ★46 · Jun 4, 2025 · Updated 9 months ago
- A fast and reliable Telegram channel scraper that fetches posts and exports them to JSON. · ★265 · Apr 15, 2025 · Updated 10 months ago
- Automatically extract documents from images and perspectively correct them with classic computer-vision algorithms. In maintenance mode. … · ★86 · Aug 24, 2025 · Updated 6 months ago
- dynamic YAML-driven URL shortener and command mapper with real-time config updates · ★20 · Aug 28, 2025 · Updated 6 months ago
- Web app for reading and analyzing exported WhatsApp chat files with a clean, intuitive interface and powerful search and analytics · ★36 · Dec 17, 2024 · Updated last year
- Python scraper based on AI · ★22,786 · Feb 24, 2026 · Updated last week
- Hector RAG is a modular RAG framework built on PostgreSQL, offering advanced retrieval methods and fusion techniques for AI-driven applic… · ★60 · Feb 24, 2025 · Updated last year
- A toolkit for Visual Cryptography and Random Grid schemes · ★90 · Mar 27, 2025 · Updated 11 months ago
- IntelliJ Plugin that offers an infinite canvas to organize code bookmarks · ★18 · May 31, 2025 · Updated 9 months ago
- Official code for the paper "GestSync: Determining who is speaking without a talking head" published at BMVC 2023 · ★46 · Sep 1, 2024 · Updated last year
- All-in-one AI framework for semantic search, LLM orchestration and language model workflows · ★12,247 · Feb 25, 2026 · Updated last week
- An automated discovery engine that monitors multiple platforms to capture high-value, time-sensitive opportunities in the digital gaming … · ★25 · Apr 9, 2025 · Updated 10 months ago
- Turn any document into ready-to-use AI image prompts. · ★54 · Sep 3, 2025 · Updated 6 months ago
- Swap your face in real-time · ★74 · Mar 25, 2025 · Updated 11 months ago
- Automates Telegram message digests using Claude AI for summaries and Replicate API for image generation, sending results to saved message… · ★54 · Mar 10, 2025 · Updated 11 months ago
- A Python 3 compatible version of goose http://goose3.readthedocs.io/en/latest/index.html · ★902 · Feb 6, 2026 · Updated 3 weeks ago
- rec-all: A Time Machine for the Everyday · ★17 · Dec 5, 2024 · Updated last year
- A very simple news crawler with a funny name · ★437 · Feb 25, 2026 · Updated last week
- fast python port of arc90's readability tool, updated to match latest readability.js! · ★2,890 · Jan 26, 2026 · Updated last month
- YouTube History Analyzer · ★37 · Jan 31, 2026 · Updated last month
- Cross-platform Search Engine and File Explorer for Multimedia · ★31 · Feb 16, 2025 · Updated last year
- Intelligent browser header & fingerprint generator · ★985 · Feb 26, 2026 · Updated last week
- Python SDK, Proxy Server (AI Gateway) to call 100+ LLM APIs in OpenAI (or native) format, with cost tracking, guardrails, loadbalancing a… · ★37,083 · Updated this week
- structured outputs for llms · ★12,468 · Feb 25, 2026 · Updated last week
- Terminal on Browser · ★28 · Jun 28, 2025 · Updated 8 months ago
- VirtualBox Web Control Panel is a lightweight HTTP server script providing a simple web interface to list, control, and interact with Vir… · ★25 · Apr 15, 2025 · Updated 10 months ago
- Article extraction benchmark: dataset and evaluation scripts · ★355 · Updated this week
- Browser extension to convert web pages to clean Markdown and copy to clipboard so you can feed it to your favorite LLM model as context wi… · ★285 · Feb 16, 2026 · Updated 2 weeks ago
- Convert any URL to an LLM-friendly input with a simple prefix https://r.jina.ai/ · ★9,928 · May 8, 2025 · Updated 9 months ago
- MIT license BRS-XSS is a modular Python CLI scanner for XSS vulnerabilities. Features context-aware payloads, WAF evasion, DOM analysis v… · ★34 · Jan 12, 2026 · Updated last month
- Crawlee: A web scraping and browser automation library for Python to build reliable crawlers. Extract data for AI, LLMs, RAG, or GPTs. Dow… · ★8,151 · Updated this week