Newspaper4k, a fork of the beloved Newspaper3k. Extraction of articles, titles, and metadata from news websites.
★1,109 · Apr 30, 2026 · Updated last month
Alternatives and similar repositories for newspaper4k
Users that are interested in newspaper4k are comparing it to the libraries listed below.
- This repository provides usage examples for the Python module Newspaper3k. ★152 · Jan 2, 2024 · Updated 2 years ago
- newspaper3k is a news, full-text, and article metadata extraction in Python 3. ★15,077 · May 13, 2026 · Updated last month
- news-please - an integrated web crawler and information extractor for news that just works ★2,458 · Apr 14, 2026 · Updated 2 months ago
- A Happy and lightweight Python Package that Provides an API to search for articles on Google News and returns a JSON response. ★971 · Jan 16, 2026 · Updated 4 months ago
- Python & Command-line tool to gather text and metadata on the Web: Crawling, scraping, extraction, output as CSV, JSON, HTML, MD, TXT, XM… ★6,087 · Jun 7, 2026 · Updated last week
- Converts all website content into a text file for uploading to a custom GPT ★38 · Jan 18, 2025 · Updated last year
- A CLI tool that bundles source code files into a single context for LLM prompts ★21 · Jan 9, 2025 · Updated last year
- fast python port of arc90's readability tool, updated to match latest readability.js! ★2,895 · Jan 26, 2026 · Updated 4 months ago
- Text Behind Video. Enjoy it is completely free. ★31 · Feb 15, 2025 · Updated last year
- Simulate human behavior with mass LLMs ★28 · Oct 23, 2024 · Updated last year
- Brofile is a utility app which grants you with a better link handling abilities (works on my machine) ★46 · Jun 4, 2025 · Updated last year
- A fast and reliable Telegram channel scraper that fetches posts and exports them to JSON. ★274 · Apr 15, 2025 · Updated last year
- A very simple news crawler with a funny name ★462 · Jun 6, 2026 · Updated last week
- A Python 3 compatible version of goose http://goose3.readthedocs.io/en/latest/index.html ★910 · Updated this week
- This Python package can be used to systematically extract multiple data elements (e.g., title, keywords, text) from news sources around t… ★34 · Mar 14, 2023 · Updated 3 years ago
- Python scraper based on AI ★27,062 · Updated this week
- All-in-one AI framework for semantic search, LLM orchestration and language model workflows ★12,642 · Updated this week
- IntelliJ Plugin that offers an infinite canvas to organize code bookmarks ★18 · May 31, 2025 · Updated last year
- Turn any document into ready-to-use AI image prompts. ★53 · Sep 3, 2025 · Updated 9 months ago
- Hector RAG is a modular RAG framework built on PostgreSQL, offering advanced retrieval methods and fusion techniques for AI-driven applic… ★60 · Feb 24, 2025 · Updated last year
- A browser-based tool for comparing and combining before/after images. No server needed, runs entirely in your browser. ★18 · Jan 13, 2025 · Updated last year
- Article extraction benchmark: dataset and evaluation scripts ★373 · May 29, 2026 · Updated 2 weeks ago
- gaming market monitor. discover time-sensitive opportunities across multiple platforms. ★23 · Apr 9, 2025 · Updated last year
- Automatically extract documents from images and perspectively correct them with classic computer-vision algorithms. In maintenance mode. … ★87 · Aug 24, 2025 · Updated 9 months ago
- Automates Telegram message digests using Claude AI for summaries and Replicate API for image generation, sending results to saved message… ★55 · May 13, 2026 · Updated last month
- Intelligent browser header & fingerprint generator ★1,127 · Feb 26, 2026 · Updated 3 months ago
- Web app for reading and analyzing exported WhatsApp chat files with a clean, intuitive interface and powerful search and analytics ★38 · Dec 17, 2024 · Updated last year
- structured outputs for llms ★13,135 · Updated this week
- Repurpose your YouTube videos by converting them into blog posts. ★175 · May 1, 2024 · Updated 2 years ago
- Python SDK, Proxy Server (AI Gateway) to call 100+ LLM APIs in OpenAI (or native) format, with cost tracking, guardrails, loadbalancing a… ★50,129 · Updated this week
- A fork of Dragnet that also extract author, headline, date, keywords from context, as well as built in metadata extraction all in one pac… ★298 · May 19, 2025 · Updated last year
- partdec is a command-line utility for multipart downloading and file splitting. Download a file in parts simultaneously. ★56 · Sep 26, 2025 · Updated 8 months ago
- dynamic YAML-driven URL shortener and command mapper with real-time config updates ★20 · Aug 28, 2025 · Updated 9 months ago
- Script for GoogleNews ★379 · Mar 20, 2026 · Updated 2 months ago
- Crawl4AI: Open-source LLM Friendly Web Crawler & Scraper. Don't be shy, join here: https://discord.gg/jP8KfhDhyN ★68,181 · Jun 4, 2026 · Updated last week
- A standalone version of the readability lib ★11,252 · Jan 21, 2026 · Updated 4 months ago
- VirtualBox Web Control Panel is a lightweight HTTP server script providing a simple web interface to list, control, and interact with Vir… ★25 · Apr 15, 2025 · Updated last year
- Convert any URL to an LLM-friendly input with a simple prefix https://r.jina.ai/ ★11,175 · May 22, 2026 · Updated 3 weeks ago
- Official code for the paper "GestSync: Determining who is speaking without a talking head" published at BMVC 2023 ★48 · Sep 1, 2024 · Updated last year