A Directory of Online Newspaper Sources for 70+ Languages
★31 · Apr 15, 2021 · Updated 5 years ago
Alternatives and similar repositories for awesome-newspapers
Users that are interested in awesome-newspapers are comparing it to the libraries listed below.
- Corpus linguistics has never been so easy... ★25 · Feb 18, 2026 · Updated 3 months ago
- [WWW 2026] 🕸 GlotWeb: Web Indexing for Minority Languages ★17 · Apr 14, 2026 · Updated last month
- A Python scraping module that extracts text from articles found in RSS feeds. Uses SQLite as database. ★20 · Jul 5, 2024 · Updated last year
- A highly extensible platform for conversion and manipulation of linguistic data between an unbound set of formats. Pepper can be used st… ★25 · Jan 3, 2025 · Updated last year
- A tool for converting TMX files into bilingual corpora ★19 · Feb 4, 2020 · Updated 6 years ago
- ★33 · Oct 11, 2023 · Updated 2 years ago
- Small string compression using smaz compression algorithm. Fast, because it's in C. Supports Python 3+ ★13 · Oct 18, 2025 · Updated 7 months ago
- ع Command line tool that displays Arabic text in terminal. ★52 · Mar 17, 2026 · Updated 2 months ago
- The Wikinflection Corpus, from the paper "Wikinflection Corpus: A (Better) Multilingual, Morpheme-Annotated Inflectional Corpus" (Metheni… ★12 · Dec 15, 2023 · Updated 2 years ago
- Benson turns a list of URLs into mp3s of the contents of each web page - take control over your reading backlog! ★16 · Oct 30, 2024 · Updated last year
- AfroLID, a powerful neural toolkit for African language identification which covers 517 African languages. ★39 · Feb 5, 2026 · Updated 4 months ago
- Code for the paper "Modelling Latent Translations for Cross-Lingual Transfer" ★17 · Nov 22, 2021 · Updated 4 years ago
- A reddit bot that finds original publish dates on linked articles. ★10 · Nov 30, 2024 · Updated last year
- ★11 · Mar 19, 2023 · Updated 3 years ago
- Fact Enhanced News Generation ★12 · Jul 18, 2023 · Updated 2 years ago
- Backend, IA-specific tools for crawling and processing the scholarly web. Content ends up in https://fatcat.wiki ★28 · Jul 31, 2024 · Updated last year
- Extract networks of entities from journalistic reporting ★49 · Jul 17, 2023 · Updated 2 years ago
- Library for programmatically creating and rendering barcodes ★25 · Apr 7, 2025 · Updated last year
- Go through the list of accepted papers for ICLR in terminal and add them to your reading list. ★13 · Jan 30, 2021 · Updated 5 years ago
- Crawler based on a modified browser to detect online tracking. ★11 · Jul 19, 2023 · Updated 2 years ago
- Scraper for German democracy documents ★45 · Sep 12, 2023 · Updated 2 years ago
- DEPRECATED version of SoundFile ★14 · May 26, 2020 · Updated 6 years ago
- Generate large textual corpora for almost any language by crawling the web ★13 · Feb 17, 2024 · Updated 2 years ago
- [AAAI 2021] "ROSITA: Refined BERT cOmpreSsion with InTegrAted techniques", Yuanxin Liu, Zheng Lin, Fengcheng Yuan ★14 · Oct 18, 2022 · Updated 3 years ago
- BUB : Book Uploader Bot ★20 · Feb 8, 2016 · Updated 10 years ago
- A repository of sample code designed to help you Tweet random dog facts ★15 · Sep 23, 2022 · Updated 3 years ago
- Crawling engine that crawls a set of top-level domains looking for documents in a list of languages ★11 · Feb 6, 2024 · Updated 2 years ago
- Code repository accompanying the CHI 2021 Paper titled "Adapting User Interfaces with Model-based Reinforcement Learning" ★17 · Oct 18, 2021 · Updated 4 years ago
- A web app for translating from one language to another. Almost all languages are available. App also generates an audio file of the transla… ★11 · Feb 19, 2026 · Updated 3 months ago
- word4num is a versatile tool for encoding numbers into words, applicable for geolocation, phone numbers, postcodes, IPv4 addresses, and m… ★12 · Oct 9, 2024 · Updated last year
- Building and Using A Seed Corpus for the Human Language Project ★11 · Feb 9, 2018 · Updated 8 years ago
- Agile reading group that works ★13 · Feb 2, 2022 · Updated 4 years ago
- scraping and querying documents for LLMs ★24 · Oct 6, 2025 · Updated 8 months ago
- More Information about Features, Deliverables and Publications @ ★11 · May 17, 2016 · Updated 10 years ago
- Curated list of awesome datasets for various table understanding tasks ★19 · Sep 5, 2025 · Updated 9 months ago
- SQL and Bash scripts to import the official Stack Overflow data dump and the SOTorrent data set, to retrieve Stack Overflow references fro… ★15 · Sep 14, 2025 · Updated 8 months ago
- A project to download and process gazettes (Govt Notifications) from India ★30 · Updated this week
- Download an entire website's resources from a HAR file ★15 · Jan 16, 2017 · Updated 9 years ago
- Searching in-memory corpus with Corpus Query Language (CQL) ★19 · Dec 2, 2024 · Updated last year