gambolputty/newscorpus

A Python scraping module that extracts text from articles found in RSS feeds. Uses SQLite as its database.

☆20

Alternatives and similar repositories for newscorpus

Users who are interested in newscorpus are comparing it to the libraries listed below.

liao961120 / concordancer
Searching in-memory corpus with Corpus Query Language (CQL)
☆19 · Dec 2, 2024 · Updated last year
datadotworld / foia-app
R Shiny App created to predict the success rate of Freedom of Information Act requests.
☆16 · Dec 11, 2017 · Updated 8 years ago
okfde / froide-govplan
Basis of FragDenStaat.de's „Koalitionstracker“
☆15 · Jul 14, 2025 · Updated last year
okfde / wahldaten
Repo for all kinds of election data
☆42 · Jul 7, 2017 · Updated 9 years ago
wcmc-its / vivodashboard
VIVO Dashboard - a semantic application for visualizing publication data
☆21 · Apr 5, 2019 · Updated 7 years ago
opensanctions / storyweb
Extract networks of entities from journalistic reporting
☆49 · Jul 17, 2023 · Updated 3 years ago
andreburgaud / robotspy
Alternative robots parser module for Python
☆22 · Jun 19, 2026 · Updated last month
derhuerst / vbb-modules
List of JavaScript modules for Berlin & Brandenburg public transport.
☆70 · Oct 11, 2024 · Updated last year
okfde / api.offenegesetze.de
⚙️ The backend for OffeneGesetze.de
☆25 · Jan 11, 2024 · Updated 2 years ago
dataresearchcenter / investigraph
ETL pipeline, graphical explorer and general toolbox for investigations with follow the money data
☆28 · Jul 15, 2025 · Updated last year
KorAP / Koral
Translation of query languages to serialized KoralQuery protocol
☆15 · Updated this week
ausgerechnet / cwb-ccc
Python wrapper for the CWB to extract concordances and score frequency lists
☆22 · May 11, 2026 · Updated 2 months ago
hizkifw / bong
ChatGPT with access to the internet
☆25 · Jun 16, 2023 · Updated 3 years ago
KorAP / Tokenizer-Evaluation
Benchmark scripts for comparing different tokenizers and sentence segmenters of German
☆12 · Feb 27, 2023 · Updated 3 years ago
CommonsDev / glutton
A Linked Data Platform (LDP) Server in Python
☆13 · Apr 24, 2015 · Updated 11 years ago
oldm / OldMan
Python OLDM (Object Linked Data Mapper)
☆15 · Jan 5, 2016 · Updated 10 years ago
dragnet-org / dragnet_data
code and data used to build a training dataset for dragnet models
☆10 · Nov 29, 2020 · Updated 5 years ago
flopp / unicode
A Flask-Based Web-App for Exploring Unicode
☆11 · Jan 31, 2024 · Updated 2 years ago
originell / smaz-py3
Small string compression using smaz compression algorithm. Fast, because it's in C. Supports Python 3+
☆13 · Oct 18, 2025 · Updated 9 months ago
alea-institute / nupunkt
Next-generation Punkt sentence boundary detection with zero dependencies
☆32 · Nov 18, 2025 · Updated 8 months ago
hiredscorelabs / seqtolang
Multi-Language Identification
☆28 · Jul 25, 2024 · Updated 2 years ago
SOLOPlugins-PocketMine / SMarket
An item frame or sign shop.
☆14 · Jan 21, 2019 · Updated 7 years ago
lenakmeth / Wikinflection-Corpus
The Wikinflection Corpus, from the paper "Wikinflection Corpus: A (Better) Multilingual, Morpheme-Annotated Inflectional Corpus" (Metheni…
☆12 · Dec 15, 2023 · Updated 2 years ago
timoteostewart / benson
Benson turns a list of URLs into mp3s of the contents of each web page - take control over your reading backlog!
☆16 · Oct 30, 2024 · Updated last year
ddelange / retrie
Efficient Trie-based regex unions for blacklist/whitelist filtering and one-pass mapping-based string replacing
☆76 · Jul 1, 2026 · Updated 3 weeks ago
FixMyBerlin / fixmy.frontend
FixMyBerlin - A Mobility Platform for Berlin
☆19 · Jan 23, 2025 · Updated last year
mhausenblas / mrlin
mrlin is 'MapReduce processing of Linked Data' … because it's magic
☆16 · Nov 1, 2012 · Updated 13 years ago
unt-libraries / django-name
Name Authority App written for Django
☆13 · Feb 11, 2026 · Updated 5 months ago
imranghory / pson
Python library to make querying JSON-like structures easy!
☆17 · Feb 2, 2014 · Updated 12 years ago
AlexandraKapp / 30daymapchallenge
☆28 · Nov 30, 2020 · Updated 5 years ago
internetarchive / sandcrawler
Backend, IA-specific tools for crawling and processing the scholarly web. Content ends up in https://fatcat.wiki
☆28 · Jul 31, 2024 · Updated last year
ruby-rdf / rdf-tabular
Tabular Data RDF Reader and JSON serializer
☆21 · Oct 9, 2024 · Updated last year
fadmaa / grefine-ckan-storage-extension
Upload data directly from within Google Refine to CKAN using CKAN storage API
☆16 · Jan 31, 2014 · Updated 12 years ago
ActiveTriples / linked-data-fragments
Basic linked data fragments endpoint.
☆15 · Apr 20, 2017 · Updated 9 years ago
cisnlp / GlotWeb
[WWW 2026] 🕸 GlotWeb: Web Indexing for Minority Languages
☆17 · Apr 14, 2026 · Updated 3 months ago
okfde / dokukratie
Scraper for German democracy documents
☆46 · Sep 12, 2023 · Updated 2 years ago
Digital-History-Berlin / Python-fuer-Historiker-innen
The Jupyter Book is aimed at historians who are looking for a first interactive introduction to the Python programming language in German…
☆14 · Jul 21, 2022 · Updated 4 years ago
nilportugues / symfony-hal-json
HAL+JSON API Transformer Bundle for Symfony 2 and Symfony 3
☆10 · Mar 23, 2017 · Updated 9 years ago
SolumDeSignum / recomposer
A Laravel package to ReCompose your installed packages, their dependencies, your app & server environment
☆12 · Oct 18, 2024 · Updated last year