ltgoslo / talk-of-norway
This repository makes available the Talk of Norway (ToN) dataset, a collection of Norwegian parliament speeches from 1998 to 2016. Every speech is richly annotated with metadata pulled from different sources, and augmented with sentence, token, lemma, part-of-speech and morphological feature annotations.
☆31 Updated 2 years ago
Alternatives and similar repositories for talk-of-norway
Users who are interested in talk-of-norway are comparing it to the libraries listed below.
- The RICardo dataset compiles trade statistics sources of international trade bilateral flows of the 19th century. ☆19 Updated last month
- Citation Classification using hybrid neural network model for Wikipedia References ☆31 Updated 3 years ago
- Inspect a URL and estimate if it contains a news story ☆39 Updated 2 weeks ago
- Scripts that clean up OCR and munge Hathi metadata. ☆77 Updated 8 years ago
- ☆12 Updated 3 years ago
- Topic Modeling Workflow in Python ☆16 Updated 2 years ago
- A Python package for downloading data from the UK Parliament's Data Platform. ☆29 Updated 5 years ago
- Use spaCy for NLP and output to the FoLiA XML format. ☆12 Updated last year
- Platform for journalists to search, analyse, categorise and share unstructured data ☆56 Updated this week
- Python implementation of the Zeta score for contrastive text analysis ☆14 Updated 4 years ago
- Scripts to create git repositories for ALTO XML texts, like those from the British Library's scanned documents. ☆31 Updated 8 years ago
- Extract networks of entities from journalistic reporting ☆49 Updated 2 years ago
- A Python library for topic modeling and visualization ☆67 Updated 5 years ago
- ParlaMint: Comparable Parliamentary Corpora ☆74 Updated 2 months ago
- Amsterdam Content Analysis Toolkit ☆46 Updated 3 years ago
- CSV inspection ☆10 Updated 3 years ago
- Work in progress to design data models for UK Parliament ☆59 Updated this week
- Public client for consuming content from the Media Cloud Online News Archive & Directory. ☆78 Updated 2 weeks ago
- ☆76 Updated this week
- Machine assisted dossiers ☆19 Updated 8 years ago
- America's most comprehensive dictionary of campaign finance jargon. A free resource created by and for data journalists. ☆18 Updated last month
- Provide partial dates and retain the date precision through processing ☆14 Updated 5 months ago
- Project on the history of genre. ☆24 Updated 5 years ago
- Topic Words in Context (TWiC) is a highly-interactive, browser-based visualization for MALLET topic models ☆51 Updated 8 years ago
- Tools for text tokenization and encoding ☆84 Updated 4 years ago
- Bill cosponsorship networks in European parliaments. ☆17 Updated 8 years ago
- Detect and visualize text reuse ☆119 Updated last year
- Plots various graphs for a series of plaintext files in a directory ☆19 Updated 9 years ago
- Inspection of tabular (csv, xls-like) files to guess the columns' content ☆51 Updated this week
- A simple command line interface to the datamade/dedupe library. ☆43 Updated 3 years ago