TheOnesThatWereAbroad / PodcastSummarization
Text Summarization on Spotify Podcast Transcripts for NLP class at @UNIBO
☆12Updated 2 years ago
Related projects
Alternatives and complementary repositories for PodcastSummarization
- ParaNames: A multilingual resource for parallel names☆30Updated 6 months ago
- Data and evaluation code for the paper WikiNEuRal: Combined Neural and Knowledge-based Silver Data Creation for Multilingual NER (EMNLP 2…☆66Updated last year
- A multi-lingual approach to AllenNLP CoReference Resolution along with a wrapper for spaCy.☆103Updated 7 months ago
- Natural language understanding benchmarks for Norwegian☆14Updated 10 months ago
- A module to compute textual lexical richness (aka lexical diversity).☆92Updated last year
- Entity linking evaluation and analysis tool☆19Updated this week
- Data for the HIPE 2022 shared task.☆16Updated 11 months ago
- This repository provides the source code used to automatically generate the book summarization datasets described in the paper titled "Ec…☆11Updated last year
- A spaCy wrapper of OpenTapioca for named entity linking on Wikidata☆91Updated last year
- Neighborhood Contrastive Learning for Scientific Document Representations with Citation Embeddings (EMNLP 2022 paper)☆64Updated 2 years ago
- ☆35Updated last year
- Sentiment Corpus for Swedish 🇸🇪 Norwegian 🇳🇴 Danish 🇩🇰 Finnish 🇫🇮 (and English 🏴)☆15Updated 3 years ago
- Libraries, Archives and Museums (LAM)☆82Updated 2 years ago
- Repo to hold code and track issues for the collection of permissively licensed data☆22Updated 2 weeks ago
- An opinionated NLP research template☆10Updated 2 months ago
- A spaCy wrapper of Entity-Fishing (component) for named entity disambiguation and linking on Wikidata☆153Updated 2 years ago
- Live survey of off-the-shelf language identification tools for Python☆26Updated 2 years ago
- Dutch coreference resolution & dialogue analysis using deterministic rules☆21Updated last year
- Augmenty is an augmentation library based on spaCy for augmenting texts.☆151Updated 5 months ago
- A monolingual and cross-lingual meta-embedding generation and evaluation framework☆80Updated 2 years ago
- ☆15Updated 3 years ago
- A collection of datasets for language model pretraining including scripts for downloading, preprocessing, and sampling.☆53Updated 3 months ago
- A library to synthesize text datasets using Large Language Models (LLM)☆151Updated last year
- BERT and ELECTRA models trained on Europeana Newspapers☆36Updated 2 years ago
- Annotated corpus + evaluation metrics for text anonymisation☆51Updated 9 months ago
- A Python package to compute HONEST, a score to measure hurtful sentence completions in language models. Published at NAACL 2021.☆20Updated last year
- This repository hosts materials from the CLiC-IT 2023 tutorial☆30Updated 5 months ago
- Tool for generating filtered Wikidata RDF exports☆37Updated 2 years ago
- Resource and Tool for Writing System Identification -- LREC 2024☆13Updated 5 months ago
- FrameBERT: Conceptual Metaphor Detection with Frame Embedding Learning. Presented at EACL 2023.☆23Updated 11 months ago