OlehOnyshchak / pyWikiMM
Collects a multimodal dataset of Wikipedia articles and their images
☆16Updated 2 years ago
Alternatives and similar repositories for pyWikiMM
Users that are interested in pyWikiMM are comparing it to the libraries listed below
- Legal document similarity - Code, data, and models for the ICAIL 2021 paper "Evaluating Document Representations for Content-based Legal …☆32Updated 4 years ago
- Adversarial Training on Transformer Networks to discover check-worthy factual claims☆81Updated last year
- FactNews is the first dataset to predict sentence-level factuality of news reporting. Furthermore, we provide baseline results for sentenc…☆10Updated 4 months ago
- This repository contains code used for our Multi Sentence Inference NAACL'22 paper.☆12Updated 2 years ago
- An easy-to-use API for analyzing INCEpTION annotation projects.☆17Updated 2 years ago
- A News Article Collection Library☆22Updated 2 years ago
- An easy-to-use library to extract indices from texts.☆29Updated 4 years ago
- Reproducing "Writing with Transformer" demo, using aitextgen/FastAPI in backend, Quill/React in frontend☆27Updated 4 years ago
- A pipeline using LLMs for Knowledge Engineering, combining knowledge probing and Wikidata entity mapping.☆37Updated 10 months ago
- A tool to easily scrape YouTube data using the Google API☆12Updated 6 months ago
- Domain-Specific Text Generation for Machine Translation (with LLMs) - scripts and config files for the paper☆17Updated 2 years ago
- Matrix-based News Aggregation to Explore Media Bias☆20Updated 7 years ago
- A database of movie scripts from several sources☆179Updated last year
- TimeLMs: Diachronic Language Models from Twitter☆111Updated last year
- A dataset for pretraining language models targeted for legal tasks.☆138Updated 3 years ago
- This Python package can be used to systematically extract multiple data elements (e.g., title, keywords, text) from news sources around t…☆33Updated 2 years ago
- ☆70Updated last year
- This repository serves as a collection of scrapers procuring and structuring various legal datasets☆17Updated 2 years ago
- GenieNLP: A versatile codebase for any NLP task☆88Updated last year
- Modelling Big Five Personality Inventory using Machine Learning algorithms☆22Updated 11 months ago
- SummScreen: A Dataset for Abstractive Screenplay Summarization (ACL 2022)☆38Updated 3 years ago
- Document level Attitude and Relation Extraction toolkit (AREkit) for sampling and processing large text collections with ML and for ML☆65Updated 9 months ago
- Tools to construct and process Common Crawl webgraphs☆101Updated last week
- A guide to structured generation using constrained decoding☆11Updated last year
- A complete Python text analytics package that allows users to search for a Wikipedia article, scrape it, conduct basic text analytics and…☆20Updated 2 years ago
- Semantic Segmentation of Legal texts that labels sentences with one of 7 rhetorical roles.☆77Updated last year
- Question Answering annotation platform - Plateforme d'annotation☆90Updated 9 months ago
- 🗺️ Data Cleaning and Textual Data Visualization 🗺️☆189Updated 5 months ago
- This project was part of the competition Identify Characters From Product Images hosted by CrowdAnalytix☆10Updated 3 years ago
- RaKUn 2.0 - A fast keyword detection algorithm☆68Updated 2 months ago