OlehOnyshchak / pyWikiMM
Collects a multimodal dataset of Wikipedia articles and their images
☆16 · Updated 2 years ago
Alternatives and similar repositories for pyWikiMM
Users who are interested in pyWikiMM are comparing it to the libraries listed below:
- Code for Relevance-guided Supervision for OpenQA with ColBERT (TACL'21) ☆41 · Updated 4 years ago
- A database of movie scripts from several sources ☆173 · Updated last year
- Visual similarity search engine demo with use of PyTorch Metric Learning and Qdrant ☆12 · Updated 2 years ago
- A Python package to get useful information from documents using TopicRank Algorithm. ☆16 · Updated 2 years ago
- This Python package can be used to systematically extract multiple data elements (e.g., title, keywords, text) from news sources around t… ☆33 · Updated 2 years ago
- A dataset for pretraining language models targeted for legal tasks. ☆137 · Updated 3 years ago
- Reproducing "Writing with Transformer" demo, using aitextgen/FastAPI in backend, Quill/React in frontend ☆28 · Updated 4 years ago
- Domain-Specific Text Generation for Machine Translation (with LLMs) - scripts and config files for the paper ☆17 · Updated 2 years ago
- This repo is about the classification of rhetorical roles in Legal Documents such as: Citation, Findings of Fact, Evidence, Legal Rule, R… ☆15 · Updated 3 years ago
- Legal document similarity - Code, data, and models for the ICAIL 2021 paper "Evaluating Document Representations for Content-based Legal … ☆32 · Updated 4 years ago
- Adversarial Training on Transformer Networks to discover check-worthy factual claims ☆79 · Updated last year
- Automatic Text Summarization and Title Generation. ☆25 · Updated 4 years ago
- RaKUn 2.0 - A fast keyword detection algorithm ☆68 · Updated 3 weeks ago
- Translate Natural Language Processing to SPARQL Query and vice versa ☆51 · Updated 2 years ago
- This Python module can be used to obtain antonyms, synonyms, hypernyms, hyponyms, homophones and definitions. ☆124 · Updated last year
- Tornado is an open source Human-in-the-loop machine learning tool. It helps you label your dataset on the fly while training your model t… ☆66 · Updated 2 years ago
- Extracts iframes or keyframes from a video file, through the command line or from inside python. ☆17 · Updated 2 years ago
- Tools to construct and process Common Crawl webgraphs ☆93 · Updated 3 weeks ago
- A python library for extracting text from PDFs without losing the formatting of the PDF content. ☆77 · Updated 3 years ago
- Document level Attitude and Relation Extraction toolkit (AREkit) for sampling and processing large text collections with ML and for ML ☆63 · Updated 7 months ago
- Search through Facebook Research's PyTorch BigGraph Wikidata-dataset with the Weaviate vector search engine ☆31 · Updated 3 years ago
- This repository serves as a collection of scrapers procuring and structuring various legal datasets ☆18 · Updated 2 years ago
- GenieNLP: A versatile codebase for any NLP task ☆89 · Updated last year
- This repository contains code used for our Multi Sentence Inference NAACL'22 paper. ☆12 · Updated 2 years ago
- Template Extraction from unstructured Wikipedia text using NLP techniques. ☆41 · Updated 5 years ago
- Daily TV News Summary using GPT ☆24 · Updated 3 months ago
- Interpretable feature construction from taxonomies for text classification ☆18 · Updated 3 years ago