commoncrawl / cc-citations
Scientific articles using or citing Common Crawl data
☆11 Updated 3 weeks ago
Related projects
Alternatives and complementary repositories for cc-citations
- wrapper for the crossref events api ☆17 Updated last year
- Tools to construct and process webgraphs from Common Crawl data ☆79 Updated 3 weeks ago
- A News Article Collection Library ☆22 Updated last year
- Transforming textual descriptions into process models using deep learning ☆12 Updated 5 years ago
- Various Jupyter notebooks about Common Crawl data ☆46 Updated 2 years ago
- Preprocessing and analysis for training SNOMED-CT concept embeddings from CORD-19 corpus ☆14 Updated last year
- Repository of the OpenCitations Index of Crossref open DOI-to-DOI citations (COCI) ☆17 Updated 5 years ago
- A simple web application for searching Word2Vec embeddings derived from approximately 2,000 law reports published by the Incorporated… ☆25 Updated 2 years ago
- A bibliographic reference correction service ☆17 Updated last year
- TextGraphs + LLMs + graph ML for entity extraction, linking, ranking, and constructing a lemma graph ☆20 Updated 8 months ago
- Summary Explorer is a tool to visually explore the state-of-the-art in text summarization. ☆43 Updated 5 months ago
- Metadata Extractor & Loader (MEL) ■ The NLP-NER Toolkit (TNNT) ☆22 Updated last year
- XAI based human-in-the-loop framework for automatic rule-learning. ☆47 Updated 4 months ago
- Python based Wikidata framework for easy dataframe extraction ☆39 Updated 11 months ago
- A pipeline using LLMs for Knowledge Engineering, combining knowledge probing and Wikidata entity mapping. ☆34 Updated last year
- This is the repo for the container that holds the models for the text2vec-transformers module ☆40 Updated last week
- Open Access PDF harvester, metadata aggregator and full-text ingester ☆54 Updated 6 months ago
- FAMIE: A Fast Active Learning Framework for Multilingual Information Extraction ☆23 Updated 2 years ago
- The OpenCitations metadata model: documents and other material. ☆12 Updated 3 months ago
- A Python package to get useful information from documents using TopicRank Algorithm. ☆16 Updated last year
- spaCy entry points for Curated Transformers ☆24 Updated last month
- A simple library for training named entity recognition model from partially annotated data ☆21 Updated last year
- A browser extension providing Open Access bibliographical services ☆14 Updated last year
- Open Access PDF harvester ☆35 Updated 6 months ago
- Vespa application making an index of the CORD-19 dataset. ☆39 Updated 2 months ago
- A repository for the "Combining DBpedia and Topic Modeling" GSoC 2016 idea ☆13 Updated 8 years ago