commoncrawl / cc-citations
Scientific articles using or citing Common Crawl data
☆12Updated this week
Alternatives and similar repositories for cc-citations:
Users who are interested in cc-citations are comparing it to the repositories listed below
- ☆29Updated 10 years ago
- A bibliographic reference correction service☆18Updated 2 years ago
- Wrapper for the Crossref events API☆17Updated last year
- This repository contains different algorithms that are used to build a taxonomy from a text corpus.☆8Updated 3 years ago
- A simple web application for searching Word2Vec embeddings derived from approximately 2,000 law reports published by the Incorporated…☆26Updated 2 years ago
- Integration between Reaction ECommerce and Accelerated Text to provide product descriptions for an e-shop.☆9Updated 3 years ago
- Code for the JCDL 2023 paper CitePrompt: Using Prompts to Identify Citation Intent in Scientific Papers☆9Updated last year
- Transforming textual descriptions into process models using deep learning☆13Updated 5 years ago
- ☆14Updated last year
- Tools to construct and process webgraphs from Common Crawl data☆84Updated 3 weeks ago
- The OpenCitations RDF Resource Browser☆11Updated this week
- Nougat is Meta AI's OCR model designed to transcribe scientific PDFs into an easy-to-use Markdown format.☆22Updated last year
- SMASHED is a toolkit designed to apply transformations to samples in datasets, such as fields extraction, tokenization, prompting, batchi…☆31Updated 7 months ago
- A News Article Collection Library☆22Updated last year
- A simple library for training named entity recognition models from partially annotated data☆22Updated last year
- This repository contains code and data download instructions for the workshop paper "Improving Hierarchical Product Classification using …☆17Updated 3 years ago
- Preprocessing and analysis for training SNOMED-CT concept embeddings from CORD-19 corpus☆14Updated last year
- ☆19Updated 6 years ago
- A repository for the "Combining DBpedia and Topic Modeling" GSoC 2016 idea☆13Updated 8 years ago
- Implementation for EACL 2021 paper "Scientific Discourse Tagging for Evidence Extraction".☆20Updated 3 years ago
- Discourse Analysis Tool Suite☆18Updated this week
- Repository of the HBCP project.☆19Updated 5 months ago
- This is the backend layer of SearchX. SearchX is a scalable collaborative search system being developed by Lambda Lab of TU Delft.☆11Updated 2 years ago
- 🌸 Train floret vectors☆18Updated last year
- spaCy entry points for Curated Transformers☆26Updated 3 months ago
- ☆30Updated 2 years ago
- Linking Entities in CommonCrawl Dataset onto Wikipedia Concepts☆59Updated 12 years ago
- The OpenCitations metadata model: documents and other material.☆12Updated 5 months ago
- The News Landscape Toolkit (NELA)☆15Updated 4 years ago
- Solve Geometric & Graph Problems with Large Language Models☆28Updated last year