facebookresearch / side
The AI Knowledge Editor
☆182 Updated 2 years ago
Alternatives and similar repositories for side:
Users who are interested in side are comparing it to the libraries listed below.
- Pretraining Efficiently on S2ORC! ☆160 Updated 5 months ago
- ☆182 Updated last year
- Repository for Zheng and Guha et al., 2021, "When Does Pretraining Help? Assessing Self-Supervised Learning for Law and the CaseHOLD Data… ☆87 Updated 2 years ago
- A data set based on all arXiv publications, pre-processed for NLP, including structured full-text and citation network ☆287 Updated 6 months ago
- Repo for the paper "Detecting Logical Fallacies: From Quiz to Climate Change News" (2021) ☆75 Updated last year
- multimodal document analysis ☆164 Updated 10 months ago
- The pipeline for the OSCAR corpus ☆168 Updated last year
- ☆87 Updated 11 months ago
- ☆93 Updated 10 months ago
- Repo for Aspire - A scientific document similarity model based on matching fine-grained aspects of scientific papers. ☆51 Updated last year
- Pipeline for pulling and processing online language model pretraining data from the web ☆177 Updated last year
- 💫 SpaCy wrapper for ConceptNet 💫 ☆92 Updated last year
- Find and fix bugs in natural language machine learning models using adaptive testing. ☆183 Updated 11 months ago
- Question-answers, collected from Google ☆129 Updated 3 years ago
- Experiments on including metadata such as URLs, timestamps, website descriptions and HTML tags during pretraining. ☆31 Updated last year
- Search Engines with Autoregressive Language models ☆284 Updated 2 years ago
- Web-scale retrieval for knowledge-intensive NLP ☆552 Updated 2 years ago
- [Data + code] ExpertQA: Expert-Curated Questions and Attributed Answers ☆126 Updated last year
- Dataset from the paper "Mintaka: A Complex, Natural, and Multilingual Dataset for End-to-End Question Answering" (COLING 2022) ☆113 Updated 2 years ago
- A library for finding knowledge neurons in pretrained transformer models. ☆155 Updated 3 years ago
- ☆221 Updated last year
- Used for adaptive human in the loop evaluation of language and embedding models. ☆308 Updated 2 years ago
- ☆189 Updated 11 months ago
- MultiCite code and data. Models are available on Huggingface. ☆31 Updated 2 years ago
- Neighborhood Contrastive Learning for Scientific Document Representations with Citation Embeddings (EMNLP 2022 paper) ☆67 Updated 2 years ago
- A library to synthesize text datasets using Large Language Models (LLM) ☆152 Updated 2 years ago
- Dataset collection and preprocessing framework for NLP extreme multitask learning ☆178 Updated 3 months ago
- ☆208 Updated last month
- An instruction-based benchmark for text improvements. ☆141 Updated 2 years ago
- LexGLUE: A Benchmark Dataset for Legal Language Understanding in English ☆201 Updated last year