facebookresearch / side
The AI Knowledge Editor
★ 182 · Updated 2 years ago
Related projects
Alternatives and complementary repositories for side
- ★ 179 · Updated last year
- SpaCy wrapper for ConceptNet · ★ 88 · Updated last year
- [EMNLP 2023 Demo] fabricator - annotating and generating datasets with large language models. · ★ 103 · Updated 6 months ago
- ★ 75 · Updated last year
- ★ 82 · Updated 6 months ago
- Our open source implementation of MiniLMv2 (https://aclanthology.org/2021.findings-acl.188) · ★ 60 · Updated last year
- Pretraining Efficiently on S2ORC! · ★ 136 · Updated 3 weeks ago
- Coreference resolution for English, French, German and Polish, optimised for limited training data and easily extensible for further lang… · ★ 117 · Updated 6 months ago
- Multimodal document analysis · ★ 160 · Updated 5 months ago
- Repository for Zheng and Guha et al., 2021, "When Does Pretraining Help? Assessing Self-Supervised Learning for Law and the CaseHOLD Data… · ★ 84 · Updated last year
- Mining Legal Arguments in Court Decisions - Data and software · ★ 64 · Updated last year
- Repo for Aspire - A scientific document similarity model based on matching fine-grained aspects of scientific papers. · ★ 50 · Updated last year
- A data set based on all arXiv publications, pre-processed for NLP, including structured full-text and citation network · ★ 259 · Updated last month
- The autoregressive information extraction system GenIE (Generative Information Extraction) implemented in PyTorch. · ★ 99 · Updated last year
- Tools for managing datasets for governance and training. · ★ 78 · Updated 3 weeks ago
- A diff tool for language models · ★ 42 · Updated 10 months ago
- 🧪 Cutting-edge experimental spaCy components and features · ★ 95 · Updated 6 months ago
- This repository provides scripts for evaluating NLP models on the LEXTREME benchmark, a set of diverse multilingual tasks in legal NLP · ★ 20 · Updated 10 months ago
- Question-answers, collected from Google · ★ 124 · Updated 3 years ago
- A library to synthesize text datasets using Large Language Models (LLM) · ★ 151 · Updated last year
- A Python Commonsense Knowledge Inference Toolkit · ★ 63 · Updated 11 months ago
- Information extraction from English and German texts based on predicate logic · ★ 135 · Updated last year
- Pipeline for pulling and processing online language model pretraining data from the web · ★ 174 · Updated last year
- Powerful unsupervised domain adaptation method for dense retrieval. Requires only unlabeled corpus and yields massive improvement: "GPL: … · ★ 323 · Updated last year
- Repo to hold code and track issues for the collection of permissively licensed data · ★ 22 · Updated 2 weeks ago
- [Data + code] ExpertQA: Expert-Curated Questions and Attributed Answers · ★ 122 · Updated 8 months ago
- ★ 86 · Updated 5 months ago
- Search Engines with Autoregressive Language models · ★ 277 · Updated last year
- ★ 185 · Updated 6 months ago
- Seahorse is a dataset for multilingual, multi-faceted summarization evaluation. It consists of 96K summaries with human ratings along 6 q… · ★ 85 · Updated 8 months ago