facebookresearch / side
The AI Knowledge Editor
☆183 Updated 2 years ago
Alternatives and similar repositories for side:
Users who are interested in side are comparing it to the libraries listed below.
- ☆182 Updated last year
- Pipeline for pulling and processing online language model pretraining data from the web ☆175 Updated last year
- SpaCy wrapper for ConceptNet ☆90 Updated last year
- An instruction-based benchmark for text improvements. ☆141 Updated 2 years ago
- Find and fix bugs in natural language machine learning models using adaptive testing. ☆182 Updated 10 months ago
- 🌸 fastText + Bloom embeddings for compact, full-coverage vectors with spaCy ☆309 Updated last year
- Web-scale retrieval for knowledge-intensive NLP ☆552 Updated 2 years ago
- Pretraining Efficiently on S2ORC! ☆156 Updated 4 months ago
- A library to synthesize text datasets using Large Language Models (LLM) ☆151 Updated 2 years ago
- This repository contains the code for the paper 'PARM: Paragraph Aggregation Retrieval Model for Dense Document-to-Document Retrieval' pu… ☆40 Updated 3 years ago
- Experiments on including metadata such as URLs, timestamps, website descriptions and HTML tags during pretraining. ☆31 Updated last year
- ☆204 Updated 2 weeks ago
- [EMNLP 2023 Demo] fabricator - annotating and generating datasets with large language models. ☆104 Updated 9 months ago
- Open source library for few shot NLP ☆77 Updated last year
- Repository for Zheng and Guha et al., 2021, "When Does Pretraining Help? Assessing Self-Supervised Learning for Law and the CaseHOLD Data… ☆86 Updated last year
- ☆77 Updated 2 years ago
- Coreference resolution for English, French, German and Polish, optimised for limited training data and easily extensible for further lang… ☆121 Updated 10 months ago
- Information extraction from English and German texts based on predicate logic ☆135 Updated last year
- 🤗 Disaggregators: Curated data labelers for in-depth analysis. ☆65 Updated 2 years ago
- Used for adaptive human in the loop evaluation of language and embedding models. ☆306 Updated 2 years ago
- Code for constructing TLDR corpus from Reddit dataset ☆27 Updated 3 years ago
- Code and data from the paper BERT Got a Date: Introducing Transformers to Temporal Tagging ☆66 Updated 2 years ago
- Our open source implementation of MiniLMv2 (https://aclanthology.org/2021.findings-acl.188) ☆60 Updated last year
- RaKUn 2.0 - A fast keyword detection algorithm ☆66 Updated 3 weeks ago
- ☆91 Updated 9 months ago
- A Python library aimed at dissecting and augmenting NER training data. ☆58 Updated last year
- Transformer Grammars: Augmenting Transformer Language Models with Syntactic Inductive Biases at Scale, TACL (2022) ☆123 Updated 4 months ago
- Seahorse is a dataset for multilingual, multi-faceted summarization evaluation. It consists of 96K summaries with human ratings along 6 q… ☆86 Updated last year
- A diff tool for language models ☆42 Updated last year
- Google's BigBird (Jax/Flax & PyTorch) @ 🤗 Transformers ☆48 Updated last year