Genaios / TextMachina
A modular and extensible Python framework for creating high-quality, unbiased datasets used to build robust models for machine-generated text (MGT) tasks such as detection, attribution, and boundary detection.
☆15 · Updated 4 months ago
Related projects
Alternatives and complementary repositories for TextMachina
- Source code and data for Like a Good Nearest Neighbor ☆28 · Updated 9 months ago
- triple-encoders is a library for contextualizing distributed Sentence Transformers representations. ☆14 · Updated 2 months ago
- Plug-and-play Search Interfaces with Pyserini and Hugging Face ☆32 · Updated last year
- Starbucks: Improved Training for 2D Matryoshka Embeddings ☆17 · Updated 3 weeks ago
- Code for our paper Resources and Evaluations for Multi-Distribution Dense Information Retrieval ☆14 · Updated 9 months ago
- Repository for "Attribute First, then Generate: Locally-attributable Grounded Text Generation", ACL 2024 ☆27 · Updated 7 months ago
- RATransformers 🐭 - Make your transformer (like BERT, RoBERTa, GPT-2 and T5) Relation Aware! ☆41 · Updated last year
- Interpretable and efficient predictors using pre-trained language models. Scikit-learn compatible. ☆38 · Updated 6 months ago
- Code for equipping pretrained language models (BART, GPT-2, XLNet) with commonsense knowledge for generating implicit knowledge statement… ☆16 · Updated 3 years ago
- Legal document similarity - Code, data, and models for the ICAIL 2021 paper "Evaluating Document Representations for Content-based Legal … ☆31 · Updated 3 years ago
- Finding semantically meaningful and accurate prompts. ☆46 · Updated last year
- Models for automatically transforming toxic text to neutral ☆33 · Updated last year
- Data and info for the paper "ParaDetox: Text Detoxification with Parallel Data" ☆27 · Updated 2 weeks ago
- 🤗 Disaggregators: Curated data labelers for in-depth analysis. ☆65 · Updated last year
- https://footprints.baulab.info ☆12 · Updated last month
- Scripts supporting the development and serving of the Roots Search Tool - https://hf.co/spaces/bigscience-data/roots-search ☆10 · Updated last year
- Targeted Data Generation with Large Language Models ☆14 · Updated 4 months ago
- Hugging Face and Pyserini interoperability ☆19 · Updated last year
- Corpus exploration platform using advanced tools such as interactive summarization and multi-document coreference resolution ☆11 · Updated last year
- Code for our EMNLP '22 paper "Fixing Model Bugs with Natural Language Patches" ☆19 · Updated last year
- [ACL 2023] Few-shot Reranking for Multi-hop QA via Language Model Prompting ☆26 · Updated last year
- A new metric for evaluating the faithfulness of text generated by LLMs. The work behind this repository can be found he… ☆31 · Updated last year
- Code used for our Multi Sentence Inference NAACL'22 paper. ☆12 · Updated last year
- Code and Dataset for Learning to Solve Complex Tasks by Talking to Agents ☆22 · Updated 2 years ago
- Code associated with the paper "Entropy-based Attention Regularization Frees Unintended Bias Mitigation from Lists" ☆46 · Updated 2 years ago