ymoslem / MT-LM
Domain-Specific Text Generation for Machine Translation (with LLMs) - scripts and config files for the paper
☆17Updated last year
Alternatives and similar repositories for MT-LM
Users who are interested in MT-LM are comparing it to the libraries listed below.
- Official implementation of the paper "CoEdIT: Text Editing by Task-Specific Instruction Tuning" (EMNLP 2023)☆129Updated 10 months ago
- Tools for managing datasets for governance and training.☆85Updated 2 months ago
- A collection of datasets for language model pretraining including scripts for downloading, preprocessing, and sampling.☆59Updated last year
- Open information and community for machine translation☆79Updated last week
- ☆217Updated last week
- A question-answering dataset with a focus on subjective information☆45Updated last year
- Mining Legal Arguments in Court Decisions - Data and software☆68Updated 2 years ago
- Open source library for few shot NLP☆79Updated 2 years ago
- Adaptive Machine Translation with Large Language Models☆30Updated 7 months ago
- Data and evaluation code for the paper WikiNEuRal: Combined Neural and Knowledge-based Silver Data Creation for Multilingual NER (EMNLP 2…☆69Updated 2 years ago
- The FLORES+ Machine Translation Benchmark☆106Updated 9 months ago
- Abstractive and Extractive Text summarization using Transformers.☆85Updated 2 years ago
- A collection of preprocessed datasets and pretrained models for generating paraphrases.☆30Updated 4 years ago
- This repository contains code used for our Multi Sentence Inference NAACL'22 paper.☆12Updated 2 years ago
- ParaNames: A multilingual resource for parallel names☆35Updated last year
- Efficient few-shot learning with cross-encoders.☆56Updated last year
- A Multilingual Replicable Instruction-Following Model☆94Updated 2 years ago
- OpusFilter - Parallel corpus processing toolkit☆109Updated last week
- Template Extraction from unstructured Wikipedia text using NLP techniques.☆41Updated 5 years ago
- Legal document similarity - Code, data, and models for the ICAIL 2021 paper "Evaluating Document Representations for Content-based Legal …☆32Updated 4 years ago
- 💬 Language Identification with Support for More Than 2000 Labels -- EMNLP 2023☆147Updated 2 months ago
- RaKUn 2.0 - A fast keyword detection algorithm☆68Updated last week
- The pipeline for the OSCAR corpus☆171Updated last year
- Seahorse is a dataset for multilingual, multi-faceted summarization evaluation. It consists of 96K summaries with human ratings along 6 q…☆88Updated last year
- BLOOM+1: Adapting BLOOM model to support a new unseen language☆73Updated last year
- Glot500: Scaling Multilingual Corpora and Language Models to 500 Languages -- ACL 2023☆103Updated last year
- Plug-and-play Search Interfaces with Pyserini and Hugging Face☆32Updated 2 years ago
- Repository for the paper "MultiNERD: A Multilingual, Multi-Genre and Fine-Grained Dataset for Named Entity Recognition (and Disambiguatio…☆44Updated last year
- MAFAND-MT☆57Updated last year
- A library of translation-based text similarity measures☆25Updated last year