ymoslem / Adaptive-MT-LLM-Fine-tuning
Fine-tuning Open-Source LLMs for Adaptive Machine Translation
☆73 Updated last week
Alternatives and similar repositories for Adaptive-MT-LLM-Fine-tuning:
Users who are interested in Adaptive-MT-LLM-Fine-tuning are comparing it to the repositories listed below.
- Glot500: Scaling Multilingual Corpora and Language Models to 500 Languages -- ACL 2023 ☆99 Updated 10 months ago
- The FLORES+ Machine Translation Benchmark ☆100 Updated 3 months ago
- ☆239 Updated 8 months ago
- Okapi: Instruction-tuned Large Language Models in Multiple Languages with Reinforcement Learning from Human Feedback ☆92 Updated last year
- NTREX -- News Test References for MT Evaluation ☆81 Updated 8 months ago
- A Multilingual Replicable Instruction-Following Model ☆94 Updated last year
- Multilingual Large Language Models Evaluation Benchmark ☆117 Updated 6 months ago
- Code for Multilingual Eval of Generative AI paper published at EMNLP 2023 ☆67 Updated 11 months ago
- A library of translation-based text similarity measures ☆25 Updated last year
- Language Identification with Support for More Than 2000 Labels -- EMNLP 2023 ☆116 Updated 2 months ago
- GEMBA -- GPT Estimation Metric Based Assessment ☆108 Updated 6 months ago
- Adaptive Machine Translation with Large Language Models ☆30 Updated last month
- Tools for evaluating the performance of MT metrics on data from recent WMT metrics shared tasks. ☆98 Updated 2 months ago
- ☆17 Updated 2 years ago
- A collection of datasets for language model pretraining including scripts for downloading, preprocessing, and sampling. ☆55 Updated 6 months ago
- MAFAND-MT ☆55 Updated 7 months ago
- Efficient Attention for Long Sequence Processing ☆92 Updated last year
- A collection of preprocessed datasets and pretrained models for generating paraphrases. ☆29 Updated 3 years ago
- BLOOM+1: Adapting BLOOM model to support a new unseen language ☆70 Updated 11 months ago
- Fine-tune ModernBERT on a large Dataset with Custom Tokenizer Training ☆59 Updated last week
- cLang-8 is a dataset for grammatical error correction. ☆103 Updated 2 years ago
- Library for pruning experts per language pair in NLLB-200 ☆32 Updated last year
- Yet Another Neural Machine Translation Toolkit ☆177 Updated 7 months ago
- [EMNLP'23] Official Code for "FOCUS: Effective Embedding Initialization for Monolingual Specialization of Multilingual Models" ☆29 Updated 4 months ago
- Official implementations for (1) BlonDe: An Automatic Evaluation Metric for Document-level Machine Translation and (2) Discourse Centric … ☆75 Updated last year
- Lightweight demos for finetuning LLMs. Powered by 🤗 transformers and open-source datasets. ☆67 Updated 4 months ago
- Multilingual sentence alignment using sentence embeddings ☆108 Updated 3 months ago
- Implementation of "SMaLL-100: Introducing Shallow Multilingual Machine Translation Model for Low-Resource Languages" paper, accepted to E… ☆21 Updated 2 years ago
- OpusFilter - Parallel corpus processing toolkit ☆104 Updated 3 weeks ago
- This tool helps automatic generation of grammatically valid synthetic Code-mixed data by utilizing linguistic theories such as Equivalenc… ☆53 Updated 6 months ago