Do Multilingual Language Models Think Better in English?
☆42Aug 3, 2023Updated 2 years ago
Alternatives and similar repositories for self-translate
Users that are interested in self-translate are comparing it to the libraries listed below.
- T-Projection is a method to perform high-quality Annotation Projection of Sequence Labeling datasets.☆13Nov 21, 2023Updated 2 years ago
- A Systematic Study and Comprehensive Evaluation of ChatGPT on Benchmark Datasets.☆15Jul 10, 2023Updated 2 years ago
- Curriculum training☆22Jun 25, 2025Updated 11 months ago
- The LM Contamination Index is a manually created database of contamination evidence for LMs.☆81Apr 11, 2024Updated 2 years ago
- ☆21Dec 5, 2022Updated 3 years ago
- ☆12Jan 2, 2024Updated 2 years ago
- An extension of the Transformers library to include a T5ForSequenceClassification class.☆40Apr 17, 2023Updated 3 years ago
- ☆13Jun 16, 2021Updated 4 years ago
- A Test Collection of Computer Science Papers for Faceted Query by Example☆23Nov 28, 2021Updated 4 years ago
- The implementation of "Mitigating Hallucinations and Off-target Machine Translation with Source-Contrastive and Language-Contrastive Deco…☆38Aug 29, 2025Updated 9 months ago
- BPE modification that removes intermediate tokens during tokenizer training.☆27Nov 25, 2024Updated last year
- ☆16May 14, 2024Updated 2 years ago
- OpusCleaner is a web interface that helps you select, clean and schedule your data for training machine translation models.☆58Feb 3, 2026Updated 4 months ago
- Data and code: "Answering legal questions from laymen in German civil law system", Büttner & Habernal, EACL'24☆15Mar 2, 2024Updated 2 years ago
- DSTC9 Submission☆16Apr 12, 2021Updated 5 years ago
- Code for the paper "Modeling Information Change in Science Communication with Semantically Matched Paraphrases" from EMNLP 2022☆13Oct 20, 2022Updated 3 years ago
- Data and code for the paper "CiteWorth: Cite-Worthiness Detection for Improved Scientific Document Understanding"☆14Sep 8, 2022Updated 3 years ago
- ACL 2021 paper "Style is NOT a single variable: Case Studies for Cross-Style Language Understanding" by Dongyeop Kang and Eduard Hovy☆15Jul 19, 2021Updated 4 years ago
- Pushing the Limits of Zero-shot End-to-End Speech Translation☆25Dec 12, 2024Updated last year
- ACL22 paper: Imputing Out-of-Vocabulary Embeddings with LOVE Makes Language Models Robust with Little Cost☆42Nov 15, 2023Updated 2 years ago
- Tool to perform paired evaluation of automatic systems☆13Oct 20, 2021Updated 4 years ago
- Named entity recognition for the legal domain☆43Jun 1, 2021Updated 5 years ago
- Code for ECIR 2022 paper Local Citation Recommendation with Hierarchical-Attention Text Encoder and SciBERT-based Reranking☆25Jul 30, 2024Updated last year
- Official Implementation for Seq2seq is All You Need For Coreference Resolution Paper☆16Dec 1, 2023Updated 2 years ago
- Are foundation LMs multilingual knowledge bases? (EMNLP 2023)☆19Dec 8, 2023Updated 2 years ago
- ☆22Sep 19, 2023Updated 2 years ago
- This repository includes the masking vocabulary used in the ICLR 2021 spotlight PMI-Masking paper☆14Aug 9, 2021Updated 4 years ago
- State-of-the-art LLM-based translation models.☆586Apr 9, 2025Updated last year
- Converting PDF files to text, mainly with a focus on arXiv papers.☆24Feb 19, 2024Updated 2 years ago
- Research code for the paper "How Good is Your Tokenizer? On the Monolingual Performance of Multilingual Language Models"☆28Oct 3, 2021Updated 4 years ago
- ☆25Oct 22, 2022Updated 3 years ago
- A Framework for the Systematic Evaluation of Chat-Optimized Language Models as Conversational Agents and an Extensible Benchmark☆32Apr 15, 2026Updated last month
- ☆19Jul 22, 2019Updated 6 years ago
- All-in-one repository for Fine-tuning & Pretraining (Large) Language Models☆15Mar 8, 2023Updated 3 years ago
- MultilingualSIFT: Multilingual Supervised Instruction Fine-tuning☆97Aug 15, 2023Updated 2 years ago
- A library for preparing data for machine translation research (monolingual preprocessing, bitext mining, etc.) built by the FAIR NLLB te…☆306Updated this week
- ☆12Jun 3, 2023Updated 3 years ago
- In-Context Alignment: Chat with Vanilla Language Models Before Fine-Tuning☆34Aug 9, 2023Updated 2 years ago
- Open-source Human Feedback Library☆11Oct 25, 2023Updated 2 years ago