Multilingual sentence alignment using sentence embeddings
☆151May 4, 2026Updated 3 weeks ago
Alternatives and similar repositories for bertalign
Users that are interested in bertalign are comparing it to the libraries listed below.
- Improved Sentence Alignment in Linear Time and Space☆194Mar 6, 2023Updated 3 years ago
- ☆38Mar 16, 2026Updated 2 months ago
- A 2024 Reading List for Bilingual Lexicon Induction (BLI) / Word Translation. Frequently Updated.☆23Sep 29, 2024Updated last year
- A corpus of short answers written by learners of English and graded with CEFR levels☆12Dec 17, 2021Updated 4 years ago
- Code for our paper in ACL 2017☆13Dec 14, 2017Updated 8 years ago
- An accurate multilingual word aligner based on LaBSE☆24Oct 25, 2023Updated 2 years ago
- Open-source, AI-enhanced CAT tool with multi-LLM support, translation memory, glossary management, 'Superlookup' concordance across TMs/g…☆35May 16, 2026Updated last week
- Sentence aligner☆127May 21, 2021Updated 5 years ago
- The implementation of "Mitigating Hallucinations and Off-target Machine Translation with Source-Contrastive and Language-Contrastive Deco…☆38Aug 29, 2025Updated 8 months ago
- [EMNLP 2020] Obtain Word Alignments using Pretrained Language Models (e.g., mBERT)☆395Nov 7, 2023Updated 2 years ago
- Bicleaner is a parallel corpus classifier/cleaner that aims at detecting noisy sentence pairs in a parallel corpus.☆160Jun 18, 2024Updated last year
- Official implementations for (1) BlonDe: An Automatic Evaluation Metric for Document-level Machine Translation and (2) Discourse Centric …☆84Sep 21, 2023Updated 2 years ago
- ☆26Jul 30, 2024Updated last year
- An implementation of data augmentation methods for natural language processing tasks.☆13Jul 25, 2024Updated last year
- Adaptive Machine Translation with Large Language Models☆31Jan 4, 2025Updated last year
- Public Issue Only Repository for Issues and Distribution of Cohmetrix Core Desktop☆10Nov 18, 2024Updated last year
- Automatic Idiomatic Expression Detection☆13Sep 26, 2021Updated 4 years ago
- Extract Chinese/English QA Data from WikiHow pages.☆16May 21, 2023Updated 3 years ago
- Linguistically analyzed Classical Tibetan texts☆29Jun 30, 2021Updated 4 years ago
- Neural CRF Model for Sentence Alignment in Text Simplification☆67Jan 19, 2025Updated last year
- An open source Translation Memory Engine written in Java☆16Dec 22, 2022Updated 3 years ago
- Implementation of our paper "Exploiting Unsupervised Data for Emotion Recognition in Conversations" in the Findings of EMNLP-2020.☆13Nov 17, 2020Updated 5 years ago
- Reasoning-based Evaluation and Ranking of Translations.☆20Jul 18, 2025Updated 10 months ago
- official code for "EgoVSR: Towards High-Quality Egocentric Video Super-Resolution"☆15Jul 26, 2023Updated 2 years ago
- We release a dataset based on Wikipedia sentences and the corresponding translations in 6 different languages along with the scores (scal…☆81Aug 31, 2021Updated 4 years ago
- Neural Network approaches for the Traveling Salesman Problem☆10Apr 20, 2021Updated 5 years ago
- Re-rank n-best lists using additional features.☆29Jun 5, 2018Updated 7 years ago
- An Interactive Tool for Annotating Discourse Structure and Text Improvement☆16Sep 15, 2021Updated 4 years ago
- TER-plus Machine Translation metric.☆31May 23, 2022Updated 4 years ago
- ☆25May 27, 2021Updated 4 years ago
- ☆25May 11, 2024Updated 2 years ago
- This repository will soon contain all scripts and links to the annotated corpora of Tibetan.☆14Feb 4, 2025Updated last year
- pyTorch-text-classification☆16Dec 6, 2017Updated 8 years ago
- Code for the INTERSPEECH 2023 paper "Learning When to Speak: Latency and Quality Trade-offs for Simultaneous Speech-to-Speech Translation…☆32Jan 14, 2025Updated last year
- Code and data for the IWSLT 2022 shared task on Formality Control for SLT☆22May 24, 2023Updated 3 years ago
- Improving Word Translation via Two-Stage Contrastive Learning (ACL 2022). Keywords: Bilingual Lexicon Induction, Word Translation, Cross-…☆36Jan 23, 2025Updated last year
- Code for ACL 2023 main conference paper "Back Translation for Speech-to-text Translation Without Transcripts".☆11Oct 25, 2023Updated 2 years ago
- AMR-Visualization Tools, show AMR graph structure☆12Jul 29, 2019Updated 6 years ago
- Evaluation of Sentence Representations in Polish☆23Dec 29, 2022Updated 3 years ago