Multilingual sentence alignment using sentence embeddings
☆149Nov 4, 2024Updated last year
Alternatives and similar repositories for bertalign
Users that are interested in bertalign are comparing it to the libraries listed below.
- Improved Sentence Alignment in Linear Time and Space☆194Mar 6, 2023Updated 3 years ago
- ☆37Mar 16, 2026Updated last month
- Machine-Translation-based sentence alignment tool for parallel text☆314Mar 18, 2021Updated 5 years ago
- A corpus of short answers written by learners of English and graded with CEFR levels☆12Dec 17, 2021Updated 4 years ago
- Open-source, AI-enhanced CAT tool with multi-LLM support, translation memory, glossary management, Superbench translation quality benchma…☆32Updated this week
- A fully-fledged PyTorch package for Morphological Analysis, tailored to morphologically rich and historical languages.☆24Oct 27, 2023Updated 2 years ago
- Sentence aligner☆127May 21, 2021Updated 4 years ago
- The implementation of "Mitigating Hallucinations and Off-target Machine Translation with Source-Contrastive and Language-Contrastive Deco…☆38Aug 29, 2025Updated 8 months ago
- [EMNLP 2020] Obtain Word Alignments using Pretrained Language Models (e.g., mBERT)☆394Nov 7, 2023Updated 2 years ago
- Bicleaner fork that uses neural networks☆40Feb 23, 2026Updated 2 months ago
- ☆11Aug 7, 2022Updated 3 years ago
- Official implementations for (1) BlonDe: An Automatic Evaluation Metric for Document-level Machine Translation and (2) Discourse Centric …☆84Sep 21, 2023Updated 2 years ago
- State-of-the-art LLM-based translation models.☆584Apr 9, 2025Updated last year
- ☆26Jul 30, 2024Updated last year
- Improving cross-lingual word embeddings by meeting in the middle☆23Aug 25, 2020Updated 5 years ago
- Adaptive Machine Translation with Large Language Models☆31Jan 4, 2025Updated last year
- Automatic Idiomatic Expression Detection☆13Sep 26, 2021Updated 4 years ago
- Linguistically analyzed Classical Tibetan texts☆29Jun 30, 2021Updated 4 years ago
- Automated listing of repos in GitHub with XML files containing teiHeader. Find a project using TEI today!☆17Updated this week
- Transcription corpora for training HTR models for medieval manuscripts from the 12th to the 15th century.☆25Jan 17, 2025Updated last year
- Improving Low-Resource Neural Machine Translation of Related Languages by Transfer Learning☆19Nov 3, 2022Updated 3 years ago
- Bias correction for richness in abundance data☆12Apr 20, 2026Updated 2 weeks ago
- Implementation of our paper "Exploiting Unsupervised Data for Emotion Recognition in Conversations" in the Findings of EMNLP-2020.☆13Nov 17, 2020Updated 5 years ago
- Reasoning-based Evaluation and Ranking of Translations.☆20Jul 18, 2025Updated 9 months ago
- Official code for "EgoVSR: Towards High-Quality Egocentric Video Super-Resolution"☆15Jul 26, 2023Updated 2 years ago
- Digital texts in Prakrit☆10Sep 14, 2025Updated 7 months ago
- An Interactive Tool for Annotating Discourse Structure and Text Improvement☆16Sep 15, 2021Updated 4 years ago
- TER-plus Machine Translation metric.☆31May 23, 2022Updated 3 years ago
- Text Re-use Alignment Visualization☆38Nov 8, 2017Updated 8 years ago
- This repository will soon contain all scripts and links to the annotated corpora of Tibetan.☆14Feb 4, 2025Updated last year
- pyTorch-text-classification☆16Dec 6, 2017Updated 8 years ago
- German Text Embedding Clustering Benchmark☆18Mar 15, 2024Updated 2 years ago
- Code and data for the IWSLT 2022 shared task on Formality Control for SLT☆22May 24, 2023Updated 2 years ago
- Improving Word Translation via Two-Stage Contrastive Learning (ACL 2022). Keywords: Bilingual Lexicon Induction, Word Translation, Cross-…☆36Jan 23, 2025Updated last year
- Code for ACL 2023 main conference paper "Back Translation for Speech-to-text Translation Without Transcripts".☆12Oct 25, 2023Updated 2 years ago
- Stable timestamps and confidence score for words of OpenAI's Whisper outputs down to word-level.☆24Dec 20, 2022Updated 3 years ago
- Evaluation of Sentence Representations in Polish☆23Dec 29, 2022Updated 3 years ago
- Data, metadata, tools, and LDA experiments on a corpus of Sanskrit philosophy texts☆12Nov 28, 2021Updated 4 years ago
- Official repository for U-SAM (Interspeech 2025)☆27Jun 3, 2025Updated 11 months ago