sufenlp/AccAlign

An accurate multilingual word aligner based on LaBSE

☆24

Alternatives and similar repositories for AccAlign

Users that are interested in AccAlign are comparing it to the libraries listed below.

HKUNLP / multilingual-transfer
Code for paper "Language Versatilists vs. Specialists: An Empirical Revisiting on Multilingual Transfer Ability"
☆15 · Jun 13, 2023 · Updated 3 years ago
liuquncn / NanGeMT
NanGe - A Rule-based Chinese-English Machine Translation System
☆20 · Jul 23, 2017 · Updated 9 years ago
LAGoM-NLP / transtokenizer
☆57 · Dec 27, 2025 · Updated 6 months ago
G-Research / fast-string-search
☆13 · Apr 13, 2021 · Updated 5 years ago
neulab / awesome-align
A neural word aligner based on multilingual BERT
☆379 · Mar 10, 2022 · Updated 4 years ago
blester125 / string-distance
String Distance using cython
☆13 · Jan 19, 2020 · Updated 6 years ago
nttcslab-nlp / spanalign
SpanAlign: Sentence Alignment Method based on Cross-Language Span Prediction and ILP
☆15 · Mar 24, 2021 · Updated 5 years ago
bfsujason / bertalign
Multilingual sentence alignment using sentence embeddings
☆157 · May 4, 2026 · Updated 2 months ago
tigerchen52 / GLADIS
GLADIS: A General and Large Acronym Disambiguation Benchmark (EACL 23)
☆18 · Jun 24, 2024 · Updated 2 years ago
henchc / Rediscovering-Text-as-Data
L&S 88-5 Connector Course to Data 8
☆15 · Apr 12, 2018 · Updated 8 years ago
koaning / sentence-models
A different, but useful, textcat approach.
☆18 · Jul 15, 2024 · Updated 2 years ago
zouharvi / pearmut
Platform for Evaluating and Reviewing of Multilingual Tasks
☆32 · Updated this week
huhailinguist / ChineseNLIProbing
☆10 · Oct 17, 2021 · Updated 4 years ago
noe-eva / NOAH-Corpus
NOAH's Corpus: Part-of-Speech Tagging for Swiss German
☆12 · Jan 6, 2023 · Updated 3 years ago
cisnlp / MEXA
[ACL 2025] 🔍 Multilingual Evaluation of English-Centric LLMs via Cross-Lingual Alignment
☆11 · Apr 6, 2025 · Updated last year
xnliang98 / CKE-ZH
A centrality-based Chinese keyphrase extraction tool
☆11 · Sep 2, 2022 · Updated 3 years ago
ahmedssabir / Belief-Revision-Score
Belief Revision based Caption Re-ranker with Visual Semantic Information. COLING 2022
☆11 · Apr 13, 2025 · Updated last year
Language-Media-Lab / commonsense-moral-ja
☆15 · Nov 20, 2025 · Updated 8 months ago
berlino / btg-seq2seq
☆12 · Dec 13, 2022 · Updated 3 years ago
mentat-collective / Mafs.cljs
Reagent interface to the Mafs interactive 2d math visualization library.
☆15 · Jun 1, 2024 · Updated 2 years ago
lanjiuqing64 / KGdata
KG data for ODA
☆12 · May 14, 2026 · Updated 2 months ago
Unbabel / smaug
Python package to augment multilingual data
☆15 · Feb 15, 2023 · Updated 3 years ago
UNHSAILLab / TaCo
TaCo: Enhancing Cross-Lingual Transfer for Low-Resource Languages in LLMs through Translation-Assisted Chain-of-Thought Processes
☆14 · Jul 1, 2025 · Updated last year
Liebeck / IWNLP-py
Python port for IWNLP.Lemmatizer
☆19 · Apr 13, 2026 · Updated 3 months ago
PRImA-Research-Lab / prima-page-to-pdf
Java command line tool to convert PAGE XML files with layout and text content to PDF
☆10 · Apr 27, 2020 · Updated 6 years ago
ku-nlp / text-cleaning
A powerful text cleaner for Japanese web texts
☆12 · Jan 20, 2024 · Updated 2 years ago
Cyb3r-Jak3 / docker-workerd
Docker image for Cloudflare workerd
☆15 · Feb 11, 2023 · Updated 3 years ago
Rojak-NLP / LLM-Code-Mixing
Can LLMs generate code-mixed sentences through zero-shot prompting?
☆11 · Apr 18, 2023 · Updated 3 years ago
lilt / alignment-scripts
Scripts to preprocess training and test data and to run fast_align and giza
☆107 · Nov 2, 2021 · Updated 4 years ago
yoichi1484 / subspace
An implementation of "Subspace Representations for Soft Set Operations and Sentence Similarities" (NAACL 2024)
☆10 · May 31, 2024 · Updated 2 years ago
vadimkantorov / inferspeech
PyTorch speech2text inference script for the NVidia openseq2seq wav2letter model variant
☆10 · Aug 12, 2019 · Updated 6 years ago
dtuggener / CorZu
Coreference resolution for German
☆16 · Jun 26, 2017 · Updated 9 years ago
cisnlp / simalign
[EMNLP 2020] Obtain Word Alignments using Pretrained Language Models (e.g., mBERT)
☆398 · Nov 7, 2023 · Updated 2 years ago
dbamman / anlp23
Data and code to support "Applied Natural Language Processing" (INFO 256, Fall 2023, UC Berkeley)
☆17 · Nov 20, 2023 · Updated 2 years ago
gotutiyan / gec-metrics
A library for evaluation of Grammatical Error Correction (GEC). Accepted to ACL'25 Demo: "gec-metrics: A Unified Library for Grammatical …
☆14 · Jan 25, 2026 · Updated 5 months ago
totakke / libra
Benchmarking framework for Clojure
☆10 · Feb 27, 2019 · Updated 7 years ago
ssokota / mec
Code for minimum-entropy coupling.
☆33 · Jan 6, 2026 · Updated 6 months ago
alvations / NTU-MC
Nanyang Technological University - Multilingual Corpus (STB subcorpora)
☆12 · Mar 11, 2019 · Updated 7 years ago
SapienzaNLP / ita-bench
A collection of Italian benchmarks for LLM evaluation
☆37 · Jun 9, 2026 · Updated last month