paracrawl/corset

Corset is a web-based data selection portal that helps you get relevant data from massive amounts of parallel data.

☆21

Alternatives and similar repositories for corset

Users who are interested in corset are comparing it to the repositories listed below.

mbanon / fastspell
Targeted language identifier, based on FastText and Hunspell.
☆38 · Sep 4, 2025 · Updated 10 months ago
poleval / 2021-punctuation-restoration
PolEval 2021 Task 1
☆15 · Jun 28, 2022 · Updated 4 years ago
summa-platform / summa-oss
Meta-repository for the open-source version of the SUMMA Platform
☆16 · Mar 25, 2024 · Updated 2 years ago
aoliverg / TBXTools
☆14 · Apr 13, 2026 · Updated 3 months ago
Roxot / AEVNMT
Auto-Encoding Variational Neural Machine Translation
☆16 · Jan 22, 2020 · Updated 6 years ago
myagues / flax_nerf
Unofficial implementation of NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis, using Flax with the Linen API
☆13 · Sep 25, 2021 · Updated 4 years ago
hplt-project / OpusCleaner
OpusCleaner is a web interface that helps you select, clean and schedule your data for training machine translation models.
☆58 · Feb 3, 2026 · Updated 5 months ago
ContinuumIO / dl-tutorial-2020-10
Deep Learning Tutorial taught on October 16, 2020
☆13 · Jun 25, 2026 · Updated last month
google / wmt19-paraphrased-references
☆15 · Nov 5, 2020 · Updated 5 years ago
psankar / korkai
A corpus builder for Tamil that analyzes WordPress, Blogger, and Wikipedia dumps
☆20 · Jul 12, 2020 · Updated 6 years ago
amittai / cynical
Cynical data selection
☆20 · Jan 16, 2021 · Updated 5 years ago
tushar2708 / conveyor
A Go pipeline management library, supporting concurrent pipelines, with multiple nodes and joints
☆15 · Jul 3, 2026 · Updated 3 weeks ago
elimisteve / v2go
V-to-Go translator
☆12 · Jul 21, 2019 · Updated 7 years ago
Helsinki-NLP / OpusTools
☆83 · Jun 24, 2026 · Updated last month
Prompsit / mutnmt
An educational tool to train, inspect, evaluate and translate using neural engines
☆20 · Mar 13, 2025 · Updated last year
katopz / awesome-wasm
WebAssembly FTW
☆11 · Feb 28, 2023 · Updated 3 years ago
NICTA / iris-reasoner
Clone of iris-reasoner (http://iris-reasoner.org) from SourceForge
☆11 · Mar 18, 2016 · Updated 10 years ago
roy-ht / pyter
☆27 · Jan 7, 2017 · Updated 9 years ago
rwth-i6 / sisyphus
A Workflow Manager in Python
☆50 · Updated this week
ngeor / rusty-basic
An interpreter for QBasic, written in Rust.
☆11 · Feb 28, 2026 · Updated 4 months ago
skmda37 / CartoonX
CartoonX is a saliency map method for image classifiers operating in the wavelet/shearlet domain.
☆10 · Feb 23, 2026 · Updated 5 months ago
ec-jrc / Patents4IPPC
☆28 · Mar 3, 2023 · Updated 3 years ago
DieracDelta / DAWN
DAWN (Debug Adapter with Nix)
☆17 · Jan 1, 2024 · Updated 2 years ago
bbonamin / strftimeslikethese
A Ruby WASM implementation, fully client-side, inspired by foragoodstrftime.com
☆14 · Nov 13, 2022 · Updated 3 years ago
graboluk / stiko
Systray icon for Syncthing
☆15 · Aug 10, 2018 · Updated 7 years ago
wmutschl / mutschler.dev
Tech blog
☆12 · Jan 14, 2026 · Updated 6 months ago
hercules-ci / canonix
Experiment in Nix formatting
☆23 · Oct 4, 2019 · Updated 6 years ago
bitextor / bicleaner
Bicleaner is a parallel corpus classifier/cleaner that aims at detecting noisy sentence pairs in a parallel corpus.
☆160 · Jun 18, 2024 · Updated 2 years ago
xemul / mosaic
Mosaic trees management tool and library
☆14 · Nov 17, 2015 · Updated 10 years ago
luooooob / create-my-awesome
GitHub Actions for automatically generating a personal awesome list from all of the repositories you starred.
☆16 · Mar 6, 2023 · Updated 3 years ago
weijia-xu / fairseq-editor
EDITOR: an Edit-Based Transformer with Repositioning for Neural Machine Translation with Soft Lexical Constraints
☆29 · Dec 21, 2021 · Updated 4 years ago
marian-nmt / marian-examples
Examples, tutorials and use cases for Marian, including our WMT-2017/18 baselines.
☆81 · Apr 8, 2023 · Updated 3 years ago
duyichao / NPDA-KNN-ST
Official implementation of EMNLP'2022 paper "Non-Parametric Domain Adaptation for End-to-End Speech Translation"
☆11 · Oct 26, 2022 · Updated 3 years ago
SquareBracketAssociates / Booklet-Smacc
A booklet on the SmaCC compiler-compiler framework
☆15 · Jun 6, 2026 · Updated last month
toxvox / BentoPianoRollEditor
☆10 · Apr 22, 2022 · Updated 4 years ago
xuchennlp / S2T
The project for speech translation
☆12 · Sep 28, 2023 · Updated 2 years ago
SuljicAmar / Regress.me
☆14 · Aug 14, 2022 · Updated 3 years ago
shirriff / GBC-audio-chip
Reverse-engineered schematics for the IR3R53 audio amplifier chip used in the Game Boy Color
☆14 · May 30, 2020 · Updated 6 years ago
rbawden / mt-bigscience
Evaluation results for Machine Translation within the BigScience project
☆11 · May 15, 2023 · Updated 3 years ago