Find duplicate text files.
☆14 · Jan 14, 2025 · Updated last year
Alternatives and similar repositories for dedup
Users that are interested in dedup are comparing it to the libraries listed below.
- A merged-read deduplication tool capable of performing merged-read deduplication on single-end data. ☆13 · Sep 4, 2024 · Updated last year
- Get a list of deduped files on a ZFS filesystem ☆13 · Oct 14, 2020 · Updated 5 years ago
- Code for extracting parallel corpora from pmindia ☆17 · Jan 28, 2020 · Updated 6 years ago
- Find near-duplicate documents using minhashing implemented in Go. ☆16 · Dec 22, 2015 · Updated 10 years ago
- Microsoft Translator Api wrapper ☆12 · Feb 12, 2019 · Updated 7 years ago
- Rabin hashing and content-defined chunking for Go ☆20 · Sep 11, 2017 · Updated 8 years ago
- A Windows program to view/examine XLIFF file contents. ☆14 · Sep 26, 2024 · Updated last year
- Python library and dashboard for hyperparameter search and model training for computer vision tasks based on PyTorch, Optuna, FiftyOne, D… ☆17 · Jul 14, 2023 · Updated 2 years ago
- ✨ Epris is a JavaScript library that simplifies interface development ☆26 · May 30, 2022 · Updated 3 years ago
- Cross-browser wrapper for window.console object. ☆14 · Apr 14, 2015 · Updated 11 years ago
- Batch for Excel 2013 SPREADSHEETCOMPARE tool ☆14 · Jun 25, 2014 · Updated 11 years ago
- PyTorch - Albert Large V2, Bert Base Uncased, Bert Large Uncased WWM Finetuned Squad, Distil Roberta Base, Roberta Base Squad2, Roberta l… ☆11 · Jul 10, 2020 · Updated 5 years ago
- ☆13 · Apr 16, 2022 · Updated 4 years ago
- ☆10 · Nov 22, 2023 · Updated 2 years ago
- Documentation for MX Linux ☆12 · Jan 17, 2026 · Updated 4 months ago
- Javascript free captcha written in PHP ☆11 · Mar 17, 2022 · Updated 4 years ago
- An OSINT tool to find data leaks on a targeted website ☆18 · Mar 30, 2021 · Updated 5 years ago
- The Resource Static Analysis enables companies and localization suppliers to quickly add scalable validation checks to help ensure qualit… ☆18 · Nov 28, 2022 · Updated 3 years ago
- ☆11 · Aug 25, 2021 · Updated 4 years ago
- ☆20 · Apr 24, 2023 · Updated 3 years ago
- My OpenCode and Oh-My-OpenCode configuration files with API proxy setup documentation ☆36 · Jan 5, 2026 · Updated 4 months ago
- Linux Programming Interface Kerrisk ☆12 · Jan 11, 2019 · Updated 7 years ago
- A Hindi Image Captioning system made completely with Transformers🤗 ☆10 · Apr 16, 2024 · Updated 2 years ago
- PyTorch Implementation of the paper - 'Generative Adversarial Text to Image Synthesis' from ICML 2016 https://arxiv.org/abs/1605.05396 ☆10 · May 23, 2021 · Updated 5 years ago
- Code and data for Teddy https://arxiv.org/abs/2001.05171. ☆15 · Jun 21, 2022 · Updated 3 years ago
- A cute POP3 server in PHP. Using libevent for Non blocking, event driven I/O ☆16 · Oct 29, 2014 · Updated 11 years ago
- Repository for our paper "AbuseAnalyzer: Abuse Detection, Severity and Target Prediction for Gab Posts" ☆11 · Jul 18, 2021 · Updated 4 years ago
- Reading list for multimodal sequence learning ☆14 · Sep 4, 2023 · Updated 2 years ago
- A minimal Bash framework and CLI tool that makes writing, sharing and using bash scripts easy ☆12 · Apr 23, 2026 · Updated last month
- Dataset for Paper "Exploring Content Selection in Summarization of Novel Chapters" ☆14 · Mar 20, 2023 · Updated 3 years ago
- Coursera Corpus Mining and Multistage Fine-Tuning for Improving Lectures Translation ☆15 · Aug 27, 2024 · Updated last year
- ☆10 · Oct 15, 2020 · Updated 5 years ago
- ☆10 · Dec 3, 2020 · Updated 5 years ago
- Question generation from text ☆15 · Sep 19, 2012 · Updated 13 years ago
- Super simple, zero config options, <2kb declarative tooltip library with no dependencies. ☆17 · Jun 2, 2023 · Updated 2 years ago
- This repository contains the implementation of the paper: "Span Classification with Structured Information for Disfluency Detection in Sp… ☆14 · Jun 6, 2023 · Updated 2 years ago
- Repository for our paper “Subverting the Jewtocracy”: Online Antisemitism Detection Using Multimodal Deep Learning ☆12 · Apr 29, 2022 · Updated 4 years ago
- Cross-lingual Fact-to-Text Alignment and Generation for Low-Resource Languages ☆11 · Jan 1, 2023 · Updated 3 years ago
- Miscellaneous R functions and aliases ☆10 · Mar 30, 2026 · Updated last month