Find duplicate text files.
☆14Jan 14, 2025Updated last year
Alternatives and similar repositories for dedup
Users that are interested in dedup are comparing it to the libraries listed below.
- String deduplication package for Go☆19Jan 10, 2024Updated 2 years ago
- ☆13Apr 16, 2022Updated 4 years ago
- ☆10Nov 22, 2023Updated 2 years ago
- An OSINT tool to find data leaks on a targeted website☆17Mar 30, 2021Updated 5 years ago
- ☆12Mar 31, 2020Updated 6 years ago
- Experimental command suggestion system based on historical usage of commands in certain locations.☆12Feb 18, 2026Updated last month
- Bash script to create an ebook from a list of web articles. Inspired by the now-defunct Readlists.org by Readability☆18Oct 13, 2019Updated 6 years ago
- Crawling engine that crawls a set of top-level domains looking for documents in a list of languages☆11Feb 6, 2024Updated 2 years ago
- PyTorch Implementation of the paper - 'Generative Adversarial Text to Image Synthesis' from ICML 2016 https://arxiv.org/abs/1605.05396☆10May 23, 2021Updated 4 years ago
- Code and data for Teddy https://arxiv.org/abs/2001.05171.☆15Jun 21, 2022Updated 3 years ago
- Script to download all xkcd comics using web scraping.☆10Aug 27, 2021Updated 4 years ago
- Doing style transfer with linguistic features using OpenAI's CLIP.☆14May 4, 2021Updated 4 years ago
- Reading list for multimodal sequence learning☆14Sep 4, 2023Updated 2 years ago
- A minimal Bash framework and CLI tool that makes writing, sharing and using bash scripts easy☆12Apr 4, 2026Updated last week
- Quick selection widget for Markdown notes, inspired by terminal_velocity☆13Jul 2, 2020Updated 5 years ago
- Dataset for Paper "Exploring Content Selection in Summarization of Novel Chapters"☆14Mar 20, 2023Updated 3 years ago
- Coursera Corpus Mining and Multistage Fine-Tuning for Improving Lectures Translation☆15Aug 27, 2024Updated last year
- Provides syntax highlighting for Apptainer/Singularity definition files.☆10Dec 24, 2025Updated 3 months ago
- ☆10Oct 15, 2020Updated 5 years ago
- Question generation from text☆15Sep 19, 2012Updated 13 years ago
- ARCHIVED A high-performance database of shipment-level CITES trade data☆12May 11, 2023Updated 2 years ago
- Transformer based Trigram Blocking implementation in Tensorflow☆11Feb 26, 2020Updated 6 years ago
- Super simple, zero config options, <2kb declarative tooltip library with no dependencies.☆17Jun 2, 2023Updated 2 years ago
- This repository contains the implementation of the paper: "Span Classification with Structured Information for Disfluency Detection in Sp…☆15Jun 6, 2023Updated 2 years ago
- Repository for our paper “Subverting the Jewtocracy”: Online Antisemitism Detection Using Multimodal Deep Learning☆12Apr 29, 2022Updated 3 years ago
- Semi-hand curated command-line option data for many CLI programs, geared toward bioinformatics tools. Shell completion scripts are also a…☆18Feb 24, 2026Updated last month
- ☆12Jan 19, 2026Updated 2 months ago
- ☆14Feb 24, 2021Updated 5 years ago
- A WordPress plugin to receive movie/series information, including poster and trailer from IMDB.☆10May 21, 2017Updated 8 years ago
- A tool to find all duplicates in large sets of text documents.☆16Sep 29, 2021Updated 4 years ago
- personal diary☆14Updated this week
- personal devel version of grid☆13Apr 29, 2020Updated 5 years ago
- Recommendation System for Anime☆11Apr 15, 2024Updated 2 years ago
- A python library to generate highly realistic typos (fuzz-testing)☆13Mar 16, 2025Updated last year
- Repository for the Findings of ACL'23 paper Label Agnostic Pre-training for Zero-shot Text Classification☆12Aug 10, 2023Updated 2 years ago
- The platform behind Nous le peuple. Fork of Pligg.☆10Sep 29, 2015Updated 10 years ago
- A Pointer Generator with a BERT encoder☆10Aug 12, 2019Updated 6 years ago
- This package contains helpers to deal with physical variables and units.☆12Dec 22, 2022Updated 3 years ago
- XFCE4 GenMon plugin: Time tracking widget with TimeWarrior☆11Feb 4, 2024Updated 2 years ago