FastCDC implementation in Python https://pypi.org/project/fastcdc/
☆65 · Jun 27, 2024 · Updated 2 years ago
Alternatives and similar repositories for fastcdc-py
Users that are interested in fastcdc-py are comparing it to the libraries listed below.
- RapidCDC: Leveraging Duplicate Locality to Accelerate Chunking in CDC-based Deduplication Systems ☆17 · May 25, 2020 · Updated 6 years ago
- Fast and efficient content-defined chunking for data deduplication. Java implementation of FastCDC as library. ☆26 · Sep 21, 2023 · Updated 2 years ago
- An implementation of FastCDC in C ☆36 · Jun 27, 2022 · Updated 4 years ago
- Some paper lists related to storage systems ☆51 · Apr 20, 2026 · Updated 2 months ago
- Fast multi-threaded content-dependent chunking deduplication for Buffers in C++ with a reference implementation in Javascript. Ships with… ☆75 · Mar 1, 2020 · Updated 6 years ago
- An experimental platform for chunk-level data deduplication. Key words: DDFS, Sparse Index, Extreme Binning, SiLo, Sample Index, BLC; CBR… ☆168 · Apr 17, 2016 · Updated 10 years ago
- Multiple ways of chunking for data deduplication: Fixed size chunking, Content defined chunking, and File based chunking. ☆19 · Dec 20, 2013 · Updated 12 years ago
- Get a list of deduped files on a ZFS filesystem ☆13 · Oct 14, 2020 · Updated 5 years ago
- ACM SoCC 2019, "Coupling Decentralized Key-Value Stores with Erasure Coding" ☆15 · May 22, 2021 · Updated 5 years ago
- small fastcdc implementation in c99 ☆18 · Dec 31, 2022 · Updated 3 years ago
- [MSST '24] SAS-Cache: A Semantic-Aware Secondary Cache for LSM-based Key-Value Stores ☆14 · Jun 3, 2024 · Updated 2 years ago
- A merged read deduplication tool capable of performing merged read deduplication on single-end data. ☆14 · Sep 4, 2024 · Updated last year
- Find near-duplicate documents using minhashing implemented in Go. ☆16 · Dec 22, 2015 · Updated 10 years ago
- Remove duplicate documents/videos/images via popular algorithms such as SimHash, SpotSig, Shingling, etc. ☆18 · Aug 28, 2023 · Updated 2 years ago
- ☆16 · Jun 11, 2023 · Updated 3 years ago
- A Golang package that implements CDC chunkers with a generic interface ☆127 · Jun 14, 2026 · Updated 2 weeks ago
- Deduplication for cfDNA sequencing data ☆11 · Jul 5, 2017 · Updated 8 years ago
- A Python tool to search for and remove duplicated files in messy datasets ☆15 · Dec 23, 2024 · Updated last year
- The cookbook that teaches nerds how to cook ☆12 · Jan 14, 2014 · Updated 12 years ago
- Fast duplicate file detection library ☆26 · Jan 5, 2017 · Updated 9 years ago
- Rabin fingerprinting and deduplication library in C ☆28 · Feb 16, 2016 · Updated 10 years ago
- Original Joy ☆11 · Dec 17, 2024 · Updated last year
- Firejail wrapper for Nim. Isolate your production app before it's too late! ☆25 · Jun 6, 2023 · Updated 3 years ago
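- Document deduplication for solving the problem of semantically duplicate documents in search engines, using a semantic fingerprint algorithm based on multiple hashing. ☆11 · Aug 17, 2013 · Updated 12 years ago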
- A simple example of how to use Leex and Yecc ☆13 · Jan 19, 2016 · Updated 10 years ago
- ☆12 · Jan 12, 2024 · Updated 2 years ago
- Find duplicate text files. ☆14 · Jan 14, 2025 · Updated last year
- Pile Deduplication Code ☆18 · May 15, 2023 · Updated 3 years ago
- Benchmarking Bloom, Cuckoo, Morton, and PD based filter. ☆15 · Mar 19, 2022 · Updated 4 years ago
- 🕹️ Group and deduplicate concurrent tasks ☆31 · May 15, 2026 · Updated last month
- Implementation of some rolling hashes in go ☆68 · Updated this week
- Rabin hashing and content-defined chunking for Go ☆20 · Sep 11, 2017 · Updated 8 years ago
- Tcl/Tk based Git history browser ☆22 · Jun 23, 2026 · Updated last week
- An algorithm used to classify cell types from single-cell RNA sequence data. ☆10 · Jan 24, 2021 · Updated 5 years ago
- Switch between git worktrees with speed. ☆15 · Jun 22, 2026 · Updated last week
- JotFS, a content-defined deduplicating file store ☆21 · Feb 25, 2023 · Updated 3 years ago
- ☆16 · Aug 9, 2025 · Updated 10 months ago
- Tutorials For Data Analysis in Julia ☆11 · Dec 6, 2024 · Updated last year
- Latest PASTE (NSDI'18) repository ☆13 · May 2, 2022 · Updated 4 years ago