optimyze / simple_simhash

A pure ANSI C implementation that computes a SimHash over the 4-byte tuples of a byte stream, taking tuple multiplicities into account. It is simple and reasonably fast, and it performs no dynamic memory allocation; the only working memory is a fixed amount of stack. A counting Bloom filter tracks multiplicities while keeping memory consumption constant.
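To make the description concrete, here is a minimal sketch of the general technique, not the repository's actual code. Every name in it (`simhash_sketch`, `mix64`, `CBF_SIZE`, `CBF_HASHES`) is hypothetical, as are the filter size, the number of hash functions, the little-endian tuple packing, and the splitmix64-style mixer standing in for the real hash. It also uses C99's `<stdint.h>` for brevity, which a strictly ANSI C codebase would avoid.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define CBF_SIZE   4096u  /* assumed number of 8-bit counters */
#define CBF_HASHES 3u     /* assumed number of hash functions */

/* Seeded 64-bit mixer (splitmix64-style); a stand-in, not the repo's hash. */
static uint64_t mix64(uint64_t x, uint64_t seed)
{
    x += (seed + 1u) * 0x9e3779b97f4a7c15ULL;
    x = (x ^ (x >> 30)) * 0xbf58476d1ce4e5b9ULL;
    x = (x ^ (x >> 27)) * 0x94d049bb133111ebULL;
    return x ^ (x >> 31);
}

/* Hypothetical entry point: 64-bit SimHash over all overlapping 4-byte
 * tuples of data[0..len). Returns 0 when len < 4 (no tuples). */
uint64_t simhash_sketch(const unsigned char *data, size_t len)
{
    uint8_t  counters[CBF_SIZE];  /* counting Bloom filter, on the stack */
    int32_t  votes[64];           /* per-bit vote balance */
    size_t   slots[CBF_HASHES];
    uint64_t result = 0;
    size_t   i;
    unsigned h, b;

    memset(counters, 0, sizeof counters);
    memset(votes, 0, sizeof votes);

    for (i = 0; i + 4 <= len; i++) {
        uint32_t tuple = (uint32_t)data[i]
                       | ((uint32_t)data[i + 1] << 8)
                       | ((uint32_t)data[i + 2] << 16)
                       | ((uint32_t)data[i + 3] << 24);
        uint8_t  seen = 255;
        uint64_t feature;

        /* Estimate how often this tuple occurred so far: the minimum of
         * its counters (may overestimate when slots collide). */
        for (h = 0; h < CBF_HASHES; h++) {
            slots[h] = (size_t)(mix64(tuple, h) % CBF_SIZE);
            if (counters[slots[h]] < seen)
                seen = counters[slots[h]];
        }
        /* Record this occurrence, saturating at 255. */
        for (h = 0; h < CBF_HASHES; h++)
            if (counters[slots[h]] < 255)
                counters[slots[h]]++;

        /* The feature is (tuple, occurrence index), so repeated tuples
         * contribute distinct features instead of collapsing into one. */
        feature = mix64(((uint64_t)seen << 32) | tuple, CBF_HASHES);

        for (b = 0; b < 64; b++)
            votes[b] += ((feature >> b) & 1u) ? 1 : -1;
    }

    /* Each output bit is the majority vote of that bit across features. */
    for (b = 0; b < 64; b++)
        if (votes[b] > 0)
            result |= (uint64_t)1 << b;
    return result;
}
```

Under these assumptions, two inputs can be compared by XOR-ing their hashes and counting the set bits; a smaller Hamming distance suggests more shared tuples. Memory use is fixed by `CBF_SIZE` and the 64 vote counters regardless of input length, which is the property the description emphasizes. The 32-bit vote counters assume inputs well below 2^31 tuples.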

Related projects

Alternatives and complementary repositories for simple_simhash