UniSim is a package for efficient similarity computation, fuzzy matching, and clustering of data.
☆147Apr 4, 2025Updated last year
Alternatives and similar repositories for unisim
Users that are interested in unisim are comparing it to the libraries listed below.
- Nearly Inference Free Embeddings: make your RAG queries 500x faster☆77Apr 27, 2026Updated 3 weeks ago
- ☆13Feb 22, 2024Updated 2 years ago
- Nadir: Cutting-edge PyTorch optimizers for simplicity & composability! 🔥🚀💻☆14Jun 15, 2024Updated last year
- Monitor data sources and track changes over time 🐿️☆11Nov 7, 2024Updated last year
- Prototype record matching database.☆26Updated this week
- Fast Open-Source Search & Clustering engine × for Vectors & Arbitrary Objects × in C++, C, Python, JavaScript, Rust, Java, Objective-C, S…☆4,110May 2, 2026Updated 3 weeks ago
- A maximum-strength name parser for record linkage.☆40Sep 3, 2025Updated 8 months ago
- ☆52Jun 14, 2024Updated last year
- Resources related to EMNLP 2021 paper "FAME: Feature-Based Adversarial Meta-Embeddings for Robust Input Representations"☆13Dec 14, 2021Updated 4 years ago
- ☆14Sep 18, 2024Updated last year
- Hugging Face RoBERTa with Flash Attention 2☆24Sep 14, 2025Updated 8 months ago
- SWIM-IR is a Synthetic Wikipedia-based Multilingual Information Retrieval training set with 28 million query-passage pairs spanning 33 la…☆49Nov 13, 2023Updated 2 years ago
- A tf.estimator version of GPT2☆27Jan 29, 2022Updated 4 years ago
- Efficient BM25 with DuckDB 🦆☆68Dec 20, 2024Updated last year
- If only std::set was a DBMS: collection of templated ACID in-memory exception-free thread-safe and concurrent containers in a header-only…☆44Oct 30, 2025Updated 6 months ago
- An open-source translation agent designed to enhance the quality of text translations by leveraging large language models☆25Mar 28, 2026Updated last month
- Pocket-Sized Multimodal AI for content understanding and generation across multilingual texts, images, and 🔜 video, up to 5x faster than…☆1,235Oct 30, 2025Updated 6 months ago
- Improving Text Embedding of Language Models Using Contrastive Fine-tuning☆63Aug 2, 2024Updated last year
- Lightweight Python wrapper around the DuckDB extension, httpserver (extension developed by @quackscience)☆17Sep 24, 2025Updated 8 months ago
- 收集优质的角色扮演聊天数据 | Collection of roleplay conversations of high quality☆15Dec 1, 2024Updated last year
- Example using echo conversational agent server☆15Aug 20, 2024Updated last year
- Pre-training Cross-modal Transformer for Audio-and-Language Representations☆39Apr 20, 2021Updated 5 years ago
- Hacker News Search and RAG built using Rust actix-web, minijinja, SolidJS, Vite, and Redis queues☆30Dec 11, 2024Updated last year
- Testing paligemma2 finetuning on reasoning dataset☆18Dec 28, 2024Updated last year
- ☆15Oct 15, 2019Updated 6 years ago
- ☆39Nov 7, 2024Updated last year
- Semantic Search demo featuring UForm, USearch, UCall, and StreamLit, to visualize and retrieve from image datasets, similar to "CLIP Retriev…☆53Dec 29, 2023Updated 2 years ago
- ☆11Apr 2, 2021Updated 5 years ago
- ☆12Mar 6, 2020Updated 6 years ago
- Hugging Face Inference Toolkit used to serve transformers, sentence-transformers, and diffusers models.☆93Apr 15, 2026Updated last month
- The official repo of our research work "Interactive Editing for Text Summarization".☆23Jun 3, 2023Updated 2 years ago
- InstructIR, a novel benchmark specifically designed to evaluate the instruction-following ability of information retrieval models. Our foc…☆32Jun 13, 2024Updated last year
- Creating Debian Packages from CRAN Sources☆12Jul 1, 2020Updated 5 years ago
- ☆14Dec 21, 2025Updated 5 months ago
- A fast implementation of T5/UL2 in PyTorch using Flash Attention☆115Oct 30, 2025Updated 6 months ago
- SIMD-accelerated distances, dot products, matrix ops, geospatial & geometric kernels for 16 numeric types — from 6-bit floats to 64-bit c…☆1,811May 10, 2026Updated 2 weeks ago
- Fast Multimodal Semantic Deduplication & Filtering☆926May 4, 2026Updated 3 weeks ago
- BlockRank makes LLMs efficient and scalable for RAG and in-context ranking☆44Dec 12, 2025Updated 5 months ago
- An End-to-End Evaluation Framework for Entity Resolution Systems☆37Dec 3, 2023Updated 2 years ago