Showcases how mxbai-embed-large-v1 can be used to produce binary embeddings. Binary embeddings enable 32x storage savings and 40x faster retrieval.
☆19 · Mar 23, 2024 · Updated 2 years ago
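The 32x storage figure follows from replacing each 32-bit float dimension with a single bit. The sketch below is not taken from the repository; it is a minimal illustration that assumes 1024-dimensional vectors, sign thresholding at zero, and Hamming distance for comparison. The random vectors stand in for real model output, and the names `binarize` and `hamming_distances` are hypothetical.

```python
import numpy as np

def binarize(embeddings: np.ndarray) -> np.ndarray:
    """Threshold each dimension at 0, then pack 8 bits into each byte."""
    bits = (embeddings > 0).astype(np.uint8)
    return np.packbits(bits, axis=-1)

def hamming_distances(query: np.ndarray, corpus: np.ndarray) -> np.ndarray:
    """Count the differing bits between one packed query and each packed corpus row."""
    xor = np.bitwise_xor(corpus, query)
    return np.unpackbits(xor, axis=-1).sum(axis=-1)

# Stand-in data: 4 random float32 vectors (assumed dimension: 1024).
rng = np.random.default_rng(0)
floats = rng.standard_normal((4, 1024)).astype(np.float32)

packed = binarize(floats)                       # shape (4, 128), dtype uint8
storage_ratio = floats.nbytes / packed.nbytes   # 16384 bytes / 512 bytes
distances = hamming_distances(packed[0], packed)
```

Here `storage_ratio` works out to 32, matching the storage claim. The retrieval speedup depends on hardware and index implementation, so this sketch does not attempt to reproduce the 40x figure.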
Alternatives and similar repositories for binary-embeddings
Users that are interested in binary-embeddings are comparing it to the libraries listed below.
- ☆14 · Jun 25, 2024 · Updated last year
- WIP: Ofen is a toolkit aimed at making transformer models production-ready. API included ☆17 · Oct 2, 2024 · Updated last year
- Crispy reranking models by Mixedbread ☆50 · Sep 17, 2025 · Updated 6 months ago
- Official code for "Binary embedding based retrieval at Tencent" ☆44 · Mar 7, 2024 · Updated 2 years ago
- The Batched API provides a flexible and efficient way to process multiple requests in a batch, with a primary focus on dynamic batching o… ☆160 · Jul 14, 2025 · Updated 8 months ago
- Simple tool for generating tokens with open source transformers and/or calculating per-token surprisal. ☆14 · Updated this week
- This is a fork of the original fairseq repository (version 0.12.2) with added classes for training mHuBERT-147. ☆21 · Nov 19, 2024 · Updated last year
- Leveraging LLMs for Post-OCR Correction of Historical Newspapers ☆15 · Jun 20, 2024 · Updated last year
- Chunk your text using gpt4o-mini more accurately ☆44 · Aug 3, 2024 · Updated last year
- Test-Time Memory Framework: Control Hallucinations in Foundation Models ☆11 · Nov 4, 2025 · Updated 5 months ago
- ☆10 · Oct 2, 2024 · Updated last year
- Yet another dependency parser, integrated with tokenizer, tagger and visualization tool. ☆11 · Mar 18, 2018 · Updated 8 years ago
- Examples in the MLX framework ☆11 · Sep 23, 2024 · Updated last year
- This is the official implementation of the paper: "Contrastive Learning of Sentence Embeddings from Scratch" ☆40 · Jun 9, 2023 · Updated 2 years ago
- PyTorch implementation for HyperMixing, a linear-time token-mixing technique used in HyperMixer architecture ☆26 · Jun 12, 2023 · Updated 2 years ago
- ComfyUI-Direct3D‑S2 is now available in ComfyUI, Direct3D‑S2 - Gigascale 3D Generation Made Easy with Spatial Sparse Attention. Direct3D‑… ☆17 · Jun 10, 2025 · Updated 10 months ago
- ☆13 · Mar 30, 2026 · Updated last week
- resources, links for OCR & greek ☆10 · Mar 8, 2021 · Updated 5 years ago
- Simple, efficient and cross-platform TFIDF-based text summarizer in Rust ☆13 · Apr 12, 2024 · Updated 2 years ago
- Neural Solr = Solr 9 + Mighty Inference + Node ☆18 · Jun 9, 2022 · Updated 3 years ago
- AfroLID, a powerful neural toolkit for African languages identification which covers 517 African languages. ☆38 · Feb 5, 2026 · Updated 2 months ago
- A tool for extracting plain text from Wikipedia dumps ☆15 · Sep 13, 2018 · Updated 7 years ago
- Python collections that are backended by sqlite3 DB and are compatible with the built-in collections ☆13 · Jan 26, 2023 · Updated 3 years ago
- Chrome Extension for exploring Hugging Face datasets 🔎 ☆48 · Sep 18, 2024 · Updated last year
- ☆12 · Apr 29, 2024 · Updated last year
- Nearest neighbor search algorithms including a ball tree and a vantage point tree. ☆12 · Dec 2, 2025 · Updated 4 months ago
- A Keras-based and TensorFlow-backend NLP Models Toolkit. ☆12 · Jul 7, 2022 · Updated 3 years ago
- BERT Probe: A python package for probing attention based robustness to character and word based adversarial evaluation. Also, with recipe… ☆18 · Jun 24, 2022 · Updated 3 years ago
- Minimalistic REST API for wake-on-lan ☆11 · Nov 1, 2017 · Updated 8 years ago
- ☆18 · Dec 22, 2025 · Updated 3 months ago
- Revamped: Hugo+LoveIt ☆10 · Mar 14, 2026 · Updated 3 weeks ago
- Code for paper "ProgGen: Generating Named Entity Recognition Datasets Step-by-step with Self-Reflexive Large Language Models" ☆17 · Mar 29, 2024 · Updated 2 years ago
- Transfer.sh command-line program. Now file sharing from the command line is easy. ☆13 · Feb 28, 2023 · Updated 3 years ago
- Immutable development environments for PyTorch powered by Visual Studio Code Dev Containers ☆11 · Feb 15, 2023 · Updated 3 years ago
- ☆11 · Apr 25, 2021 · Updated 4 years ago
- ☆11 · Nov 10, 2020 · Updated 5 years ago
- A set of visualization engines. ☆14 · Updated this week
- This fully reconfigurable action validates conformity with Azure Developer CLI template standards. ☆21 · Updated this week
- Scout - command-line tool for command-not-found operations ☆13 · Mar 8, 2026 · Updated last month