huggingface / dedupe_estimator
Chunk Dedupe Estimation
☆20 Updated 11 months ago
Alternatives and similar repositories for dedupe_estimator
Users who are interested in dedupe_estimator are comparing it to the libraries listed below.
- Rust crates for XetHub ☆70 Updated last year
- Smart reproducible analytical pipeline inspection ☆19 Updated 3 weeks ago
- This repository is designed for deploying and managing server processes that handle embeddings using the Infinity Embedding model or Larg… ☆23 Updated 7 months ago
- ☆12 Updated last year
- Datasette enrichment for analyzing row data using OpenAI's GPT models ☆21 Updated last year
- Your buddy in the (L)LM space. ☆64 Updated last year
- ☆12 Updated last year
- Python SDK for XetHub ☆59 Updated last year
- tsellm: LLMs in SQLite and DuckDB ☆24 Updated 6 months ago
- ColBERT for live vector indexes ☆28 Updated last year
- Adding Marimo to Datasette ☆20 Updated 7 months ago
- Neural Solr = Solr 9 + Mighty Inference + Node ☆18 Updated 3 years ago
- Git scrapers for scraping the fediverse ☆16 Updated this week
- LLM application tracing based on OpenTelemetry ☆15 Updated 3 weeks ago
- First token cutoff sampling inference example ☆30 Updated last year
- Embedding models from Jina AI ☆65 Updated last year
- Voyage AI Official Python Library ☆80 Updated last month
- Datasette plugin for searching all searchable tables at once ☆26 Updated last year
- Tools for various benchmarking scenarios ☆32 Updated last week
- Radio is a DuckDB extension by Query.Farm that brings real-time event streams into your SQL workflows. It enables DuckDB to receive and s… ☆32 Updated 3 weeks ago
- Run models distributed as GGUF files using LLM ☆78 Updated 11 months ago
- ☆40 Updated this week
- See how HTTPX, Requests, and AIOHTTP libraries compare for sending network requests and find out which one may fit your case better. ☆19 Updated last month
- A discovery and compression tool for your Java codebase. Creates a knowledge graph for a LLM context window, efficiently outlining your p… ☆26 Updated 11 months ago
- Unleash the full potential of exascale LLMs on consumer-class GPUs, proven by extensive benchmarks, with no long-term adjustments and min… ☆25 Updated 11 months ago
- Vector Database with support for late interaction and token level embeddings. ☆55 Updated 4 months ago
- A Pub/Sub for Tables based data integration platform, to discover, publish, modify and consume data effortlessly. ☆36 Updated 2 weeks ago
- Plugin for LLM adding a Markov chain generating model ☆19 Updated last year
- Tree-based indexes for neural-search ☆32 Updated last year
- ☆20 Updated last year