huggingface / dedupe_estimator
Chunk Dedupe Estimation
☆20 Updated last year
Alternatives and similar repositories for dedupe_estimator
Users who are interested in dedupe_estimator are comparing it to the libraries listed below.
- Rust crates for XetHub ☆75 Updated last year
- Radio is a DuckDB extension by Query.Farm that brings real-time event streams into your SQL workflows. It enables DuckDB to receive and s… ☆35 Updated last month
- Smart reproducible analytical pipeline inspection ☆21 Updated this week
- ☆12 Updated last year
- Your buddy in the (L)LM space. ☆64 Updated last year
- This repository is designed for deploying and managing server processes that handle embeddings using the Infinity Embedding model or Larg… ☆26 Updated 10 months ago
- A Pub/Sub for Tables based data integration platform, to discover, publish, modify and consume data effortlessly. ☆38 Updated last month
- Python SDK for XetHub ☆60 Updated last year
- tsellm: LLMs in SQLite and DuckDB ☆25 Updated 8 months ago
- Vector Database with support for late interaction and token level embeddings. ☆54 Updated 6 months ago
- ☆11 Updated 2 years ago
- Chrome Extension for exploring Hugging Face datasets 🔎 ☆48 Updated last year
- Code and data for the Walert large language model-based chatbot ☆12 Updated 4 months ago
- Granite 3.1 Language Models ☆136 Updated 6 months ago
- Modular, open source LLMOps stack that separates concerns: LiteLLM unifies LLM APIs, manages routing and cost controls, and ensures high-… ☆129 Updated 10 months ago
- Embedding models from Jina AI ☆65 Updated last year
- Tooling for exact and MinHash deduplication of large-scale text datasets ☆51 Updated this week
- Datasette enrichment for analyzing row data using OpenAI's GPT models ☆21 Updated last year
- Visualize expert firing frequencies across sentences in the Mixtral MoE model ☆18 Updated 2 years ago
- Transformer GPU VRAM estimator ☆67 Updated last year
- ☆21 Updated last year
- Sample code to accompany blog post showcasing Arrow Flight SQL running on DuckDB ☆36 Updated 3 years ago
- IBM development fork of https://github.com/huggingface/text-generation-inference ☆62 Updated 3 months ago
- ☆12 Updated last year
- ☆20 Updated 7 months ago
- Efficient BM25 with DuckDB 🦆 ☆59 Updated last year
- This is an opensource project allowing you to compare two LLM's head to head with a given prompt, it has a wide range of supported models… ☆25 Updated 9 months ago
- See how HTTPX, Requests, and AIOHTTP libraries compare for sending network requests and find out which one may fit your case better. ☆20 Updated 3 months ago
- Unleash the full potential of exascale LLMs on consumer-class GPUs, proven by extensive benchmarks, with no long-term adjustments and min… ☆26 Updated last year
- ColBERT for live vector indexes ☆28 Updated last year