google-research / retvec
RETVec is an efficient, multilingual, and adversarially-robust text vectorizer.
☆292Updated 7 months ago
Alternatives and similar repositories for retvec
Users who are interested in retvec are comparing it to the libraries listed below.
- UniSim is a package for efficient similarity computation, fuzzy matching, and clustering of data.☆142Updated 7 months ago
- Your buddy in the (L)LM space.☆64Updated last year
- BlindBox is a tool to isolate and deploy applications inside Trusted Execution Environments for privacy-by-design apps☆62Updated last year
- The Foundation Model Transparency Index☆83Updated last year
- The world's largest social media toxicity dataset.☆187Updated 3 years ago
- Managing the lifecycle of machine learning to support scalability, impact, collaboration, compliance and sharing.☆88Updated this week
- Creating the tools and data sets necessary to evaluate vulnerabilities in LLMs.☆26Updated 7 months ago
- ☆116Updated 9 months ago
- Source code for Mozilla.ai's Lumigator platform☆266Updated this week
- Statistics of Common Crawl monthly archives mined from URL index files☆195Updated this week
- Neural Search☆363Updated 7 months ago
- Lightweight Nearest Neighbors with Flexible Backends☆312Updated last month
- LLM for Email Spam Detection☆109Updated 2 years ago
- ☆710Updated 2 months ago
- GPU-Powered Topic Modelling☆69Updated 2 years ago
- Full text search that feels like a numpy array☆264Updated 3 weeks ago
- Fast Text Classification with Compressors dictionary☆149Updated 2 years ago
- A high-throughput and memory-efficient inference and serving engine for LLMs☆52Updated last year
- GPT Takes the Bar Exam☆142Updated 2 years ago
- Bayesian Optimization as a Coverage Tool for Evaluating LLMs. Accurate evaluation (benchmarking) that's 10 times faster with just a few l…☆285Updated last month
- ☆337Updated last year
- Web-scale retrieval for knowledge-intensive NLP☆555Updated 2 years ago
- Common crawl extractor☆80Updated last year
- Definition for Open Weights Licensing☆144Updated last year
- The AI Knowledge Editor☆185Updated 3 years ago
- Baguetter is a flexible, efficient, and hackable search engine library implemented in Python. It's designed for quickly benchmarking, imp…☆190Updated last year
- Official implementation of "WhisperNER: Unified Open Named Entity and Speech Recognition"☆196Updated 8 months ago
- A fully user-side image search engine. Accepted to CIKM 2022 demo track.☆251Updated 3 years ago
- Extend the original llama.cpp repo to support redpajama model.☆118Updated last year
- Efficient vector database for hundred millions of embeddings.☆208Updated last year