rom1504 / awesome-semantic-search
Semantic search with embeddings: index anything
☆140 · Updated 3 years ago
Alternatives and similar repositories for awesome-semantic-search:
Users that are interested in awesome-semantic-search are comparing it to the libraries listed below
- Repo for training MLMs, CLMs, or T5-type models on the OLM pretraining data, but it should work with any hugging face text dataset. ☆93 · Updated 2 years ago
- [WIP] A 🔥 interface for running code in the cloud ☆86 · Updated last year
- Scripts to convert datasets from various sources to Hugging Face Datasets. ☆58 · Updated 2 years ago
- 🤗 Disaggregators: Curated data labelers for in-depth analysis. ☆65 · Updated 2 years ago
- Seahorse is a dataset for multilingual, multi-faceted summarization evaluation. It consists of 96K summaries with human ratings along 6 q… ☆86 · Updated 11 months ago
- ☆90 · Updated 8 months ago
- Completion After Prompt Probability. Make your LLM make a choice ☆74 · Updated 3 months ago
- Alternate Implementation for Zero Shot Text Classification: Instead of reframing NLI/XNLI, this reframes the text backbone of CLIP models… ☆37 · Updated 2 years ago
- Few-shot Named Entity Recognition ☆123 · Updated 2 years ago
- Efficient few-shot learning with cross-encoders. ☆48 · Updated last year
- ☆42 · Updated 2 years ago
- 💫 SpaCy wrapper for ConceptNet 💫 ☆89 · Updated last year
- Our open source implementation of MiniLMv2 (https://aclanthology.org/2021.findings-acl.188) ☆60 · Updated last year
- Semantic search through a vectorized Wikipedia (SentenceBERT) with the Weaviate vector search engine ☆242 · Updated last year
- A curated list of awesome resources related to Semantic Search and Semantic Similarity tasks. ☆349 · Updated last year
- Tutorial to pretrain & fine-tune a 🤗 Flax T5 model on a TPUv3-8 with GCP ☆58 · Updated 2 years ago
- Simply, faster, sentence-transformers ☆141 · Updated 5 months ago
- Reimplementation of the task generation part from the Alpaca paper ☆119 · Updated last year
- Pipeline for pulling and processing online language model pretraining data from the web ☆175 · Updated last year
- Neural information retrieval / Semantic search / Bi-encoders ☆169 · Updated last year
- RaKUn 2.0 - A fast keyword detection algorithm ☆65 · Updated this week
- Build Semantic Search with S-BERT and Fine-tune your model in unsupervised way ☆58 · Updated 2 years ago
- A Streamlit application to visualize sentence embeddings ☆19 · Updated 2 years ago
- Repository containing awesome resources regarding Hugging Face tooling. ☆46 · Updated last year
- ☆20 · Updated 3 years ago
- ☆31 · Updated last year
- A Python wrapper around HuggingFace's TGI (text-generation-inference) and TEI (text-embedding-inference) servers. ☆34 · Updated 2 months ago
- A library to synthesize text datasets using Large Language Models (LLM) ☆151 · Updated 2 years ago
- Lightweight demos for finetuning LLMs. Powered by 🤗 transformers and open-source datasets. ☆67 · Updated 4 months ago
- [EMNLP 2023 Demo] fabricator - annotating and generating datasets with large language models. ☆104 · Updated 9 months ago