jxmorris12 / bm25_pt
minimal pytorch implementation of bm25 (with sparse tensors)
☆90 Updated 8 months ago
Related projects
Alternatives and complementary repositories for bm25_pt
- ☆66 Updated this week
- Simple replication of [ColBERT-v1](https://arxiv.org/abs/2004.12832). ☆77 Updated 8 months ago
- code for training & evaluating Contextual Document Embedding models ☆119 Updated this week
- ☆112 Updated this week
- XTR: Rethinking the Role of Token Retrieval in Multi-Vector Retrieval ☆37 Updated 5 months ago
- experiments with inference on llama ☆105 Updated 5 months ago
- [Data + code] ExpertQA: Expert-Curated Questions and Attributed Answers ☆122 Updated 8 months ago
- Repo for training MLMs, CLMs, or T5-type models on the OLM pretraining data, but it should work with any Hugging Face text dataset. ☆92 Updated last year
- ☆38 Updated 7 months ago
- Generalist and Lightweight Model for Text Classification ☆51 Updated last week
- Late Interaction Models Training & Retrieval ☆166 Updated this week
- Track OpenAI compatible requests to a dataset ☆57 Updated this week
- The official code repo for "Sub-Sentence Encoder: Contrastive Learning of Propositional Semantic Representations". ☆75 Updated 10 months ago
- Supercharge huggingface transformers with model parallelism. ☆75 Updated last month
- Codebase accompanying the Summary of a Haystack paper. ☆72 Updated 2 months ago
- ☆45 Updated 2 years ago
- 📝 Reference-Free automatic summarization evaluation with potential hallucination detection ☆98 Updated 10 months ago
- [EMNLP 2024] A Retrieval Benchmark for Scientific Literature Search ☆61 Updated 4 months ago
- ☆26 Updated 4 months ago
- NLP with Rust for Python 🦀🐍 ☆59 Updated 5 months ago
- Manage scalable open LLM inference endpoints in Slurm clusters ☆238 Updated 4 months ago
- Lightweight demos for finetuning LLMs. Powered by 🤗 transformers and open-source datasets. ☆64 Updated last month
- CLIR version of ColBERT ☆65 Updated last month
- Truly flash T5 realization! ☆54 Updated 6 months ago
- Repository containing the SPIN experiments on the DIBT 10k ranked prompts ☆23 Updated 8 months ago
- Code for NeurIPS LLM Efficiency Challenge ☆54 Updated 7 months ago
- ☆95 Updated last year
- SWIM-IR is a Synthetic Wikipedia-based Multilingual Information Retrieval training set with 28 million query-passage pairs spanning 33 la… ☆44 Updated last year
- Our open source implementation of MiniLMv2 (https://aclanthology.org/2021.findings-acl.188) ☆60 Updated last year