sigridjineth/crisp-py


sigridjineth / crisp-py

The Python Implementation of CRISP: Clustering Multi-Vector Representations for Denoising and Pruning

☆27

Alternatives and similar repositories for crisp-py

Users who are interested in crisp-py are comparing it to the libraries listed below.


rasyosef / splade-index
Fast search index for SPLADE sparse retrieval models implemented in Python using NumPy and Numba
☆38 · Oct 16, 2025 · Updated 8 months ago
Hugging-Face-KREW / Ko-AgentBench
☆65 · Feb 6, 2026 · Updated 5 months ago
instructkr / reranker-simple-benchmark
Make running benchmark simple yet maintainable, again. Now only supports Korean-based cross-encoder.
☆35 · Dec 2, 2025 · Updated 7 months ago
google-research-datasets / swim-ir
SWIM-IR is a Synthetic Wikipedia-based Multilingual Information Retrieval training set with 28 million query-passage pairs spanning 33 la…
☆50 · Nov 13, 2023 · Updated 2 years ago
kyopark2014 / llm-agent
It shows how to deploy and use an agent with an LLM.
☆19 · Mar 1, 2025 · Updated last year
JacobHuang91 / prompt-refiner
🚀 Lightweight Python library for building production LLM applications with smart context management and automatic token optimization. Sa…
☆37 · Apr 12, 2026 · Updated 2 months ago
willmil11 / cleanai-c
Cleanai (https://github.com/willmil11/cleanai) except I'm making it in C now. Fast and clean from the start this time :)
☆15 · Jun 16, 2026 · Updated 2 weeks ago
PBDESG / nnViewer
☆10 · Jan 23, 2025 · Updated last year
Zerohertz / Instruct_KR_2025_Summer_Meetup_vLLM
🎹 Instruct.KR 2025 Summer Meetup: Open-Source LLMs, All the Way to Production with vLLM 🎹
☆23 · Aug 2, 2025 · Updated 11 months ago
NiaExperience / PearlOS
Your Interface to Intelligence
☆55 · Jun 25, 2026 · Updated last week
machinelearningZH / hybrid-search-eval
A framework for benchmarking embedding models in hybrid search scenarios (BM25 + vector search) using Weaviate.
☆40 · Updated this week
TusKANNy / awesome-learned-sparse-retrieval
An extensive and commented list of resources on Learned Sparse Retrieval.
☆61 · Jun 12, 2026 · Updated 3 weeks ago
viig99 / muvfde
Generate fixed-dimensional embeddings for multi-dimensional vectors in Python, based on MUVERA from Google.
☆20 · Jun 28, 2025 · Updated last year
NEUIR / ExpandR
[EMNLP '25] Source code for paper "ExpandR: Teaching Dense Retrievers Beyond Queries with LLM Guidance"
☆40 · Aug 13, 2025 · Updated 10 months ago
thakur-nandan / sprint
SPRINT Toolkit helps you evaluate diverse neural sparse models easily using a single click on any IR dataset.
☆48 · Jul 25, 2023 · Updated 2 years ago
Marker-Inc-Korea / AutoRAG-example-korean-embedding-benchmark
AutoRAG example about benchmarking Korean embeddings.
☆45 · Oct 2, 2024 · Updated last year
intfloat / uts
Python package for unsupervised text segmentation.
☆14 · Oct 31, 2016 · Updated 9 years ago
Lumen-Labs / cpp-chunker
Implementation of a fast semantic chunker in C++, installable in Python 3.7+ projects.
☆22 · Sep 20, 2025 · Updated 9 months ago
ksmin23 / my-adk-python-samples
A collection of Python agent samples built with the Google Agent Development Kit (ADK), demonstrating integrations with services like B…
☆21 · May 8, 2026 · Updated last month
hyunwoongko / nanoRLHF
nanoRLHF: from-scratch journey into how LLMs and RLHF really work.
☆192 · Updated this week
mrseanryan / gpt-workflow
Generate workflows (for flowcharts or low code) via LLM. Also describe workflow given in DOT.
☆19 · Nov 2, 2023 · Updated 2 years ago
iliaschalkidis / flash-roberta
Hugging Face RoBERTa with Flash Attention 2
☆24 · Sep 14, 2025 · Updated 9 months ago
samas69420 / transformino
☆19 · Jul 4, 2025 · Updated last year
baeseongsu / Clinical-LLM-FineTuning-HandsOn
Hands-on repository for fine-tuning Large Language Models (LLMs) in the clinical domain with tutorials
☆16 · Jan 9, 2026 · Updated 5 months ago
webis-de / rank-distillm
Rank-DistiLLM: Closing the Effectiveness Gap Between Cross-Encoders and LLMs for Passage Re-Ranking
☆25 · Apr 4, 2025 · Updated last year
astrix-security / mcp-secret-wrapper
Astrix Security MCP Secret Wrapper
☆50 · May 8, 2026 · Updated last month
thakur-nandan / income
INCOME: An Easy Repository for Training and Evaluation of Index Compression Methods in Dense Retrieval. Includes BPR and JPQ.
☆24 · Sep 24, 2023 · Updated 2 years ago
DSBA-Lab / Contrastive-Accumulation
☆14 · Jul 7, 2024 · Updated last year
Seokii / Chatbot4Univ
Building an AI question-answering chatbot for university students
☆38 · Apr 21, 2023 · Updated 3 years ago
sdsc / sdsc-summer-institute-2017
SDSC Summer Institute 2017 teaching material
☆17 · Jul 12, 2018 · Updated 7 years ago
stys / kaggle-talkingdata-adtracking-fraud-detection
TalkingData AdTracking Fraud Detection Challenge
☆10 · May 8, 2018 · Updated 8 years ago
sbnb-io / gemma3n-profiling
Profiling Google Gemma 3n Model Using PyTorch Profiler
☆17 · Jul 7, 2025 · Updated 11 months ago
snunlp / KR-SBERT
KoRean based SBERT pre-trained models (KR-SBERT) for PyTorch
☆105 · May 3, 2022 · Updated 4 years ago
LeeSureman / E5-Retrieval-Reproduction
Use contrastive learning to train a large language model (LLM) as a retriever
☆12 · Jul 19, 2024 · Updated last year
vi3k6i5 / learning_ml
I am teaching a Learning ML workshop for some folks @ Belong.co. Creating this repo to organise the course material.
☆22 · May 4, 2018 · Updated 8 years ago
automl / TempoPFN
TempoPFN: Zero-shot Time Series Forecasting (accepted at EurIPS 2025 AI for Tabular Data Workshop)
☆41 · Nov 10, 2025 · Updated 7 months ago
xhluca / bm25-benchmarks
☆24 · Apr 29, 2026 · Updated 2 months ago
KimSoungRyoul / PyConKR2023-ModelServing-BentoML
PyCon KR 2023 presentation
☆13 · Feb 7, 2024 · Updated 2 years ago
sionic-ai / muvera-py
Python Implementation of MUVERA (Multi-Vector Retrieval via Fixed Dimensional Encodings)
☆417 · Dec 10, 2025 · Updated 6 months ago