susavlsh10 / ESPN-v1

ESPN: Embedding from Storage Pipelined Network. GDS implementation for multi-vector embedding retrieval and bindings.

☆11

Alternatives and similar repositories for ESPN-v1

Users who are interested in ESPN-v1 are comparing it to the libraries listed below.

CosimoRulli / emvb
Implementation of "Efficient Multi-vector Dense Retrieval with Bit Vectors", ECIR 2024
☆62 Updated 9 months ago
RAIVNLab / AdANNS
Code repository for the paper - "AdANNS: A Framework for Adaptive Semantic Search"
☆65 Updated last year
Michaelvll / llm-ie-benchmarks
A collection of reproducible inference engine benchmarks
☆32 Updated 2 months ago
eth-easl / deltazip
Compression for Foundation Models
☆33 Updated 3 months ago
pisa-engine / BMP
Faster Learned Sparse Retrieval with Block-Max Pruning. ACM SIGIR 2024.
☆31 Updated 2 weeks ago
UmerHA / triton_util
Make triton easier
☆47 Updated last year
eth-easl / mixtera
A lightweight, user-friendly data-plane for LLM training.
☆20 Updated 2 weeks ago
zhisbug / ray-scalable-ml-design
Some microbenchmarks and design docs before commencement
☆12 Updated 4 years ago
IST-DASLab / SparseFinetuning
Repository for Sparse Finetuning of LLMs via modified version of the MosaicML llmfoundry
☆42 Updated last year
mayank31398 / ladder-residual-inference
☆14 Updated this week
tyler-griggs / melange-release
☆47 Updated last year
stanford-futuredata / colbert-serve
☆18 Updated last month
zenrran4nlp / Awesome-LLM-Inference-Serving
☆36 Updated 2 months ago
HazyResearch / cartridges
Storing long contexts in tiny caches with self-study
☆89 Updated this week
katsumiok / pyaskit
AskIt: Unified programming interface for programming with LLMs (GPT-3.5, GPT-4, Gemini, Claude, Cohere, Llama 2)
☆79 Updated 6 months ago
ant-louis / xm-retrievers
🌏 Modular retrievers for zero-shot multilingual IR.
☆28 Updated last year
SqueezeAILab / Tool2Vec
Efficient and Scalable Estimation of Tool Representations in Vector Space
☆25 Updated 10 months ago
uiuc-kang-lab / agentic-benchmarks
☆28 Updated last week
tanyuqian / redco
NAACL '24 (Best Demo Paper Runner-Up) / MLSys @ NeurIPS '23 - RedCoast: A Lightweight Tool to Automate Distributed Training and Inference
☆66 Updated 7 months ago
Zyphra / Zyda_processing
☆36 Updated last year
facebookresearch / vector_db_id_compression
Implementation of the paper "Lossless Compression of Vector IDs for Approximate Nearest Neighbor Search" by Severo et al.
☆80 Updated 5 months ago
IST-DASLab / Sparse-Marlin
Boosting 4-bit inference kernels with 2:4 Sparsity
☆80 Updated 10 months ago
ytgui / PilotANN
Memory-Bounded GPU Acceleration for Vector Search
☆26 Updated 3 months ago
project-etalon / etalon
LLM Serving Performance Evaluation Harness
☆79 Updated 4 months ago
pygongnlp / CoSearchAgent
[SIGIR 2024 (Demo)] CoSearchAgent: A Lightweight Collaborative Search Agent with Large Language Models
☆27 Updated last year
microsoft / AutoMoE
AutoMoE: Neural Architecture Search for Efficient Sparsely Activated Transformers
☆47 Updated 2 years ago
shreyansh26 / Attention-Mask-Patterns
Using FlexAttention to compute attention with different masking patterns
☆44 Updated 9 months ago
uw-mad-dash / decoding-speculative-decoding
☆12 Updated 11 months ago
bespokelabsai / verifiers
Verifiers for LLM Reinforcement Learning
☆65 Updated 3 months ago
sher222 / LeReT
Learning to Retrieve by Trying - Source code for Grounding by Trying: LLMs with Reinforcement Learning-Enhanced Retrieval
☆39 Updated 8 months ago