Di-Is / faiss-gpu-wheels
Unofficial faiss wheel builder for NVIDIA GPU
☆33 · Updated last month
Alternatives and similar repositories for faiss-gpu-wheels
Users interested in faiss-gpu-wheels are comparing it to the repositories listed below.
- Vocabulary Trimming (VT) is a model compression technique that reduces a multilingual LM vocabulary to a target language by deleting ir… ☆61 · Updated last year
- Code for Zero-Shot Tokenizer Transfer ☆142 · Updated last year
- [NeurIPS 2025] MergeBench: A Benchmark for Merging Domain-Specialized LLMs ☆41 · Updated 2 weeks ago
- ☆74 · Updated last year
- Code for the EMNLP 2024 paper "Detecting and Mitigating Contextual Hallucinations in Large Language Models Using Only Attention Maps" ☆142 · Updated 3 months ago
- ☆161 · Updated last year
- Easy ModernBERT fine-tuning and multi-task learning ☆63 · Updated 7 months ago
- Official code for the paper "Beyond Matryoshka: Revisiting Sparse Coding for Adaptive Representation" ☆133 · Updated last month
- ☆58 · Updated 11 months ago
- [TMLR 2026] When Attention Collapses: How Degenerate Layers in LLMs Enable Smaller, Stronger Models ☆122 · Updated last year
- This repository combines the CPO and SimPO methods for improved reference-free preference learning. ☆56 · Updated last year
- Official repository of "LiNeS: Post-training Layer Scaling Prevents Forgetting and Enhances Model Merging" ☆32 · Updated last year
- ☆75 · Updated last year
- A toolkit implementing advanced methods to transfer models and model knowledge across tokenizers. ☆62 · Updated 7 months ago
- Official PyTorch implementation for the paper "No More Adam: Learning Rate Scaling at Initialization is All You Need" ☆55 · Updated last year
- What's In My Big Data (WIMBD) - a toolkit for analyzing large text datasets ☆227 · Updated last year
- ☆37 · Updated 2 years ago
- Code repository for the paper "MrT5: Dynamic Token Merging for Efficient Byte-level Language Models" ☆53 · Updated 4 months ago
- Organize the Web: Constructing Domains Enhances Pre-Training Data Curation ☆77 · Updated 9 months ago
- Official implementation of "GPT or BERT: why not both?" ☆61 · Updated 6 months ago
- Official repository for "Hypencoder: Hypernetworks for Information Retrieval" ☆33 · Updated 4 months ago
- [ICLR 2025 Oral] "Your Mixture-of-Experts LLM Is Secretly an Embedding Model For Free" ☆89 · Updated last year
- [ICLR 2025] Monet: Mixture of Monosemantic Experts for Transformers ☆75 · Updated 7 months ago
- ☆28 · Updated last year
- State-of-the-art paired encoder and decoder models (17M-1B params) ☆58 · Updated 6 months ago
- Function Vectors in Large Language Models (ICLR 2024) ☆191 · Updated 9 months ago
- Introducing Filtered Direct Preference Optimization (fDPO), which enhances language model alignment with human preferences by discarding lo… ☆16 · Updated last year
- ☆47 · Updated 8 months ago
- E5-V: Universal Embeddings with Multimodal Large Language Models ☆274 · Updated 2 months ago
- Code for "Merging Text Transformers from Different Initializations" ☆20 · Updated last year