ytgui / PilotANN
Memory-Bounded GPU Acceleration for Vector Search
☆19 Updated this week
Alternatives and similar repositories for PilotANN:
Users who are interested in PilotANN are comparing it to the libraries listed below.
- A library for simplifying fine-tuning with multi-GPU setups in the Hugging Face ecosystem. ☆16 Updated 5 months ago
- A public implementation of the ReLoRA pretraining method, built on Lightning-AI's PyTorch Lightning suite. ☆33 Updated last year
- Training hybrid models for dummies. ☆20 Updated 2 months ago
- Implementation of https://arxiv.org/pdf/2312.09299 ☆20 Updated 9 months ago
- Cascade Speculative Drafting ☆29 Updated last year
- ☆15 Updated 6 months ago
- ☆18 Updated last month
- ☆19 Updated 3 weeks ago
- Code repository for the paper "AdANNS: A Framework for Adaptive Semantic Search" ☆63 Updated last year
- MPI Code Generation through Domain-Specific Language Models ☆13 Updated 4 months ago
- ☆15 Updated last year
- BH hackathon ☆14 Updated 11 months ago
- Experimental scripts for researching data-adaptive learning rate scheduling. ☆23 Updated last year
- Compression for Foundation Models ☆30 Updated last week
- Aioli: A unified optimization framework for language model data mixing ☆22 Updated 2 months ago
- Implementation of SelfExtend from the paper "LLM Maybe LongLM: Self-Extend LLM Context Window Without Tuning" using PyTorch and Zeta ☆13 Updated 4 months ago
- Utilities for Training Very Large Models ☆58 Updated 6 months ago
- ☆46 Updated 5 months ago
- Implementation of Spectral State Space Models ☆16 Updated last year
- Lottery Ticket Adaptation ☆39 Updated 4 months ago
- Using FlexAttention to compute attention with different masking patterns ☆42 Updated 6 months ago
- ☆13 Updated last week
- Train, tune, and run inference with the Bamba model ☆87 Updated 2 months ago
- Lightweight Llama 3 8B Inference Engine in CUDA C ☆47 Updated last week
- Latent Large Language Models ☆17 Updated 7 months ago
- Using multiple LLMs for ensemble forecasting ☆16 Updated last year
- Port of Facebook's LLaMA model in C/C++ ☆20 Updated last year
- ☆16 Updated last month
- ☆48 Updated 4 months ago
- SCREWS: A Modular Framework for Reasoning with Revisions ☆27 Updated last year