michaelfeil / candle-flash-attn-v3
⭐ 12 · Updated 6 months ago
Alternatives and similar repositories for candle-flash-attn-v3
Users interested in candle-flash-attn-v3 are comparing it to the libraries listed below.
- Build compute kernels ⭐ 106 · Updated last week
- Implement LLaVA using Candle ⭐ 15 · Updated last year
- Simple high-throughput inference library ⭐ 127 · Updated 3 months ago
- Code for fine-tuning LLMs with GRPO for Rust programming, using cargo as feedback ⭐ 101 · Updated 5 months ago
- Rust crate for some audio utilities ⭐ 26 · Updated 5 months ago
- A high-performance constrained decoding engine based on context-free grammar in Rust ⭐ 55 · Updated 3 months ago
- Proof of concept for running moshi/hibiki using WebRTC ⭐ 20 · Updated 5 months ago
- Inference engine for GLiNER models, in Rust ⭐ 64 · Updated last month
- GPU-based FFT written in Rust and CubeCL ⭐ 23 · Updated 2 months ago
- PTX tutorial written purely by AIs (OpenAI Deep Research and Claude 3.7) ⭐ 66 · Updated 5 months ago
- ⭐ 21 · Updated 5 months ago
- vLLM adapter for a TGIS-compatible gRPC server. ⭐ 37 · Updated this week
- Fast serverless LLM inference, in Rust. ⭐ 88 · Updated 5 months ago
- High-performance MinHash implementation in Rust with Python bindings for efficient similarity estimation and deduplication of large datas… ⭐ 199 · Updated last month
- Load compute kernels from the Hub ⭐ 244 · Updated this week
- Inference Llama 2 in one file of zero-dependency, zero-unsafe Rust ⭐ 38 · Updated 2 years ago
- Repository containing the SPIN experiments on the DIBT 10k ranked prompts ⭐ 24 · Updated last year
- Fast and versatile tokenizer for language models, compatible with SentencePiece, Tokenizers, Tiktoken and more. Supports BPE, Unigram and… ⭐ 29 · Updated 5 months ago
- Cold Compress is a hackable, lightweight, and open-source toolkit for creating and benchmarking cache compression methods built on top of… ⭐ 141 · Updated last year
- ⭐ 12 · Updated last year
- Inference Llama 2 with a model compiled to native code by TorchInductor ⭐ 14 · Updated last year
- CUDA and Triton implementations of Flash Attention with SoftmaxN. ⭐ 73 · Updated last year
- Cray-LM unified training and inference stack. ⭐ 22 · Updated 6 months ago
- High-performance safetensors model loader ⭐ 53 · Updated last month
- ⭐ 24 · Updated 4 months ago
- PCCL (Prime Collective Communications Library) implements fault-tolerant collective communications over IP ⭐ 106 · Updated this week
- IBM development fork of https://github.com/huggingface/text-generation-inference ⭐ 61 · Updated 3 months ago
- Make Triton easier ⭐ 47 · Updated last year
- ⭐ 39 · Updated 2 years ago
- TensorRT-LLM server with Structured Outputs (JSON) built with Rust ⭐ 58 · Updated 4 months ago