michaelfeil / candle-flash-attn-v3
★11, updated 4 months ago
Alternatives and similar repositories for candle-flash-attn-v3
Users who are interested in candle-flash-attn-v3 are comparing it to the libraries listed below.
- implement llava using candle (★15, updated last year)
- Build compute kernels (★68, updated this week)
- CLI utility to inspect and explore .safetensors and .gguf files (★20, updated 2 weeks ago)
- Proof of concept for running moshi/hibiki using webrtc (★19, updated 3 months ago)
- (★12, updated last year)
- Rust crate for some audio utilities (★24, updated 3 months ago)
- GPU based FFT written in Rust and CubeCL (★23, updated 2 weeks ago)
- This repository has code for fine-tuning LLMs with GRPO specifically for Rust Programming using cargo as feedback (★95, updated 3 months ago)
- (★20, updated 8 months ago)
- (★30, updated 7 months ago)
- A collection of optimisers for use with candle (★36, updated last month)
- Tensor library for Zig (★11, updated 7 months ago)
- A Python wrapper around HuggingFace's TGI (text-generation-inference) and TEI (text-embedding-inference) servers. (★33, updated last month)
- Simple high-throughput inference library (★119, updated last month)
- Cray-LM unified training and inference stack. (★22, updated 4 months ago)
- A high-performance constrained decoding engine based on context free grammar in Rust (★53, updated last month)
- Optimizing bit-level Jaccard Index and Population Counts for large-scale quantized Vector Search via Harley-Seal CSA and Lookup Tables (★20, updated last month)
- Read and write tensorboard data using Rust (★21, updated last year)
- Transformers provides a simple, intuitive interface for Rust developers who want to work with Large Language Models locally, powered by t… (★16, updated this week)
- Make triton easier (★46, updated last year)
- A complete (grpc service and lib) Rust inference with multilingual embedding support. This version leverages the power of Rust for both GR… (★39, updated 10 months ago)
- (★39, updated 2 years ago)
- 8-bit floating point types for Rust (★46, updated 3 months ago)
- PCCL (Prime Collective Communications Library) implements fault tolerant collective communications over IP (★95, updated last month)
- Experimental compiler for deep learning models (★67, updated last month)
- Inference engine for GLiNER models, in Rust (★59, updated 2 months ago)
- Trade any tensors over the network (★30, updated last year)
- (★13, updated last year)
- High-performance MinHash implementation in Rust with Python bindings for efficient similarity estimation and deduplication of large datas… (★177, updated this week)
- A small python library to run iterators in a separate process (★10, updated last year)