atoma-network / atoma-infer
Fast serverless LLM inference, in Rust.
☆22 · Updated 2 weeks ago
Related projects
Alternatives and complementary repositories for atoma-infer
- A high-performance constrained-decoding engine based on context-free grammars, in Rust ☆40 · Updated 2 weeks ago
- A simple, CUDA- or CPU-powered library for creating vector embeddings using Candle and models from Hugging Face ☆27 · Updated 6 months ago
- Dataflow is a data-processing library, primarily for machine learning ☆19 · Updated last year
- Asynchronous CUDA for Rust ☆27 · Updated 2 weeks ago
- An extension library for Candle that provides PyTorch functions not currently available in Candle ☆37 · Updated 8 months ago
- Low-rank adaptation (LoRA) for Candle ☆127 · Updated 3 months ago
- A neural network inference library, written in Rust ☆56 · Updated 4 months ago
- Unofficial Rust bindings to Apple's mlx framework ☆68 · Updated this week
- Run Generative AI models directly on your hardware ☆22 · Updated 3 months ago
- A collection of optimisers for use with Candle ☆31 · Updated this week
- Implementation of the BitNet model in Rust ☆28 · Updated 7 months ago
- 8-bit floating point types for Rust ☆39 · Updated last month
- Andrej Karpathy's "Let's build GPT: from scratch" video & notebook implemented in Rust + Candle ☆61 · Updated 7 months ago
- Example of tch-rs on M1 ☆51 · Updated 8 months ago
- A Keras-like abstraction layer on top of the Rust ML framework Candle ☆22 · Updated 5 months ago
- Efficient platform for inference and serving of local LLMs, including an OpenAI-compatible API server ☆265 · Updated last month
- High-performance MinHash implementation in Rust with Python bindings for efficient similarity estimation and deduplication of large datas… ☆65 · Updated 3 months ago
- Library for doing retrieval-augmented generation (RAG) ☆42 · Updated last week
- Implementation of LLaVA using Candle ☆13 · Updated 5 months ago
- LLaMA 7B with CUDA acceleration implemented in Rust. Minimal GPU memory needed! ☆101 · Updated last year
- Asynchronous TensorRT for Rust ☆20 · Updated 2 weeks ago
- Inference Llama 2 in one file of zero-dependency, zero-unsafe Rust ☆37 · Updated last year
- Structured outputs for LLMs ☆31 · Updated 4 months ago
- Rust port of llm.c by @karpathy ☆38 · Updated 7 months ago
- Rust client for the Hugging Face Hub, aiming for a minimal subset of the features of the `huggingface-hub` Python package ☆153 · Updated 2 months ago