kozistr / triton-grpc-proxy-rs
Proxy server, written in Rust, for a Triton gRPC server that runs inference on an embedding model
☆21 Updated last year
Alternatives and similar repositories for triton-grpc-proxy-rs
Users who are interested in triton-grpc-proxy-rs are comparing it to the libraries listed below
- utilities for loading and running text embeddings with onnx ☆44 Updated last year
- A high performance batching router optimises max throughput for text inference workload ☆16 Updated last year
- Tiny inference-only implementation of LLaMA ☆93 Updated last year
- Unofficial python bindings for the rust llm library. 🐍❤️🦀 ☆75 Updated last year
- A stable, fast and easy-to-use inference library with a focus on a sync-to-async API ☆45 Updated 10 months ago
- Let's create synthetic textbooks together :) ☆75 Updated last year
- Full finetuning of large language models without large memory requirements ☆94 Updated last year
- inference code for mixtral-8x7b-32kseqlen ☆101 Updated last year
- Curriculum training of instruction-following LLMs with Unsloth ☆14 Updated 4 months ago
- ☆47 Updated last year
- Using modal.com to process FineWeb-edu data ☆20 Updated 4 months ago
- GPU accelerated client-side embeddings for vector search, RAG etc. ☆66 Updated last year
- an implementation of Self-Extend, to expand the context window via grouped attention ☆119 Updated last year
- 🕹️ Performance Comparison of MLOps Engines, Frameworks, and Languages on Mainstream AI Models. ☆137 Updated last year
- Simplex Random Feature attention, in PyTorch ☆74 Updated last year
- ☆66 Updated last year
- ☆130 Updated last year
- Inference Llama 2 in one file of zero-dependency, zero-unsafe Rust ☆38 Updated 2 years ago
- Optimizing bit-level Jaccard Index and Population Counts for large-scale quantized Vector Search via Harley-Seal CSA and Lookup Tables ☆20 Updated 2 months ago
- Implementation of mamba with rust ☆88 Updated last year
- Optimizing Causal LMs through GRPO with weighted reward functions and automated hyperparameter tuning using Optuna ☆55 Updated 6 months ago
- Verbosity control for AI agents ☆64 Updated last year
- a pipeline for using api calls to agnostically convert unstructured data into structured training data ☆30 Updated 10 months ago
- NLP with Rust for Python 🦀🐍 ☆64 Updated 2 months ago
- QLoRA with Enhanced Multi GPU Support ☆37 Updated 2 years ago
- The DPAB-α Benchmark ☆29 Updated 6 months ago
- Synthetic data derived by templating, few shot prompting, transformations on public domain corpora, and monte carlo tree search. ☆32 Updated 5 months ago
- ☆137 Updated last year
- ☆23 Updated 6 months ago
- High-performance MinHash implementation in Rust with Python bindings for efficient similarity estimation and deduplication of large datas… ☆193 Updated 3 weeks ago