inferless / triton-co-pilot
Generate Glue Code in seconds to simplify your Nvidia Triton Inference Server Deployments
★19 · Updated 9 months ago
Alternatives and similar repositories for triton-co-pilot:
Users who are interested in triton-co-pilot are comparing it to the libraries listed below.
- ★12 · Updated last year
- The official evaluation suite and dynamic data release for MixEval. · ★11 · Updated 6 months ago
- NLP with Rust for Python 🦀🐍 · ★61 · Updated 10 months ago
- This repository has code for fine-tuning LLMs with GRPO specifically for Rust Programming using cargo as feedback · ★75 · Updated 3 weeks ago
- Repository containing awesome resources regarding Hugging Face tooling. · ★46 · Updated last year
- GraphRag vs Embeddings · ★13 · Updated 8 months ago
- Run LLMs on Replicate with vLLM · ★16 · Updated 5 months ago
- Benchmark study on LanceDB, an embedded vector DB, for full-text search and vector search · ★23 · Updated last year
- Check for data drift between two OpenAI multi-turn chat jsonl files. · ★37 · Updated 11 months ago
- Creating Generative AI Apps which work · ★17 · Updated 8 months ago
- Self-host LLMs with vLLM and BentoML · ★97 · Updated this week
- ★14 · Updated last month
- Scale your RAG pipeline using Ragswift: A scalable centralized embeddings management platform · ★37 · Updated last year
- FalkorDB-Browser is a visualization UI for FalkorDB. · ★29 · Updated this week
- Efficient BM25 with DuckDB 🦆 · ★44 · Updated 3 months ago
- End-to-End LLM Guide · ★104 · Updated 9 months ago
- Supervised instruction finetuning for LLM with HF trainer and Deepspeed · ★34 · Updated last year
- The code repository for the CURLoRA research paper. Stable LLM continual fine-tuning and catastrophic forgetting mitigation. · ★43 · Updated 7 months ago
- ★11 · Updated 2 months ago
- ColBERT for live vector indexes · ★22 · Updated 5 months ago
- Set of scripts to finetune LLMs · ★37 · Updated last year
- a pipeline for using api calls to agnostically convert unstructured data into structured training data · ★30 · Updated 6 months ago
- ★22 · Updated this week
- Fast and versatile tokenizer for language models, compatible with SentencePiece, Tokenizers, Tiktoken and more. Supports BPE, Unigram and… · ★19 · Updated last week
- ★16 · Updated 2 months ago
- Vector Database with support for late interaction and token level embeddings. · ★53 · Updated 6 months ago
- A text embedding extension for the Polars Dataframe library. · ★24 · Updated 4 months ago
- LLM Compression Benchmark · ★21 · Updated last month
- 🕹️ Performance Comparison of MLOps Engines, Frameworks, and Languages on Mainstream AI Models. · ★135 · Updated 8 months ago
- Lightweight Llama 3 8B Inference Engine in CUDA C · ★47 · Updated last week