inferless / triton-co-pilot
Generate glue code in seconds to simplify your NVIDIA Triton Inference Server deployments
☆20 Updated 11 months ago
Alternatives and similar repositories for triton-co-pilot
Users that are interested in triton-co-pilot are comparing it to the libraries listed below
- ☆13 Updated last year
- The official evaluation suite and dynamic data release for MixEval. ☆11 Updated 9 months ago
- ☆19 Updated 5 months ago
- ☆20 Updated 8 months ago
- GraphRag vs Embeddings ☆14 Updated 11 months ago
- A text-to-SQL prototype on the northwind sqlite dataset ☆12 Updated 9 months ago
- Efficient and Scalable Estimation of Tool Representations in Vector Space ☆23 Updated 9 months ago
- Python implementation of Age-Partitioned Bloom Filter with S3 periodic backup support. ☆11 Updated 5 months ago
- NLP with Rust for Python 🦀🐍 ☆62 Updated last month
- A collection of open-source large language model (LLM) implementations in JAX & Flax ☆23 Updated 2 months ago
- Check for data drift between two OpenAI multi-turn chat jsonl files. ☆37 Updated last year
- Repository containing awesome resources regarding Hugging Face tooling. ☆47 Updated last year
- Automated benchmarking of Retrieval-Augmented Generation (RAG) systems ☆28 Updated this week
- Evolutionary Search for expert-level performance on any task with environmental feedback ☆14 Updated last year
- A pipeline for using API calls to agnostically convert unstructured data into structured training data ☆30 Updated 9 months ago
- Lightweight toolkit package to train and fine-tune 1.58bit Language models ☆80 Updated last month
- Pivotal Token Search ☆107 Updated last month
- ☆12 Updated 11 months ago
- Because it's there. ☆16 Updated 9 months ago
- Code for paper: "Privately generating tabular data using language models". ☆15 Updated 2 years ago
- ☆51 Updated 7 months ago
- 📡 Deploy AI models and apps to Kubernetes without developing a hernia ☆32 Updated last year
- First token cutoff sampling inference example ☆30 Updated last year
- The code repository for the CURLoRA research paper. Stable LLM continual fine-tuning and catastrophic forgetting mitigation. ☆44 Updated 10 months ago
- Asynchronous tasks on the cloud ☆21 Updated last year
- This project implements a demonstrator agent that compares the Cache-Augmented Generation (CAG) Framework with traditional Retrieval-Augm… ☆32 Updated 5 months ago
- Unleash the full potential of exascale LLMs on consumer-class GPUs, proven by extensive benchmarks, with no long-term adjustments and min… ☆26 Updated 7 months ago
- FalkorDB-Browser is a visualization UI for FalkorDB. ☆35 Updated this week
- A public implementation of the ReLoRA pretraining method, built on Lightning-AI's PyTorch Lightning suite. ☆33 Updated last year
- ☆28 Updated 10 months ago