inferless / triton-co-pilot
Generate glue code in seconds to simplify your NVIDIA Triton Inference Server deployments
☆15 Updated 4 months ago
Related projects
Alternatives and complementary repositories for triton-co-pilot
- ☆12 Updated 6 months ago
- An open source replication of the strawberry method that leverages Monte Carlo Search with PPO and/or DPO ☆22 Updated this week
- Unleash the full potential of exascale LLMs on consumer-class GPUs, proven by extensive benchmarks, with no long-term adjustments and min… ☆23 Updated last week
- LLama implementations benchmarking framework ☆12 Updated last year
- A pipeline for using API calls to agnostically convert unstructured data into structured training data ☆28 Updated 2 months ago
- NLP with Rust for Python 🦀🐍 ☆59 Updated 5 months ago
- Alpha-Zero Connect Four NN trained via self-play ☆13 Updated last month
- Repository for Go shared libraries (for now). ☆10 Updated 2 weeks ago
- The code repository for the CURLoRA research paper. Stable LLM continual fine-tuning and catastrophic forgetting mitigation. ☆38 Updated 2 months ago
- Vector Database with support for late interaction and token level embeddings. ☆54 Updated last month
- A high-throughput, end-to-end RL library for infinite-horizon tasks. ☆18 Updated 5 months ago
- Repository containing awesome resources regarding Hugging Face tooling. ☆43 Updated 10 months ago
- QLLM: A powerful CLI for seamless interaction with multiple Large Language Models. Simplify AI workflows, streamline development, and unl… ☆24 Updated last week
- Public reports detailing responses to sets of prompts by Large Language Models. ☆26 Updated last year
- Repo hosting code and materials related to speeding up LLM inference using token merging. ☆29 Updated 6 months ago
- ☆41 Updated 2 weeks ago
- Proof of concept for a generative AI application framework powered by WebAssembly and Extism ☆14 Updated last year
- ☆25 Updated 2 months ago
- GitHub repo for Peifeng's internship project ☆12 Updated last year
- ☆15 Updated last month
- Deploy and Scale LLM-based applications ☆26 Updated last year
- A visual tool to interpret and understand PyTorch machine learning models ☆15 Updated 9 months ago
- First token cutoff sampling inference example ☆28 Updated 10 months ago
- Use Grounding DINO, Segment Anything, and CLIP to label objects in images. ☆23 Updated 10 months ago
- A Learning Journey: Micrograd in Mojo 🔥 ☆57 Updated last month
- Genetics for Language Models ☆12 Updated 4 months ago
- A high-performance constrained decoding engine based on context-free grammar in Rust ☆40 Updated 2 weeks ago
- ☆35 Updated 3 weeks ago
- My personal implementation of the model from "Qwen-VL: A Frontier Large Vision-Language Model with Versatile Abilities", they haven't rel… ☆11 Updated 9 months ago
- Benchmarking tool for assessing LLM performance across different hardware ☆13 Updated 11 months ago