basetenlabs / Workshop-TRT-LLM
☆19Updated last year
Alternatives and similar repositories for Workshop-TRT-LLM
Users who are interested in Workshop-TRT-LLM are comparing it to the libraries listed below
- Fine-tune an LLM to perform batch inference and online serving.☆112Updated 3 months ago
- An introduction to LLM Sampling☆79Updated 8 months ago
- A miniature version of Modal☆20Updated last year
- ☆23Updated 2 years ago
- ☆87Updated last year
- ☆80Updated last year
- A set of scripts and notebooks on LLM fine-tuning and dataset creation☆110Updated 11 months ago
- a pipeline for using api calls to agnostically convert unstructured data into structured training data☆31Updated 11 months ago
- Just a bunch of benchmark logs for different LLMs☆120Updated last year
- ☆124Updated 10 months ago
- Official repo for the paper PHUDGE: Phi-3 as Scalable Judge. Evaluate your LLMs with or without custom rubric, reference answer, absolute…☆49Updated last year
- ScalarLM - a unified training and inference stack☆55Updated 3 weeks ago
- A Python wrapper around HuggingFace's TGI (text-generation-inference) and TEI (text-embedding-inference) servers.☆33Updated 3 months ago
- ☆46Updated last year
- Optimizing Causal LMs through GRPO with weighted reward functions and automated hyperparameter tuning using Optuna☆55Updated 6 months ago
- A tool that facilitates easy, efficient and high-quality fine-tuning of Cohere's models☆74Updated 5 months ago
- Comprehensive analysis of the performance differences between QLoRA, LoRA, and full fine-tunes.☆83Updated last year
- Matrix (Multi-Agent daTa geneRation Infra and eXperimentation framework) is a versatile engine for multi-agent conversational data genera…☆88Updated this week
- Modded vLLM to run pipeline parallelism over public networks☆38Updated 3 months ago
- ☆66Updated 3 months ago
- experiments with inference on llama☆104Updated last year
- 👷 Build compute kernels☆119Updated this week
- ☆170Updated last year
- ☆19Updated last year
- Simple UI for debugging correlations of text embeddings☆290Updated 3 months ago
- Lightweight wrapper for the independent implementation of SPLADE++ models for search & retrieval pipelines. Models and Library created by…☆32Updated last year
- A tool for benchmarking LLMs on Modal☆43Updated this week
- Collection of autoregressive model implementations☆86Updated 4 months ago
- ☆88Updated last year
- A repository to unravel the language of GPUs, making their kernel conversations easy to understand☆191Updated 3 months ago