basetenlabs / Workshop-TRT-LLM
☆19Updated last year
Alternatives and similar repositories for Workshop-TRT-LLM
Users who are interested in Workshop-TRT-LLM are comparing it to the repositories listed below
- Fine-tune an LLM to perform batch inference and online serving.☆115Updated 6 months ago
- A miniature version of Modal☆21Updated last year
- A set of scripts and notebooks on LLM fine-tuning and dataset creation☆112Updated last year
- ☆80Updated last year
- An introduction to LLM Sampling☆79Updated last year
- Seamless interface for using PyTorch distributed with Jupyter notebooks☆57Updated 3 months ago
- Build Agentic workflows with function calling using open LLMs☆28Updated last week
- Optimizing Causal LMs through GRPO with weighted reward functions and automated hyperparameter tuning using Optuna☆59Updated last month
- Just a bunch of benchmark logs for different LLMs☆119Updated last year
- ☆89Updated 2 years ago
- A tool that facilitates easy, efficient and high-quality fine-tuning of Cohere's models☆76Updated 9 months ago
- Official repo for the paper PHUDGE: Phi-3 as Scalable Judge. Evaluate your LLMs with or without custom rubric, reference answer, absolute…☆51Updated last year
- LLM training in simple, raw C/CUDA☆15Updated last year
- Lightweight wrapper for the independent implementation of SPLADE++ models for search & retrieval pipelines. Models and Library created by…☆33Updated last year
- ☆148Updated last year
- Train LLM on Hugging Face infra☆67Updated last month
- 👷 Build compute kernels☆192Updated last week
- ☆124Updated last year
- ML/DL Math and Method notes☆64Updated 2 years ago
- Comprehensive analysis of the difference in performance between QLoRA, LoRA, and full fine-tunes.☆83Updated 2 years ago
- Set of scripts to finetune LLMs☆38Updated last year
- A pipeline that uses API calls to agnostically convert unstructured data into structured training data☆32Updated last year
- My Gen AI research☆11Updated last year
- Modded vLLM to run pipeline parallelism over public networks☆40Updated 6 months ago
- A repository to unravel the language of GPUs, making their kernel conversations easy to understand☆196Updated 6 months ago
- PCCL (Prime Collective Communications Library) implements fault tolerant collective communications over IP☆141Updated 3 months ago
- ☆23Updated 2 years ago
- ☆19Updated last year
- Collection of autoregressive model implementations☆85Updated 7 months ago
- Train your own SOTA deductive reasoning model☆107Updated 9 months ago