basetenlabs / Workshop-TRT-LLM
☆19Updated last year
Alternatives and similar repositories for Workshop-TRT-LLM
Users who are interested in Workshop-TRT-LLM are comparing it to the libraries listed below
- Fine-tune an LLM to perform batch inference and online serving.☆113Updated 5 months ago
- Seamless interface for using PyTorch distributed with Jupyter notebooks☆51Updated last month
- A miniature version of Modal☆20Updated last year
- Modded vLLM to run pipeline parallelism over public networks☆39Updated 5 months ago
- An introduction to LLM Sampling☆79Updated 10 months ago
- Just a bunch of benchmark logs for different LLMs☆118Updated last year
- ☆23Updated 2 years ago
- Optimizing Causal LMs through GRPO with weighted reward functions and automated hyperparameter tuning using Optuna☆58Updated last week
- Official repo for the paper PHUDGE: Phi-3 as Scalable Judge. Evaluate your LLMs with or without custom rubric, reference answer, absolute…☆50Updated last year
- ☆88Updated 2 years ago
- ☆80Updated last year
- A tool that facilitates easy, efficient and high-quality fine-tuning of Cohere's models☆73Updated 7 months ago
- Comprehensive analysis of the differences in performance between QLoRA, LoRA, and full fine-tunes.☆82Updated 2 years ago
- ML/DL Math and Method notes☆64Updated last year
- Build Agentic workflows with function calling using open LLMs☆28Updated 3 weeks ago
- I learn about and explain quantization☆26Updated last year
- Collection of autoregressive model implementations☆86Updated 6 months ago
- Lightweight wrapper for the independent implementation of SPLADE++ models for search & retrieval pipelines. Models and Library created by…☆33Updated last year
- Simple UI for debugging correlations of text embeddings☆296Updated 5 months ago
- Complete implementation of Llama2 with/without KV cache & inference 🚀☆48Updated last year
- ☆136Updated 2 months ago
- Submodule of evalverse forked from [google-research/instruction_following_eval](https://github.com/google-research/google-research/tree/m…☆14Updated last year
- Because it's there.☆16Updated last year
- ☆159Updated 10 months ago
- ☆124Updated last year
- Code for NeurIPS LLM Efficiency Challenge☆59Updated last year
- Matrix (Multi-Agent daTa geneRation Infra and eXperimentation framework) is a versatile engine for multi-agent conversational data genera…☆99Updated this week
- Set of scripts to finetune LLMs☆38Updated last year
- 👷 Build compute kernels☆163Updated this week
- The official evaluation suite and dynamic data release for MixEval.☆11Updated last year