basetenlabs / truss-examples
Examples of models deployable with Truss
☆155 · Updated this week
Alternatives and similar repositories for truss-examples:
Users who are interested in truss-examples are comparing it to the repositories listed below:
- ☆109 · Updated last month
- Python library for RunPod API and serverless worker SDK. ☆205 · Updated 2 weeks ago
- The one who calls upon functions - Function-Calling Language Model ☆36 · Updated last year
- The RunPod worker template for serving our large language model endpoints. Powered by vLLM. ☆273 · Updated this week
- A pipeline parallel training script for LLMs. ☆121 · Updated this week
- Generate Synthetic Data Using OpenAI, MistralAI or AnthropicAI ☆222 · Updated 9 months ago
- A curated list of amazing RunPod projects, libraries, and resources ☆104 · Updated 5 months ago
- ☆46 · Updated 9 months ago
- A memory framework for Large Language Models and Agents. ☆174 · Updated last month
- Experimental LLM Inference UX to aid in creative writing ☆111 · Updated last month
- Evaluate and Enhance Your LLM Deployments for Real-World Inference Needs ☆185 · Updated last month
- Automatically quantize GGUF models ☆152 · Updated this week
- Gradio UI for a Cog API ☆65 · Updated 9 months ago
- A fast batching API to serve LLM models ☆180 · Updated 9 months ago
- The code we currently use to fine-tune models. ☆112 · Updated 8 months ago
- Chat Bot with LLM and Fact Reference. RAG (Retrieval Augmented Generation) and LangChain backed ☆129 · Updated 8 months ago
- ☆151 · Updated 6 months ago
- LangEvals aggregates various language model evaluators into a single platform, providing a standard interface for a multitude of scores a… ☆45 · Updated 2 weeks ago
- A simple worker that can be used as a starting point to build your own custom RunPod Endpoint API worker. ☆89 · Updated 3 months ago
- Dagger functions to import Hugging Face GGUF models into a local ollama instance and optionally push them to ollama.com. ☆114 · Updated 8 months ago
- Low-Rank adapter extraction for fine-tuned transformers models ☆167 · Updated 8 months ago
- Live audio chats with AI using Groq Llama3-70b and Deepgram Voice ☆30 · Updated 9 months ago
- Easy to use, High Performant Knowledge Distillation for LLMs ☆40 · Updated 3 weeks ago
- ☆35 · Updated 11 months ago
- ☆52 · Updated last year
- ☆154 · Updated last week
- LoRA Explorer model to test with LoRAs using Flux.1[Dev] as the base model ☆39 · Updated 3 months ago
- ⚡️ A fast and flexible PyTorch inference server that runs locally, on any cloud or AI HW. ☆135 · Updated 7 months ago
- ☆121 · Updated last week
- After my server ui improvements were successfully merged, consider this repo a playground for experimenting, tinkering and hacking around… ☆56 · Updated 5 months ago