premAI-io / serverless-examples
End-to-end examples and analysis of deploying LLMs serverless using Modal, Runpod, and Beam
☆28, updated last year
Alternatives and similar repositories for serverless-examples
Users who are interested in serverless-examples are comparing it to the repositories listed below.
- Cerule - A Tiny Mighty Vision Model (☆68, updated last month)
- Testing and evaluating the capabilities of Vision-Language models (PaliGemma) in performing computer vision tasks such as object detectio… (☆84, updated last year)
- Build Agentic workflows with function calling using open LLMs (☆28, updated last week)
- Using modal.com to process FineWeb-edu data (☆20, updated 8 months ago)
- Official repo for the paper PHUDGE: Phi-3 as Scalable Judge. Evaluate your LLMs with or without custom rubric, reference answer, absolute… (☆51, updated last year)
- (☆51, updated 2 years ago)
- (☆19, updated 2 years ago)
- BH hackathon (☆14, updated last year)
- (☆20, updated last year)
- A high performance batching router optimises max throughput for text inference workload (☆16, updated 2 years ago)
- (☆30, updated last year)
- High level library for batched embeddings generation, blazingly-fast web-based RAG and quantized indexes processing ⚡ (☆68, updated 3 weeks ago)
- GRDN.AI app for garden optimization (☆69, updated 2 weeks ago)
- Experimental Code for StructuredRAG: JSON Response Formatting with Large Language Models (☆115, updated 8 months ago)
- (☆117, updated 11 months ago)
- Notebooks using the Neural Magic libraries (☆39, updated last year)
- Simple program to manually caption your images (or any other file types) so you can use them for AI training (☆37, updated 2 years ago)
- Not financial advice. (☆28, updated 2 years ago)
- A seamless matchmaking application that is programmed with Cohere Command R+, Stanford NLP DSPy framework, Weaviate Vector store and Crew… (☆59, updated last year)
- Machine Learning Serving focused on GenAI with simplicity as the top priority. (☆59, updated 2 months ago)
- alternative way to calculating self attention (☆18, updated last year)
- Fast-track AI apps to production with LLaMA 3, Mistral, and other top LLMs! (☆21, updated last year)
- The Benefits of a Concise Chain of Thought on Problem Solving in Large Language Models (☆22, updated last year)
- LlamaWorksDB is a Retrieval Augmented Generation (RAG) product designed to interact with the documentation of various products such as Ll… (☆17, updated last year)
- (☆36, updated last year)
- Data extraction with LLM on CPU (☆68, updated 2 years ago)
- (☆17, updated last year)
- Gradio UI for a Cog API (☆71, updated last year)
- (☆31, updated 10 months ago)
- Scale your RAG pipeline using Ragswift: A scalable centralized embeddings management platform (☆38, updated last year)