premAI-io / serverless-examples
End-to-end examples and analysis of deploying LLMs serverless using Modal, Runpod, and Beam
★27 · Updated 7 months ago
Related projects
Alternatives and complementary repositories for serverless-examples
- Using modal.com to process FineWeb-edu data ★19 · Updated 2 months ago
- Simple examples using Argilla tools to build AI ★40 · Updated this week
- Official repo for the paper PHUDGE: Phi-3 as Scalable Judge. Evaluate your LLMs with or without custom rubric, reference answer, absolute… ★48 · Updated 4 months ago
- High level library for batched embeddings generation, blazingly-fast web-based RAG and quantized indexes processing ⚡ ★61 · Updated 2 weeks ago
- ★18 · Updated this week
- Build Agentic workflows with function calling ★20 · Updated this week
- The Benefits of a Concise Chain of Thought on Problem Solving in Large Language Models ★20 · Updated 9 months ago
- BH hackathon ★14 · Updated 7 months ago
- Uses a Gradio interface to stream coding related responses from local and cloud based large language models. Pulls context from GitHub Re… ★15 · Updated 2 months ago
- LlamaWorksDB is a Retrieval Augmented Generation (RAG) product designed to interact with the documentation of various products such as Ll… ★15 · Updated 6 months ago
- A seamless matchmaking application that is programmed with Cohere Command R+, Stanford NLP DSPy framework, Weaviate Vector store and Crew… ★58 · Updated 7 months ago
- OpenMindedChatbot is a Proof Of Concept that leverages the power of Open source Large Language Models (LLM) with Function Calling capabil… ★28 · Updated 11 months ago
- RAG example using DSPy, Gradio, FastAPI ★66 · Updated 7 months ago
- Testing the different LLM and RAG Tests while I learn along the way ★17 · Updated 2 months ago
- Code for evaluating with Flow-Judge-v0.1 - an open-source, lightweight (3.8B) language model optimized for LLM system evaluations. Crafte… ★53 · Updated 3 weeks ago
- ★1 · Updated 4 months ago
- Apps that run on modal.com ★12 · Updated 5 months ago
- ★64 · Updated 5 months ago
- Using multiple LLMs for ensemble Forecasting ★16 · Updated 10 months ago
- Set of scripts to finetune LLMs ★36 · Updated 7 months ago
- Data extraction with LLM on CPU ★66 · Updated last year
- A public implementation of the ReLoRA pretraining method, built on Lightning-AI's Pytorch Lightning suite. ★33 · Updated 8 months ago
- Routing on Random Forest (RoRF) ★84 · Updated last month
- ★20 · Updated 9 months ago
- Unleash the full potential of exascale LLMs on consumer-class GPUs, proven by extensive benchmarks, with no long-term adjustments and min… ★23 · Updated last week
- A Python library to orchestrate LLMs in a neural network-inspired structure ★41 · Updated last month
- Experimental Code for StructuredRAG: Structured Outputs in Retrieval-Augmented Generation ★94 · Updated this week
- never forget anything again! combine AI and intelligent tooling for a local knowledge base to track catalogue, annotate, and plan for you… ★32 · Updated 6 months ago
- LLM reads a paper and produce a working prototype ★36 · Updated last week
- ★48 · Updated last year