FareedKhan-dev / save-llm-api-cost
A straightforward method to reduce your LLM inference API costs and token usage.
☆12 Updated last month
Alternatives and similar repositories for save-llm-api-cost
Users who are interested in save-llm-api-cost are comparing it to the libraries listed below.
- ☆29 Updated last year
- Code Repository for Blog - How to Productionize Large Language Models (LLMs) ☆11 Updated last year
- A public implementation of the ReLoRA pretraining method, built on Lightning-AI's PyTorch Lightning suite. ☆33 Updated last year
- ☆14 Updated 2 years ago
- ☆46 Updated 9 months ago
- AI Multi-agent system for real-time, adaptive supply chain coordination and optimization leveraging responsive AI clusters. ☆18 Updated last year
- ☆13 Updated last year
- Multi-Agent LLM System for Digital Scam Protection ☆10 Updated 6 months ago
- ☆26 Updated 2 weeks ago
- Modified Beam Search with periodical restart ☆12 Updated 9 months ago
- This is a repository for the course "From Beginner to LLM Developer" by Towards AI. ☆11 Updated 5 months ago
- Build Agentic workflows with function calling using open LLMs ☆28 Updated 3 weeks ago
- Tools for merging pretrained large language models. ☆19 Updated last year
- Composition of Multimodal Language Models From Scratch ☆14 Updated 10 months ago
- 100 Days of GPU Challenge ☆20 Updated 3 weeks ago
- Official repo for the paper PHUDGE: Phi-3 as Scalable Judge. Evaluate your LLMs with or without custom rubric, reference answer, absolute… ☆49 Updated 11 months ago
- Small Multimodal Vision Model "Imp-v1-3b" trained using Phi-2 and Siglip. ☆17 Updated last year
- Using multiple LLMs for ensemble Forecasting ☆16 Updated last year
- Advanced Coding AI Assistant that uses a Gradio interface to stream coding related responses. ChatRAG supports local and API inference an… ☆22 Updated last month
- ☆92 Updated 3 months ago
- Building LLMs from scratch following the book from S. Raschka ☆31 Updated 2 months ago
- Zephyr 7B beta RAG Demo inside a Gradio app powered by BGE Embeddings, ChromaDB, and Zephyr 7B Beta LLM. ☆34 Updated last year
- ☆41 Updated 6 months ago
- A Python wrapper around HuggingFace's TGI (text-generation-inference) and TEI (text-embedding-inference) servers. ☆33 Updated last month
- ☆57 Updated 4 months ago
- Agent Watch is an AgentOps monitoring library designed for Crew AI applications. ☆18 Updated 6 months ago
- Lightweight continuous batching with OpenAI compatibility using HuggingFace Transformers, including T5 and Whisper. ☆24 Updated 3 months ago
- Building Knowledge Graph-Driven Chatbot with ChatGPT and ArangoDB ☆20 Updated last year
- ☆20 Updated last year
- Tiktok is an advanced multimedia recommender system that fuses the generative modality-aware collaborative self-augmentation and contrast… ☆13 Updated last year