A straightforward method to reduce your LLM inference API costs and token usage.
☆24 · May 18, 2025 · Updated last year
Alternatives and similar repositories for save-llm-api-cost
Users that are interested in save-llm-api-cost are comparing it to the libraries listed below.
- Handling Big Data with Knowledge Graph: A Detailed Guide ☆30 · May 11, 2025 · Updated last year
- This project promulgates an automated end-to-end ML pipeline that trains a biLSTM network for sentiment analysis, experiment tracking, be… ☆16 · Feb 1, 2023 · Updated 3 years ago
- ☆25 · Jul 29, 2025 · Updated 9 months ago
- A list of developer portfolios for your inspiration ☆15 · Nov 1, 2024 · Updated last year
- Mixture of Experts from scratch ☆14 · Apr 12, 2024 · Updated 2 years ago
- A minimal PyTorch implementation of BERT (Bidirectional Encoder Representations from Transformers) ☆12 · Mar 20, 2023 · Updated 3 years ago
- Training framework for Large Behavioral Models ☆28 · Sep 17, 2025 · Updated 8 months ago
- Minimal TPU implementation with 8x8 systolic array and PyTorch integration ☆61 · Jan 26, 2026 · Updated 3 months ago
- Load and run Llama from safetensors files in C ☆15 · Oct 24, 2024 · Updated last year
- [KDD 2025] Fine-tuning Multimodal Large Language Models for Product Bundling ☆15 · Sep 20, 2025 · Updated 8 months ago
- Tiktok is an advanced multimedia recommender system that fuses the generative modality-aware collaborative self-augmentation and contrast… ☆14 · Aug 18, 2023 · Updated 2 years ago
- Multi-Agent LLM System for Digital Scam Protection ☆15 · Dec 19, 2024 · Updated last year
- ☆67 · Mar 13, 2026 · Updated 2 months ago
- Recommender system ☆26 · Oct 13, 2020 · Updated 5 years ago
- A Step-by-Step Implementation of Google Veo 3 Architecture from Scratch ☆83 · Jun 16, 2025 · Updated 11 months ago
- Exploring advanced prompting tools to query SQL database with multiple tables in natural language using LLMs ☆16 · Aug 23, 2024 · Updated last year
- The repository of paper Personalized Multimodal Response Generation with Large Language Models ☆18 · Jun 28, 2024 · Updated last year
- This project is my attempt at automating work in Notion. ☆17 · Aug 28, 2025 · Updated 8 months ago
- The purpose of this repository is to discuss on Audio transformers ☆14 · Apr 16, 2026 · Updated last month
- Open-source, knowledge-grounded conversational assistant ☆14 · Jun 30, 2025 · Updated 10 months ago
- [EMNLP 2024] Enhancing High-order Interaction Awareness in LLM-based Recommender Model. ☆13 · Jan 9, 2025 · Updated last year
- G-Refer: Graph Retrieval-Augmented Large Language Model for Explainable Recommendation ☆20 · Mar 5, 2025 · Updated last year
- A comprehensive hands-on project for learning GPU programming with CUDA and HIP, covering fundamental concepts through advanced optimizat… ☆35 · Nov 20, 2025 · Updated 6 months ago
- A curation of awesome portfolio website ideas for developers and designers to draw inspiration from. Raise a pull request to add more. 💜… ☆17 · Apr 15, 2025 · Updated last year
- A Pytorch tutorial of Conditional Flow Matching[Lipman22] using MNIST dataset. ☆32 · Aug 26, 2025 · Updated 8 months ago
- A Transformer Model Exploiting Histology Images and Spatial Gene Expression ☆22 · Mar 18, 2025 · Updated last year
- Translate Nvidia Cg shading source code to Open GL Shading source code ☆13 · Apr 23, 2014 · Updated 12 years ago
- LLM-guided hyperparameter tuning ☆10 · Oct 7, 2023 · Updated 2 years ago
- AgentsCourt: Building Judicial Decision-Making Agents with Court Debate Simulation and Legal Knowledge Augmentation (EMNLP 2024 Findings) ☆16 · Dec 30, 2024 · Updated last year
- An easy auto framework ☆11 · Nov 14, 2023 · Updated 2 years ago
- Explore from keyword search to dense retrieval and reranking, which injects the intelligence of LLMs into your search system, making it f… ☆14 · Aug 27, 2023 · Updated 2 years ago
- Python FastApi "Circuit Breaker" implementation ☆13 · Mar 14, 2025 · Updated last year
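- The Al-Qur'an packaged in the form of a chatbot ☆15 · Dec 1, 2020 · Updated 5 years ago
- Based on an e-commerce shopping-guide bot: natural language understanding (NLU), text error correction, and disambiguation of ambiguous words ☆12 · May 5, 2020 · Updated 6 years ago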
- AI Voice Agents: Exploring the Next Generation of Human-Machine Interaction! 🎙️🤖🎧 ☆10 · Aug 30, 2024 · Updated last year
- A lightweight Python library for running TTS models with a unified API. ☆20 · Feb 18, 2025 · Updated last year
- Official source code for AAAI 2025 paper: Augmenting Sequential Recommendation with Balanced Relevance and Diversity ☆25 · Apr 16, 2025 · Updated last year
- The LangChain wrapper of Milvus vector database for efficient vector search, full-text search, hybrid retrieval and RAG. ☆54 · Apr 28, 2026 · Updated 3 weeks ago
- A playground for experimenting with acoustic echo cancellation using a microphone, speaker, and ONNX. ☆13 · Oct 22, 2024 · Updated last year