A straightforward method to reduce your LLM inference API costs and token usage.
☆24May 18, 2025Updated last year
Alternatives and similar repositories for save-llm-api-cost
Users that are interested in save-llm-api-cost are comparing it to the libraries listed below.
- Implementation of 12 AI agents evaluation techniques☆43Jul 31, 2025Updated 10 months ago
- Implementation of contextual engineering pipeline with LangChain and LangGraph Agents☆92Jul 29, 2025Updated 10 months ago
- Handling Big Data with Knowledge Graph: A Detailed Guide☆30May 11, 2025Updated last year
- This project promulgates an automated end-to-end ML pipeline that trains a biLSTM network for sentiment analysis, experiment tracking, be…☆16Feb 1, 2023Updated 3 years ago
- ☆27Jul 29, 2025Updated 10 months ago
- A list of developer portfolios for your inspiration☆15Nov 1, 2024Updated last year
- Mixture of Experts from scratch☆14Apr 12, 2024Updated 2 years ago
- Code for EMNLP2023 paper "MolCA: Molecular Graph-Language Modeling with Cross-Modal Projector and Uni-Modal Adapter".☆12Dec 27, 2023Updated 2 years ago
- ☆10Jun 22, 2022Updated 3 years ago
- FlawlessChips is a C# library that provides gate-level simulation of various 8-bit chips.☆10Mar 15, 2026Updated 2 months ago
- A minimal PyTorch implementation of BERT (Bidirectional Encoder Representations from Transformers)☆12Mar 20, 2023Updated 3 years ago
- A Beginner's Guide to Monetizing Your Python AI Chatbot☆17Apr 22, 2025Updated last year
- Load and run Llama from safetensors files in C☆15Oct 24, 2024Updated last year
- [KDD 2025] Fine-tuning Multimodal Large Language Models for Product Bundling☆15Sep 20, 2025Updated 8 months ago
- Tiktok is an advanced multimedia recommender system that fuses the generative modality-aware collaborative self-augmentation and contrast…☆14Aug 18, 2023Updated 2 years ago
- [2025 ACL Findings] Measuring What Makes You Unique: Difference-Aware User Modeling for Enhancing LLM Personalization☆25Oct 29, 2025Updated 7 months ago
- Multi-Agent LLM System for Digital Scam Protection☆15Dec 19, 2024Updated last year
- Solve text generation tasks with the GPT2 language model, including papers, code, demos, and hands-on tutorials. …☆26Aug 27, 2019Updated 6 years ago
- Recommender system☆27Oct 13, 2020Updated 5 years ago
- ☆81Mar 13, 2026Updated 3 months ago
- The repository of paper Personalized Multimodal Response Generation with Large Language Models☆18Jun 28, 2024Updated last year
- This project is my attempt at automating work in Notion.☆17Aug 28, 2025Updated 9 months ago
- The purpose of this repository is to discuss on Audio transformers☆14Apr 16, 2026Updated last month
- A Pytorch tutorial of Conditional Flow Matching[Lipman22] using MNIST dataset.☆32Aug 26, 2025Updated 9 months ago
- implement GPT-OSS 20B & 120B C++ inference from scratch on AMD GPUs☆175Oct 25, 2025Updated 7 months ago
- 📝🤖 WriteAI - Simplify your writing process with AI. Generate emails 📧, articles 📝, essays 📚, & more with ease. Writing is made easy …☆12Feb 21, 2023Updated 3 years ago
- LLM-guided hyperparameter tuning☆10Oct 7, 2023Updated 2 years ago
- Code Repository for Blog - How to Productionize Large Language Models (LLMs)☆12Mar 27, 2024Updated 2 years ago
- An easy auto framework☆11Nov 14, 2023Updated 2 years ago
- Learn RL Techniques in 3 Easy Projects☆20Oct 16, 2024Updated last year
- Explore from keyword search to dense retrieval and reranking, which injects the intelligence of LLMs into your search system, making it f…☆14Aug 27, 2023Updated 2 years ago
- The Qur'an packaged in the form of a chatbot☆15Dec 1, 2020Updated 5 years ago
- This is the official code for the ACL 2025 paper "GRAM: Generative Recommendation via Semantic-aware Multi-granular Late Fusion".☆32Mar 23, 2026Updated 2 months ago
- Chinese corpus: a large number of manually annotated samples, very valuable!☆11Aug 15, 2019Updated 6 years ago
- Implementation of our paper, "MM-Forecast: A Multimodal Approach to Temporal Event Forecasting with Large Language Models".☆18Apr 16, 2025Updated last year
- Code for ICML 2025 paper | Joint Localization and Activation Editing for Low-Resource Fine-Tuning☆28Jun 18, 2025Updated 11 months ago
- [ACM TOMM'2025] "MMHCL: Multi-Modal Hypergraph Contrastive Learning for Recommendation"☆32Aug 13, 2025Updated 10 months ago
- Official source code for AAAI 2025 paper: CoRA: Collaborative Information Perception by Large Language Model's Weights for Recommendatio…☆18Dec 11, 2024Updated last year
- C# library with very fast but not very accurate realisations of System.Math methods.☆12Jun 4, 2017Updated 9 years ago