A straightforward method to reduce your LLM inference API costs and token usage.
☆24May 18, 2025Updated 11 months ago
Alternatives and similar repositories for save-llm-api-cost
Users that are interested in save-llm-api-cost are comparing it to the libraries listed below.
- Implementation of 12 AI agents evaluation techniques☆43Jul 31, 2025Updated 9 months ago
- Handling Big Data with Knowledge Graph: A Detailed Guide☆30May 11, 2025Updated 11 months ago
- ☆11Sep 8, 2025Updated 7 months ago
- This project promulgates an automated end-to-end ML pipeline that trains a biLSTM network for sentiment analysis, experiment tracking, be…☆16Feb 1, 2023Updated 3 years ago
- ☆24Jul 29, 2025Updated 9 months ago
- Mixture of Experts from scratch☆13Apr 12, 2024Updated 2 years ago
- ☆10Jun 22, 2022Updated 3 years ago
- FlawlessChips is a C# library that provides gate-level simulation of various 8-bit chips.☆10Mar 15, 2026Updated last month
- A minimal PyTorch implementation of BERT (Bidirectional Encoder Representations from Transformers)☆12Mar 20, 2023Updated 3 years ago
- ☆60Mar 13, 2026Updated last month
- A Beginner's Guide to Monetizing Your Python AI Chatbot☆16Apr 22, 2025Updated last year
- Load and run Llama from safetensors files in C☆15Oct 24, 2024Updated last year
- [KDD 2025] Fine-tuning Multimodal Large Language Models for Product Bundling☆15Sep 20, 2025Updated 7 months ago
- Composition of Multimodal Language Models From Scratch☆15Aug 16, 2024Updated last year
- [2025 ACL Findings] Measuring What Makes You Unique: Difference-Aware User Modeling for Enhancing LLM Personalization☆25Oct 29, 2025Updated 6 months ago
- The official PyTorch implementation of our proposed model MISSL (ICDE-24).☆13Dec 8, 2023Updated 2 years ago
- Solve text generation tasks with the language model GPT2, including papers, code, demos, and hands-on tutorials. Using the language model GPT2 to solve text generation tasks…☆26Aug 27, 2019Updated 6 years ago
- Calculate allowed interactions in QED☆10Nov 2, 2022Updated 3 years ago
- From a+b to sparsemax(QK^T)V in Triton!☆32Jun 19, 2025Updated 10 months ago
- The repository of paper Personalized Multimodal Response Generation with Large Language Models☆18Jun 28, 2024Updated last year
- Open-source, knowledge-grounded conversational assistant☆14Jun 30, 2025Updated 10 months ago
- [EMNLP 2024] Enhancing High-order Interaction Awareness in LLM-based Recommender Model.☆13Jan 9, 2025Updated last year
- A curation of awesome portfolio website ideas for developers and designers to draw inspiration from. Raise a pull request to add more. 💜…☆17Apr 15, 2025Updated last year
- Synthetic data generation for evaluating LLM symbolic and logic reasoning☆22Mar 6, 2026Updated last month
- A PyTorch tutorial on Conditional Flow Matching [Lipman22] using the MNIST dataset.☆30Aug 26, 2025Updated 8 months ago
- Translate Nvidia Cg shading source code to OpenGL shading source code☆13Apr 23, 2014Updated 12 years ago
- TensorRT☆11Sep 22, 2020Updated 5 years ago
- Implement GPT-OSS 20B & 120B C++ inference from scratch on AMD GPUs☆171Oct 25, 2025Updated 6 months ago
- MIT licenced .NET document db with IQueryable support☆25Nov 23, 2013Updated 12 years ago
- 📝🤖 WriteAI - Simplify your writing process with AI. Generate emails 📧, articles 📝, essays 📚, & more with ease. Writing is made easy …☆12Feb 21, 2023Updated 3 years ago
- LLM-guided hyperparameter tuning☆10Oct 7, 2023Updated 2 years ago
- A ChatGPT clone created with NextJs, TailwindCSS, Typescript, Firebase for Google-Authentication & Realtime Database, Vercel SWR for Data…☆10Sep 18, 2023Updated 2 years ago
- Explore from keyword search to dense retrieval and reranking, which injects the intelligence of LLMs into your search system, making it f…☆14Aug 27, 2023Updated 2 years ago
- [KDD'25] Flow Matching for Collaborative Filtering☆22Sep 6, 2025Updated 7 months ago
- This is a simple user interface for YOLOv8, a popular object detection system. The program allows the user to select a video or image fil…☆11Apr 4, 2023Updated 3 years ago
- The Qur'an packaged in the form of a chatbot☆15Dec 1, 2020Updated 5 years ago
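- Based on an e-commerce shopping-guide bot: natural language understanding (NLU), text error correction, and ambiguous-word disambiguation☆12May 5, 2020Updated 5 years ago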
- This is the official code for the ACL 2025 paper "GRAM: Generative Recommendation via Semantic-aware Multi-granular Late Fusion".☆31Mar 23, 2026Updated last month
- Chinese corpus: a large number of manually annotated samples, very valuable!☆11Aug 15, 2019Updated 6 years ago