A straightforward method to reduce your LLM inference API costs and token usage.
☆22 · May 18, 2025 · Updated 10 months ago
Alternatives and similar repositories for save-llm-api-cost
Users who are interested in save-llm-api-cost are comparing it to the libraries listed below.
- Implementation of contextual engineering pipeline with LangChain and LangGraph Agents ☆85 · Jul 29, 2025 · Updated 7 months ago
- Handling Big Data with Knowledge Graph: A Detailed Guide ☆29 · May 11, 2025 · Updated 10 months ago
- ☆11 · Sep 8, 2025 · Updated 6 months ago
- This project promulgates an automated end-to-end ML pipeline that trains a biLSTM network for sentiment analysis, experiment tracking, be… ☆15 · Feb 1, 2023 · Updated 3 years ago
- ☆21 · Jul 29, 2025 · Updated 7 months ago
- ☆51 · Mar 13, 2026 · Updated last week
- Different Types of Prompt Engineering Techniques ☆55 · May 13, 2025 · Updated 10 months ago
- Code for EMNLP2023 paper "MolCA: Molecular Graph-Language Modeling with Cross-Modal Projector and Uni-Modal Adapter". ☆12 · Dec 27, 2023 · Updated 2 years ago
- ☆10 · Jun 22, 2022 · Updated 3 years ago
- A minimal PyTorch implementation of BERT (Bidirectional Encoder Representations from Transformers) ☆12 · Mar 20, 2023 · Updated 3 years ago
- Minimal TPU implementation with 8x8 systolic array and PyTorch integration ☆56 · Jan 26, 2026 · Updated last month
- A Beginner's Guide to Monetizing Your Python AI Chatbot ☆16 · Apr 22, 2025 · Updated 11 months ago
- Load and run Llama from safetensors files in C ☆15 · Oct 24, 2024 · Updated last year
- Composition of Multimodal Language Models From Scratch ☆15 · Aug 16, 2024 · Updated last year
- From a+b to sparsemax(QK^T)V in Triton! ☆28 · Jun 19, 2025 · Updated 9 months ago
- Recommender system ☆23 · Oct 13, 2020 · Updated 5 years ago
- Tiktok is an advanced multimedia recommender system that fuses the generative modality-aware collaborative self-augmentation and contrast… ☆14 · Aug 18, 2023 · Updated 2 years ago
- The official PyTorch implementation of our proposed model MISSL (ICDE-24). ☆13 · Dec 8, 2023 · Updated 2 years ago
- Multi-Agent LLM System for Digital Scam Protection ☆12 · Dec 19, 2024 · Updated last year
- Solve text generation tasks with the GPT2 language model, including papers, code, demos, and hands-on tutorials. Using the GPT2 language model to solve text generation tasks… ☆26 · Aug 27, 2019 · Updated 6 years ago
- Calculate allowed interactions in QED ☆10 · Nov 2, 2022 · Updated 3 years ago
- This project is my attempt at automating work in Notion. ☆17 · Aug 28, 2025 · Updated 6 months ago
- The purpose of this repository is to discuss audio transformers ☆14 · Mar 12, 2026 · Updated last week
- A comprehensive hands-on project for learning GPU programming with CUDA and HIP, covering fundamental concepts through advanced optimizat… ☆35 · Nov 20, 2025 · Updated 4 months ago
- [EMNLP 2024] Enhancing High-order Interaction Awareness in LLM-based Recommender Model. ☆13 · Jan 9, 2025 · Updated last year
- G-Refer: Graph Retrieval-Augmented Large Language Model for Explainable Recommendation ☆20 · Mar 5, 2025 · Updated last year
- Translate NVIDIA Cg shading source code to OpenGL Shading source code ☆13 · Apr 23, 2014 · Updated 11 years ago
- A PyTorch tutorial of Conditional Flow Matching [Lipman22] using MNIST dataset. ☆29 · Aug 26, 2025 · Updated 6 months ago
- TensorRT ☆11 · Sep 22, 2020 · Updated 5 years ago
- This is the official code for the ACL 2025 paper "GRAM: Generative Recommendation via Semantic-aware Multi-granular Late Fusion". ☆30 · Aug 30, 2025 · Updated 6 months ago
- LLM-guided hyperparameter tuning ☆10 · Oct 7, 2023 · Updated 2 years ago
- Learn RL Techniques in 3 Easy Projects ☆18 · Oct 16, 2024 · Updated last year
- Code Repository for Blog - How to Productionize Large Language Models (LLMs) ☆12 · Mar 27, 2024 · Updated last year
- AgentsCourt: Building Judicial Decision-Making Agents with Court Debate Simulation and Legal Knowledge Augmentation (EMNLP 2024 Findings) ☆16 · Dec 30, 2024 · Updated last year
- A Snowflake SQL parser (WIP) ☆11 · May 31, 2020 · Updated 5 years ago
- A ChatGPT clone created with Next.js, Tailwind CSS, TypeScript, Firebase for Google-Authentication & Realtime Database, Vercel SWR for Data… ☆10 · Sep 18, 2023 · Updated 2 years ago
- Explore from keyword search to dense retrieval and reranking, which injects the intelligence of LLMs into your search system, making it f… ☆14 · Aug 27, 2023 · Updated 2 years ago
- Python FastAPI "Circuit Breaker" implementation ☆13 · Mar 14, 2025 · Updated last year
- This is a simple user interface for YOLOv8, a popular object detection system. The program allows the user to select a video or image fil… ☆11 · Apr 4, 2023 · Updated 2 years ago