isEmmanuelOlowe / llm-cost-estimator
Estimating hardware and cloud costs of LLMs and transformer projects
☆11 Updated last year
Related projects
Alternatives and complementary repositories for llm-cost-estimator
- ☆38 Updated 4 months ago
- This is an NVIDIA AI Workbench example project that demonstrates an end-to-end model development workflow using Llamafactory. ☆37 Updated last month
- Deploy your autonomous agents to production grade environments with 99% Uptime Guarantee, Infinite Scalability, and self-healing. ☆27 Updated last week
- Compression for Foundation Models ☆19 Updated last month
- A source-to-source compiler for optimizing CUDA dynamic parallelism by aggregating launches ☆14 Updated 5 years ago
- ☆55 Updated 6 months ago
- Pretrain, finetune and serve LLMs on Intel platforms with Ray ☆103 Updated last week
- Machine Learning Agility (MLAgility) benchmark and benchmarking tools ☆38 Updated 3 weeks ago
- GGUF Quantization of any LLM. ☆31 Updated 8 months ago
- ☆19 Updated last year
- ☆12 Updated this week
- Repo hosting code and materials related to speeding up LLM inference using token merging. ☆29 Updated 6 months ago
- Tools for merging pretrained large language models. ☆19 Updated 5 months ago
- BUD-E (Buddy) is an open-source voice assistant framework that facilitates seamless interaction with AI models and APIs, enabling the cre… ☆11 Updated last month
- Tutorial to get started with SkyPilot! ☆56 Updated 6 months ago
- An external retriever for GPTs implemented with Zilliz Cloud Pipelines, a more flexible and economic alternative to default GPTs knowledg… ☆16 Updated 8 months ago
- 🚀 Scale your RAG pipeline using Ragswift: A scalable centralized embeddings management platform ☆36 Updated 9 months ago
- Code for paper: "QuIP: 2-Bit Quantization of Large Language Models With Guarantees" adapted for Llama models ☆36 Updated last year
- BH hackathon ☆14 Updated 7 months ago
- Breaking Throughput-Latency Trade-off for Long Sequences with Speculative Decoding ☆82 Updated last week
- AI Assistant running within your browser. ☆45 Updated 3 weeks ago
- TensorRT LLM Benchmark Configuration ☆11 Updated 3 months ago
- Uses a Gradio interface to stream coding-related responses from local and cloud-based large language models. Pulls context from GitHub Re… ☆15 Updated 2 months ago
- A toolkit that enhances PyTorch with specialized functions for low-bit quantized neural networks. ☆28 Updated 5 months ago
- Transform unstructured documents into actionable, structured data with enterprise-grade precision and reliability, ready for large-scale … ☆11 Updated 2 weeks ago
- Python Server for C3 AI app. A project that brings the power of Large Language Models (LLM) and Retrieval-Augmented Generation (RAG) with… ☆22 Updated 10 months ago
- GitHub repo for Peifeng's internship project ☆12 Updated last year
- This repository contains Dockerfiles, scripts, yaml files, Helm charts, etc. used to scale out AI containers with versions of TensorFlow … ☆27 Updated this week
- A toolkit for fine-tuning, inferencing, and evaluating GreenBitAI's LLMs. ☆74 Updated this week