isEmmanuelOlowe / llm-cost-estimator
Estimating hardware and cloud costs of LLMs and transformer projects
☆17 Updated last week
Alternatives and similar repositories for llm-cost-estimator
Users who are interested in llm-cost-estimator are comparing it to the libraries listed below.
- A curated list of datasets for large language models (LLMs), RLHF and related resources (continually updated) ☆23 Updated 2 years ago
- Optimizing Causal LMs through GRPO with weighted reward functions and automated hyperparameter tuning using Optuna ☆53 Updated 4 months ago
- Repository for CPU Kernel Generation for LLM Inference ☆26 Updated last year
- ☆36 Updated last month
- Self-host LLMs with LMDeploy and BentoML ☆20 Updated 2 weeks ago
- Latent Large Language Models ☆18 Updated 10 months ago
- vLLM: A high-throughput and memory-efficient inference and serving engine for LLMs ☆87 Updated this week
- ☆41 Updated 2 weeks ago
- Build compute kernels ☆68 Updated this week
- Compression for Foundation Models ☆31 Updated 3 months ago
- Cascade Speculative Drafting ☆29 Updated last year
- Unleash the full potential of exascale LLMs on consumer-class GPUs, proven by extensive benchmarks, with no long-term adjustments and min… ☆26 Updated 7 months ago
- AskIt: Unified programming interface for programming with LLMs (GPT-3.5, GPT-4, Gemini, Claude, Cohere, Llama 2) ☆79 Updated 5 months ago
- ☆48 Updated 11 months ago
- Repo hosting codes and materials related to speeding LLMs' inference using token merging. ☆36 Updated last year
- The Benefits of a Concise Chain of Thought on Problem Solving in Large Language Models ☆22 Updated 7 months ago
- ☆68 Updated this week
- Implementation of SelfExtend from the paper "LLM Maybe LongLM: Self-Extend LLM Context Window Without Tuning" from PyTorch and Zeta ☆13 Updated 7 months ago
- Data preparation code for CrystalCoder 7B LLM ☆45 Updated last year
- ☆39 Updated 2 years ago
- Repository for Sparse Finetuning of LLMs via modified version of the MosaicML llmfoundry ☆42 Updated last year
- GitHub repo for Peifeng's internship project ☆13 Updated last year
- A pipeline for using API calls to agnostically convert unstructured data into structured training data ☆30 Updated 9 months ago
- A collection of reproducible inference engine benchmarks ☆31 Updated 2 months ago
- Make triton easier ☆46 Updated last year
- Visualize expert firing frequencies across sentences in the Mixtral MoE model ☆18 Updated last year
- new optimizer ☆20 Updated 10 months ago
- MPI Code Generation through Domain-Specific Language Models ☆14 Updated 7 months ago
- Minimal C implementation of speculative decoding based on llama2.c ☆23 Updated 11 months ago
- Unit Scaling demo and experimentation code ☆16 Updated last year