RahulSChand / llama2.c-for-dummies
Step-by-step explanation/tutorial of llama2.c
☆223 · Updated last year
Alternatives and similar repositories for llama2.c-for-dummies
Users interested in llama2.c-for-dummies are comparing it to the libraries listed below
- llama3.cuda is a pure C/CUDA implementation of the Llama 3 model.☆339 · Updated 3 months ago
- Easy and Efficient Quantization for Transformers☆198 · Updated last month
- Sakura-SOLAR-DPO: Merge, SFT, and DPO☆116 · Updated last year
- Efficient fine-tuning for ko-llm models☆182 · Updated last year
- Newsletter bot for 🤗 Daily Papers☆126 · Updated last week
- OSLO: Open Source for Large-scale Optimization☆175 · Updated last year
- A high-throughput and memory-efficient inference and serving engine for LLMs☆266 · Updated 9 months ago
- Manage histories of LLM-based applications☆91 · Updated last year
- Inference of Mamba models in pure C☆189 · Updated last year
- 1-Click is all you need.☆62 · Updated last year
- Ditto is an open-source framework that enables direct conversion of HuggingFace PreTrainedModels into TensorRT-LLM engines.☆46 · Updated 3 weeks ago
- Inference Llama 2 in one file of pure C++☆83 · Updated 2 years ago
- Extension of Langchain for RAG. Easy benchmarking, multiple retrievals, reranker, time-aware RAG, and so on...☆281 · Updated last year
- Yet Another Language Model: LLM inference in C++/CUDA, no libraries except for I/O☆396 · Updated 2 months ago
- Code to train a GPT-2 model on the TinyStories dataset, following the TinyStories paper☆39 · Updated last year
- llama3.np is a pure NumPy implementation of the Llama 3 model.☆987 · Updated 3 months ago
- Code for the paper "QMoE: Practical Sub-1-Bit Compression of Trillion-Parameter Models".☆277 · Updated last year
- Evolve LLM training instructions from English into any language.☆118 · Updated last year
- Code for the paper "QuIP: 2-Bit Quantization of Large Language Models With Guarantees"☆376 · Updated last year
- vLLM: A high-throughput and memory-efficient inference and serving engine for LLMs☆87 · Updated last week
- An innovative library for efficient LLM inference via low-bit quantization☆349 · Updated 11 months ago
- Inference of Llama/Llama2/Llama3 models in NumPy☆21 · Updated last year
- SGLang is a fast serving framework for large language models and vision language models.☆24 · Updated last week
- A lightweight adjustment tool for smoothing token probabilities in the Qwen models to encourage balanced multilingual generation.☆78 · Updated last month
- Simple implementation of Speculative Sampling in NumPy for GPT-2.☆95 · Updated last year
- ONNX Runtime Server: provides TCP and HTTP/HTTPS REST APIs for ONNX inference.☆166 · Updated 2 months ago
- A performance library for machine learning applications.☆184 · Updated last year