RahulSChand / llama2.c-for-dummies
Step by step explanation/tutorial of llama2.c
☆225Updated 2 years ago
Alternatives and similar repositories for llama2.c-for-dummies
Users who are interested in llama2.c-for-dummies are comparing it to the libraries listed below.
- llama3.cuda is a pure C/CUDA implementation of the Llama 3 model.☆350Updated 9 months ago
- Easy and Efficient Quantization for Transformers☆202Updated 7 months ago
- OSLO: Open Source for Large-scale Optimization☆175Updated 2 years ago
- Sakura-SOLAR-DPO: Merge, SFT, and DPO☆116Updated 2 years ago
- Efficient fine-tuning for ko-llm models☆185Updated last year
- 1-Click is all you need.☆63Updated last year
- Newsletter bot for 🤗 Daily Papers☆133Updated 2 weeks ago
- Inference of Mamba and Mamba2 models in pure C☆196Updated last week
- A lightweight adjustment tool for smoothing token probabilities in the Qwen models to encourage balanced multilingual generation.☆103Updated 6 months ago
- Manage histories of LLM-powered applications☆91Updated 2 years ago
- Inference of Llama/Llama2/Llama3 models in NumPy☆21Updated 2 years ago
- A high-throughput and memory-efficient inference and serving engine for LLMs☆267Updated last month
- Extension of Langchain for RAG. Easy benchmarking, multiple retrievals, reranker, time-aware RAG, and so on...☆284Updated 2 years ago
- Evolve LLM training instructions from English into any language.☆119Updated 2 years ago
- llama3.np is a pure NumPy implementation of the Llama 3 model.☆992Updated 9 months ago
- A performance library for machine learning applications.☆184Updated 2 years ago
- An innovative library for efficient LLM inference via low-bit quantization☆352Updated last year
- Ditto is an open-source framework that enables direct conversion of HuggingFace PreTrainedModels into TensorRT-LLM engines.☆55Updated 6 months ago
- A minimal cache manager for PagedAttention, built on top of llama3.☆133Updated last year
- Comparison of Language Model Inference Engines☆239Updated last year
- Inference Llama 2 in one file of pure C++☆87Updated 2 years ago
- Triangle in Practice! Triton☆16Updated last year
- Simple implementation of Speculative Sampling in NumPy for GPT-2.☆99Updated 2 years ago
- The Universe of Evaluation. All about the evaluation for LLMs.☆231Updated last year
- "Learning-based One-line intelligence Owner Network Connectivity Tool"☆15Updated 2 years ago
- Yet Another Language Model: LLM inference in C++/CUDA, no libraries except for I/O☆549Updated 4 months ago