hkproj / mistral-src-commentedLinks

Reference implementation of Mistral AI 7B v0.1 model.

☆28

Alternatives and similar repositories for mistral-src-commented

Users that are interested in mistral-src-commented are comparing it to the libraries listed below

Sorting:

hkproj / lazy-ml
ML algorithms implementations that are good for learning the underlying principles
☆25Updated 10 months ago
TrelisResearch / install-guides
Various installation guides for Large Language Models
☆75Updated 6 months ago
hkproj / multi-latent-attention
☆45Updated 5 months ago
1y33 / 100Days
GPU Kernels
☆203Updated 6 months ago
ThinamXx / Meta-llama
Complete implementation of Llama2 with/without KV cache & inference 🚀
☆48Updated last year
TrelisResearch / one-click-llms
One click templates for inferencing Language Models
☆215Updated 2 months ago
hkproj / pytorch-lora
LORA: Low-Rank Adaptation of Large Language Models implemented using PyTorch
☆117Updated 2 years ago
ayulockin / neurips-llm-efficiency-challenge
Starter pack for NeurIPS LLM Efficiency Challenge 2023.
☆126Updated 2 years ago
tcapelle / llm_recipes
A set of scripts and notebooks on LLM finetunning and dataset creation
☆110Updated last year
anyscale / e2e-llm-workflows
Fine-tune an LLM to perform batch inference and online serving.
☆113Updated 4 months ago
hkproj / pytorch-llama
LLaMA 2 implemented from scratch in PyTorch
☆358Updated 2 years ago
hkproj / mistral-llm-notes
Notes on the Mistral AI model
☆20Updated last year
hkproj / quantization-notes
Notes on quantization in neural networks
☆104Updated last year
hkproj / pytorch-transformer-distributed
Distributed training (multi-node) of a Transformer model
☆85Updated last year
FareedKhan-dev / Building-llama3-from-scratch
LLaMA 3 is one of the most promising open-source model after Mistral, we will recreate it's architecture in a simpler manner.
☆186Updated last year
evintunador / triton_docs_tutorials
making the official triton tutorials actually comprehensible
☆57Updated 2 months ago
hkproj / transformer-from-scratch-notes
Notes about "Attention is all you need" video (https://www.youtube.com/watch?v=bCz4OMemCcA)
☆318Updated 2 years ago
wolfecameron / nanoMoE
An extension of the nanoGPT repository for training small MOE models.
☆205Updated 7 months ago
hkproj / triton-flash-attention
☆210Updated 9 months ago
Pleias / Various-Finetuning
Set of scripts to finetune LLMs
☆38Updated last year
MekkCyber / TritonAcademy
A repository to unravel the language of GPUs, making their kernel conversations easy to understand
☆194Updated 4 months ago
Locutusque / TPU-Alignment
Fully fine-tune large models like Mistral, Llama-2-13B, or Qwen-14B completely for free
☆231Updated 11 months ago
AviSoori1x / seemore
From scratch implementation of a vision language model in pure PyTorch
☆246Updated last year
YuvrajSingh-mist / Paper-Replications
A repository consisting of paper/architecture replications of classic/SOTA AI/ML papers in pytorch
☆383Updated 3 weeks ago
Pints-AI / 1.5-Pints
A compact LLM pretrained in 9 days by using high quality data
☆330Updated 6 months ago
rkinas / triton-resources
A curated list of resources for learning and exploring Triton, OpenAI's programming language for writing efficient GPU code.
☆421Updated 7 months ago
rasbt / dora-from-scratch
LoRA and DoRA from Scratch Implementations
☆211Updated last year
ServiceNow / Fast-LLM
Accelerating your LLM training to full speed! Made with ❤️ by ServiceNow Research
☆254Updated last week
silvaxxx1 / MyLLM
"LLM from Zero to Hero: An End-to-End Large Language Model Journey from Data to Application!"
☆134Updated 3 weeks ago
facebookresearch / LayerSkip
Code for "LayerSkip: Enabling Early Exit Inference and Self-Speculative Decoding", ACL 2024
☆344Updated 5 months ago