viai957 / llama-inference
A simple implementation of Llama 1 and 2. The Llama architecture and all of its components are built from scratch in PyTorch, including GQA (Grouped Query Attention), RoPE (Rotary Positional Embeddings), RMS Norm, the FeedForward block, the Encoder (as this is only for inference), and SwiGLU (activation function).
☆13May 6, 2024Updated last year
Alternatives and similar repositories for llama-inference
Users that are interested in llama-inference are comparing it to the libraries listed below
- yolosegment2labelme - a Python package that allows you to convert YOLO segmentation prediction results to LabelMe and anylabeling JSON fo…☆10May 8, 2024Updated last year
- ☆12Dec 14, 2024Updated last year
- ☆13Sep 12, 2024Updated last year
- Multi-agent AI app built from scratch in Python without any framework dependency☆15Jan 7, 2025Updated last year
- Files used for the evaluation of uiCA☆18Dec 14, 2022Updated 3 years ago
- A barely barebone NumPy implementation of Hierarchical Temporal Memory.☆11Mar 26, 2023Updated 2 years ago
- A replication of the paper "Adaptive Mixtures of Local Experts" applied to the CIFAR-10 image classification dataset.☆12Mar 19, 2021Updated 4 years ago
- This repo implements Video generation model using Latent Diffusion Transformers(Latte) in PyTorch and provides training and inference cod…☆16Jan 6, 2025Updated last year
- A PyTorch implementation of Vector Quantized Variational Autoencoder (VQ-VAE) with EMA updates, pretrained encoder, and K-means initializ…☆20Dec 31, 2024Updated last year
- Simple CTC implementation for PyTorch☆14Oct 25, 2017Updated 8 years ago
- An HTTP Server for FPGAs☆16Sep 26, 2023Updated 2 years ago
- Fine-tuning large language models (LLMs) is crucial for enhancing performance across domain-specific task applications. This comprehensiv…☆12Sep 19, 2024Updated last year
- A straightforward explanation of how DeepSeek R1 works☆17Feb 7, 2025Updated last year
- Conformer block with Rotary Position Embedding, modified from lucidrains' implementation☆16Sep 13, 2024Updated last year
- YouTube Video Summarization App built using open source LLM and Framework like Llama 2, Haystack, Whisper, and Streamlit. This app smooth…☆22May 7, 2024Updated last year
- Simplistic implementation of Zipformer: A faster and better encoder for automatic speech recognition in PyTorch☆18Jun 3, 2024Updated last year
- Refactored content and code for CS20: TensorFlow for Deep Learning Research☆62Jan 18, 2019Updated 7 years ago
- This repo implements and trains DallE-1 on a synthetically generated dataset which has colored mnist images on texture/solid background a…☆13Oct 30, 2024Updated last year
- Securing LLMs Against Top 10 OWASP Large Language Model Vulnerabilities 2024☆20May 10, 2024Updated last year
- An implementation of the base GPT-3 Model architecture from the paper by OPENAI "Language Models are Few-Shot Learners"☆20Jun 29, 2024Updated last year
- ☆20Feb 2, 2025Updated last year
- An intelligent agent utilizing Large Language Models (LLMs) for automated financial news retrieval and stock price prediction.☆20Sep 9, 2024Updated last year
- PyTorch implementation of Conformer: Convolution-augmented Transformer for Speech Recognition☆18Apr 25, 2021Updated 4 years ago
- This is a RAG implementation using Open Source stack. BioMistral 7B has been used to build this app along with PubMedBert as an embedding…☆20Jul 31, 2024Updated last year
- Landing repository for the paper "Predicting the Order of Upcoming Tokens Improves Language Modeling"☆41Sep 12, 2025Updated 5 months ago
- Conformer RNN-Transducer☆14May 25, 2022Updated 3 years ago
- A simple implementation of a deep linear PyTorch module☆21Oct 16, 2020Updated 5 years ago
- Inference Llama/Llama2/Llama3 Models in NumPy☆21Nov 22, 2023Updated 2 years ago
- Community Implementation of the paper: "Multi-Head Mixture-of-Experts" In PyTorch☆29Jan 31, 2026Updated 2 weeks ago
- BPE modification that implements removing of the intermediate tokens during tokenizer training.☆26Nov 25, 2024Updated last year
- [NAACL 2025] Source code for MMEvalPro, a more trustworthy and efficient benchmark for evaluating LMMs☆24Sep 26, 2024Updated last year
- Data structures for data science☆23Apr 12, 2024Updated last year
- Jax/Flax implementation of Denoising Diffusion Implicit Models☆20Jul 18, 2022Updated 3 years ago
- Perplexity Lite using LangGraph, Tavily, and GPT-4.☆25May 1, 2024Updated last year
- Mixture of Cognitive Reasoners: Modular Reasoning with Brain-Like Specialization☆39Feb 7, 2026Updated last week
- Updating collection of summarization datasets in 100+ languages, based on our paper "The State and Fate of Summarization Datasets: A Surv…☆29Apr 29, 2025Updated 9 months ago
- PyTorch implementation of "Natural TTS Synthesis by Conditioning WaveNet on Mel Spectrogram Predictions", ICASSP, 2018.☆19Jan 21, 2021Updated 5 years ago
- ☆39Sep 24, 2025Updated 4 months ago
- ☆46May 20, 2025Updated 8 months ago