amodm / quantization-intro
Introduction to Quantization
☆20 · Updated last year
Alternatives and similar repositories for quantization-intro:
Users interested in quantization-intro are comparing it to the libraries listed below.
- ☆41 · Updated last year
- LLM_library is a comprehensive repository that serves as a one-stop resource for hands-on code and insightful summaries. ☆69 · Updated last year
- This repository contains the code for dataset curation and finetuning of the instruct variant of the Bilingual OpenHathi model. The resultin… ☆23 · Updated last year
- Experiments with inference on Llama ☆104 · Updated 10 months ago
- Batched LoRAs ☆341 · Updated last year
- Code for NeurIPS LLM Efficiency Challenge ☆57 · Updated last year
- ☆16 · Updated last year
- ☆94 · Updated last year
- NeurIPS Large Language Model Efficiency Challenge: 1 LLM + 1GPU + 1Day ☆255 · Updated last year
- Context Manager to profile the forward and backward times of PyTorch's nn.Module ☆83 · Updated last year
- Functional local implementations of main model parallelism approaches ☆95 · Updated 2 years ago
- A set of scripts and notebooks on LLM finetuning and dataset creation ☆106 · Updated 7 months ago
- Code repository for "Introducing Airavata: Hindi Instruction-tuned LLM" ☆59 · Updated 6 months ago
- Minimal example scripts of the Hugging Face Trainer, focused on staying under 150 lines ☆198 · Updated 11 months ago
- Understanding large language models ☆116 · Updated 2 years ago
- IndicGenBench is a high-quality, multilingual, multi-way parallel benchmark for evaluating Large Language Models (LLMs) on 4 user-facing … ☆46 · Updated 7 months ago
- Simple implementation of Speculative Sampling in NumPy for GPT-2. ☆93 · Updated last year
- Exploring finetuning public checkpoints on filtered 8K sequences from the Pile ☆115 · Updated 2 years ago
- A repository to perform self-instruct with a model on HF Hub ☆32 · Updated last year
- Check for data drift between two OpenAI multi-turn chat jsonl files. ☆37 · Updated last year
- A blueprint for creating Pretraining and Fine-Tuning datasets for Indic languages ☆106 · Updated 6 months ago
- Small scale distributed training of sequential deep learning models, built on NumPy and MPI. ☆130 · Updated last year
- ☆28 · Updated last year
- Fully fine-tune large models like Mistral, Llama-2-13B, or Qwen-14B completely for free ☆231 · Updated 5 months ago
- A Python wrapper around HuggingFace's TGI (text-generation-inference) and TEI (text-embedding-inference) servers. ☆34 · Updated 4 months ago
- Presents comprehensive benchmarks of XLA-compatible pre-trained models in Keras. ☆37 · Updated last year
- Interview Questions and Answers for the Machine Learning Engineer role ☆119 · Updated 2 years ago
- ML Research paper summaries, annotated papers and implementation walkthroughs ☆114 · Updated 3 years ago
- Various transformers for FSDP research ☆37 · Updated 2 years ago
- Training and Inference Notebooks for the RedPajama (OpenLlama) models ☆18 · Updated last year