jorahn / llama-int8
Quantized inference code for LLaMA models
☆13 · Updated last year
Alternatives and similar repositories for llama-int8:
Users who are interested in llama-int8 are comparing it to the repositories listed below:
- A library for incremental loading of large PyTorch checkpoints ☆56 · Updated last year
- Preprint: Sheared LLaMA: Accelerating Language Model Pre-training via Structured Pruning ☆28 · Updated last year
- Benchmark that evaluates LLMs using 436 NYT Connections puzzles ☆23 · Updated this week
- Local LLM inference & management server with built-in OpenAI API ☆31 · Updated 9 months ago
- Experimental sampler to make LLMs more creative ☆30 · Updated last year
- Conversational language model toolkit for training against human preferences. ☆40 · Updated 9 months ago
- GitHub repo for Peifeng's internship project ☆13 · Updated last year
- ☆28 · Updated last year
- This benchmark tests how well LLMs incorporate a set of 10 mandatory story elements (characters, objects, core concepts, attributes, moti… ☆38 · Updated last week
- ☆26 · Updated last year
- Training hybrid models for dummies. ☆18 · Updated 2 weeks ago
- Demonstration that finetuning a RoPE model on larger sequences than the pre-trained model adapts the model context limit ☆63 · Updated last year
- Simple LLM inference server ☆20 · Updated 7 months ago
- Trying to deconstruct RWKV in understandable terms ☆14 · Updated last year
- 🤗 Transformers: State-of-the-art Natural Language Processing for PyTorch and TensorFlow 2.0. ☆56 · Updated 3 years ago
- ☆40 · Updated last year
- ☆27 · Updated last year
- Public reports detailing responses to sets of prompts by Large Language Models. ☆29 · Updated 3 weeks ago
- Editor with LLM generation tree exploration ☆10 · Updated this week
- GGML implementation of BERT model with Python bindings and quantization. ☆53 · Updated 11 months ago
- Automated prompting and scoring framework to evaluate LLMs using updated human knowledge prompts ☆111 · Updated last year
- LLM Divergent Thinking Creativity Benchmark. LLMs generate 25 unique words that start with a given letter with no connections to each oth… ☆25 · Updated last week
- ☆48 · Updated last year
- Simple setup to self-host LLaMA3-70B model with an OpenAI API ☆20 · Updated 9 months ago
- A sleek, customizable interface for managing LLMs with responsive design and easy agent personalization. ☆12 · Updated 5 months ago
- ☆27 · Updated 5 months ago
- LLM inference in C/C++ ☆20 · Updated 3 months ago
- Full finetuning of large language models without large memory requirements ☆93 · Updated last year
- A client library for LAION's effort to filter CommonCrawl with CLIP, building a large scale image-text dataset. ☆32 · Updated last year