Inference script for Meta's LLaMA models using Hugging Face wrapper
☆109 · Mar 24, 2023 · Updated 3 years ago
Alternatives and similar repositories for llama_infer
Users who are interested in llama_infer are comparing it to the libraries listed below.
- This repository contains the files used for our Interspeech 2017 paper. ☆16 · May 30, 2017 · Updated 9 years ago
- A model compression and acceleration toolbox based on PyTorch. ☆331 · Jan 12, 2024 · Updated 2 years ago
- GPTQ inference TVM kernel ☆40 · Apr 25, 2024 · Updated 2 years ago
- This is a new metric that can be used to evaluate faithfulness of text generated by LLMs. The work behind this repository can be found he… ☆31 · Aug 25, 2023 · Updated 2 years ago
- ☆456 · Oct 15, 2023 · Updated 2 years ago
- Cheat sheets for Math and CS studies. ☆17 · Feb 8, 2021 · Updated 5 years ago
- ☆27 · Jan 17, 2026 · Updated 5 months ago
- ☆23 · Feb 26, 2024 · Updated 2 years ago
- Handy tools & graphics API abstraction for blazing fast prototyping ☆10 · Jan 17, 2024 · Updated 2 years ago
- This repository provides scripts for evaluating NLP models on the LEXTREME benchmark, a set of diverse multilingual tasks in legal NLP ☆25 · Dec 28, 2023 · Updated 2 years ago
- A high-throughput and memory-efficient inference and serving engine for LLMs ☆17 · Jun 3, 2024 · Updated 2 years ago
- Codebase for the EMNLP 2021 paper "HittER: Hierarchical Transformers for Knowledge Graph Embeddings". ☆12 · Nov 1, 2021 · Updated 4 years ago
- ☆16 · Apr 14, 2021 · Updated 5 years ago
- What Has Been Enhanced in my Knowledge-Enhanced Language Model? ☆13 · Oct 26, 2022 · Updated 3 years ago
- Benchmark tests supporting the TiledCUDA library. ☆19 · Nov 19, 2024 · Updated last year
- Your finetuned model's back to its original safety standards faster than you can say "SafetyLock"! ☆11 · Oct 16, 2024 · Updated last year
- Momentum Decoding: Open-ended Text Generation as Graph Exploration ☆19 · Jan 27, 2023 · Updated 3 years ago
- A Python library to find differences between audio and transcriptions ☆20 · Nov 14, 2023 · Updated 2 years ago
- [EMNLP 2023] Official repository for Dialogue Chain-of-Thought Distillation (DONUT & DOCTOR) ☆11 · Nov 15, 2023 · Updated 2 years ago
- ☆98 · Jun 4, 2024 · Updated 2 years ago
- GPTQ inference Triton kernel ☆322 · May 18, 2023 · Updated 3 years ago
- MegEngine build with cu11x ☆17 · Mar 13, 2023 · Updated 3 years ago
- ☆23 · Nov 6, 2022 · Updated 3 years ago
- ☆15 · Jul 25, 2024 · Updated last year
- [AAAI 2024] MELO: Enhancing Model Editing with Neuron-indexed Dynamic LoRA ☆28 · Apr 9, 2024 · Updated 2 years ago
- Visual search interface ☆11 · Nov 30, 2021 · Updated 4 years ago
- ☆17 · Apr 14, 2023 · Updated 3 years ago
- Python toolbox to load, parse and process Official Journals of the European Union (EU). ☆24 · May 3, 2024 · Updated 2 years ago
- A list of Numerical Multimodal reasoning papers and their implementation ☆11 · May 13, 2024 · Updated 2 years ago
- Fork of the awesome function_parser library from GitHub's CodeSearchNet Challenge repo: https://github.com/github/CodeSearchNet/tree/mast… ☆30 · Feb 26, 2023 · Updated 3 years ago
- ☆897 · Jun 7, 2023 · Updated 3 years ago
- The brain of my Screeps army! ☆15 · Dec 6, 2022 · Updated 3 years ago
- Code for the arXiv preprint "The Unreasonable Effectiveness of Easy Training Data" ☆48 · Jan 17, 2024 · Updated 2 years ago
- ☆14 · Oct 6, 2025 · Updated 8 months ago
- Frontend (and soon also middleware and backend) for a new, open-source image generation platform. ☆14 · Nov 5, 2022 · Updated 3 years ago
- Experiments for "A Closer Look at In-Context Learning under Distribution Shifts" ☆18 · May 29, 2023 · Updated 3 years ago
- TiledLower is a Dataflow Analysis and Codegen Framework written in Rust. ☆13 · Nov 23, 2024 · Updated last year
- Open deep learning compiler stack for CPU, GPU and specialized accelerators ☆19 · Jun 5, 2026 · Updated 2 weeks ago
- argparse extension for hpman ☆17 · Dec 4, 2022 · Updated 3 years ago