lastmile-ai / llama-retrieval-plugin
LLaMa retrieval plugin script using OpenAI's retrieval plugin
⭐ 324 · Updated 2 years ago
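For orientation, llama-retrieval-plugin connects a LLaMA model to OpenAI's chatgpt-retrieval-plugin so the model can answer questions over retrieved documents. Below is a minimal sketch of that pattern, not code from the repo: it assumes the plugin server is running locally on port 8000 and a local model loaded via llama-cpp-python; the URL, bearer token, and model path are placeholders.

```python
# Minimal sketch (not the repo's actual code): query a locally running
# retrieval plugin for context, then answer with a local LLaMA model.
import requests
from llama_cpp import Llama  # assumes llama-cpp-python is installed

PLUGIN_URL = "http://localhost:8000/query"  # hypothetical local plugin server
BEARER_TOKEN = "your-plugin-bearer-token"   # token set when launching the plugin

def retrieve(query: str, top_k: int = 3) -> list[str]:
    """POST to the plugin's /query endpoint (chatgpt-retrieval-plugin schema)
    and return the text of the top_k matching document chunks."""
    resp = requests.post(
        PLUGIN_URL,
        headers={"Authorization": f"Bearer {BEARER_TOKEN}"},
        json={"queries": [{"query": query, "top_k": top_k}]},
    )
    resp.raise_for_status()
    return [chunk["text"] for chunk in resp.json()["results"][0]["results"]]

llm = Llama(model_path="models/llama-7b.ggml.bin")  # placeholder model path
question = "What documents mention quarterly revenue?"
context = "\n".join(retrieve(question))
out = llm(f"Context:\n{context}\n\nQuestion: {question}\nAnswer:", max_tokens=128)
print(out["choices"][0]["text"])
```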
Alternatives and similar repositories for llama-retrieval-plugin:
Users who are interested in llama-retrieval-plugin are comparing it to the repositories listed below.
- C++ implementation for 💫 StarCoder ⭐ 453 · Updated last year
- ⭐ 459 · Updated last year
- Extends the original llama.cpp repo to support the RedPajama model. ⭐ 117 · Updated 7 months ago
- ⭐ 405 · Updated 2 years ago
- Landmark Attention: Random-Access Infinite Context Length for Transformers QLoRA ⭐ 123 · Updated last year
- SoTA Transformers with C-backend for fast inference on your CPU. ⭐ 311 · Updated last year
- Reflexion: an autonomous agent with dynamic memory and self-reflection ⭐ 385 · Updated last year
- Run Alpaca LLM in LangChain ⭐ 218 · Updated last year
- A command-line interface to generate textual and conversational datasets with LLMs. ⭐ 296 · Updated last year
- This repository contains code for extending the Stanford Alpaca synthetic instruction tuning to existing instruction-tuned models such as… ⭐ 351 · Updated last year
- OpenAI-compatible Python client that can call any LLM ⭐ 370 · Updated last year
- A school for camelids ⭐ 1,209 · Updated last year
- ⭐ 535 · Updated last year
- Local LLM ReAct Agent with Guidance ⭐ 158 · Updated last year
- OpenAlpaca: A Fully Open-Source Instruction-Following Model Based On OpenLLaMA ⭐ 301 · Updated last year
- ⭐ 277 · Updated last year
- This project is an attempt to create a common metric to test LLMs for progress in eliminating hallucinations, which is the most serious c… ⭐ 222 · Updated 2 years ago
- LaMini-LM: A Diverse Herd of Distilled Models from Large-Scale Instructions ⭐ 821 · Updated last year
- Reimplementation of the task generation part from the Alpaca paper ⭐ 119 · Updated 2 years ago
- Command-line script for running inference with models such as falcon-7b-instruct ⭐ 76 · Updated last year
- React app implementing OpenAI and Google APIs to re-create behavior of the toolformer paper. ⭐ 233 · Updated 2 years ago
- Instruct-tuning LLaMA on consumer hardware ⭐ 66 · Updated 2 years ago
- fastLLaMa: An experimental high-performance framework for running Decoder-only LLMs with 4-bit quantization in Python using a C/C++ backe… ⭐ 410 · Updated last year
- ⭐ 218 · Updated 2 years ago
- C++ implementation for BLOOM ⭐ 809 · Updated last year
- ⭐ 269 · Updated 2 years ago
- Falcon LLM ggml framework with CPU and GPU support ⭐ 246 · Updated last year
- ⭐ 591 · Updated last year
- An Autonomous LLM Agent that runs on Wizcoder-15B ⭐ 335 · Updated 6 months ago
- Tune any FALCON in 4-bit ⭐ 466 · Updated last year