Artefact2 / llm-eval
A super simple web interface to perform blind tests on LLM outputs.
☆28 · Updated last year
Alternatives and similar repositories for llm-eval
Users interested in llm-eval are comparing it to the libraries listed below.
- Scripts to create your own MoE models using MLX ☆90 · Updated last year
- GGML implementation of the BERT model with Python bindings and quantization ☆56 · Updated last year
- Inference code for mixtral-8x7b-32kseqlen ☆101 · Updated last year
- Public reports detailing responses by Large Language Models to sets of prompts ☆31 · Updated 7 months ago
- An implementation of Self-Extend, which expands the context window via grouped attention ☆119 · Updated last year
- Implementation of Nougat that focuses on processing PDFs locally ☆81 · Updated 6 months ago
- ☆38 · Updated last year
- Embedding models from Jina AI ☆62 · Updated last year
- For running inference on and serving local LLMs using the MLX framework ☆107 · Updated last year
- Notus is a collection of LLMs fine-tuned using SFT, DPO, SFT+DPO, and/or other RLHF techniques, while always keeping a data-first app… ☆168 · Updated last year
- LLaVA server (llama.cpp) ☆181 · Updated last year
- An easy-to-understand framework for LLM samplers that rewind and revise generated tokens ☆140 · Updated 5 months ago
- Benchmarks comparing PyTorch and MLX on Apple Silicon GPUs ☆88 · Updated last year
- Tools for formatting large language model prompts ☆13 · Updated last year
- Inference of large multimodal models in C/C++ (LLaVA and others) ☆47 · Updated last year
- ☆37 · Updated last year
- Command-line script for running inference with models such as MPT-7B-Chat ☆100 · Updated 2 years ago
- ☆157 · Updated last year
- Client code examples, use cases, and benchmarks for the enterprise h2oGPTe RAG-based GenAI platform ☆87 · Updated last month
- Python bindings for ggml ☆142 · Updated 11 months ago
- Distributed inference for MLX LLMs ☆94 · Updated last year
- Web browser version of StarCoder.cpp ☆45 · Updated 2 years ago
- Port of Microsoft's BioGPT to C/C++ using ggml ☆87 · Updated last year
- Fast approximate inference on a single GPU with sparsity-aware offloading ☆38 · Updated last year
- A guidance compatibility layer for llama-cpp-python ☆35 · Updated last year
- ☆12 · Updated 10 months ago
- Full finetuning of large language models without large memory requirements ☆94 · Updated last year
- ☆115 · Updated 7 months ago
- Smart proxy for LLM APIs that enables model-specific parameter control, automatic mode switching (like Qwen3's /think and /no_think), and… ☆49 · Updated 2 months ago
- GRDN.AI app for garden optimization ☆70 · Updated last year