ctlllll/understanding_llm_benchmarks

ctlllll / understanding_llm_benchmarks

Understanding the correlation between different LLM benchmarks

☆30

Alternatives and similar repositories for understanding_llm_benchmarks

Users who are interested in understanding_llm_benchmarks are comparing it to the libraries listed below.

mlabonne / tinytuner
🐜🔧 A minimalistic tool to fine-tune your LLMs
☆19 · Aug 17, 2023 · Updated 2 years ago
gauss5930 / iDUS
An unofficial implementation of the SOLAR-10.7B model and the newly proposed interlocked-DUS (iDUS) implementation and experiment details.
☆14 · Mar 20, 2024 · Updated 2 years ago
fblgit / model-similarity
Simple Model Similarities Analysis
☆21 · Feb 3, 2024 · Updated 2 years ago
limenlp / safer-instruct
This is the official repository for "Safer-Instruct: Aligning Language Models with Automated Preference Data"
☆17 · Feb 22, 2024 · Updated 2 years ago
SmallDoges / small-datasets
Distill thinking dataset more compactly and accurately!
☆38 · Jun 6, 2025 · Updated last year
SebastianBodza / EnsembleForecasting
Using multiple LLMs for ensemble Forecasting
☆16 · Jan 17, 2024 · Updated 2 years ago
dmahan93 / lm-evaluation-harness
A framework for few-shot evaluation of autoregressive language models.
☆16 · Aug 23, 2023 · Updated 2 years ago
RuntianZ / adversarial-robustness-unlabeled
Adversarially Robust Generalization Just Requires More Unlabeled Data
☆11 · Aug 8, 2019 · Updated 6 years ago
Pleias / Various-Finetuning
Set of scripts to finetune LLMs
☆38 · Mar 30, 2024 · Updated 2 years ago
samchaineau / llm_slerp_generation
Repo hosting codes and materials related to speeding LLMs' inference using token merging.
☆37 · Oct 9, 2025 · Updated 9 months ago
zarakiquemparte / zaraki-tools
☆28 · Aug 30, 2023 · Updated 2 years ago
mikolajbadyl / flutter_leap_sdk
A Flutter plugin for integrating Liquid AI's LEAP SDK, enabling on-device deployment of small language models in Flutter applications.
☆24 · Sep 3, 2025 · Updated 10 months ago
mlabonne / chessllm
☆47 · Jan 24, 2024 · Updated 2 years ago
thuml / learn_torch.compile
torch.compile artifacts for common deep learning models, can be used as a learning resource for torch.compile
☆19 · Dec 22, 2023 · Updated 2 years ago
danaesavi / ImageChain
This repository is associated with the research paper titled ImageChain: Advancing Sequential Image-to-Text Reasoning in Multimodal Large…
☆15 · Jun 4, 2025 · Updated last year
lil-lab / cb2
An NLP research and data collection platform.
☆17 · Jul 4, 2026 · Updated 3 weeks ago
jbilcke-hf / template-node-wizardcoder-express
A minimalist Docker project to help people getting started with Node, WizardCoder, CTransformers, Python, Express and TypeScript. Ready t…
☆14 · Jun 23, 2023 · Updated 3 years ago
cloneofsimo / fim-llama-deepspeed
☆33 · Jan 1, 2024 · Updated 2 years ago
jetfontanilla / azure-viseme-json
Example code on how to generate viseme json
☆14 · Feb 23, 2023 · Updated 3 years ago
Alignment-Lab-AI / Dataset-Conversion-Toolkit
a set of scripts to easily convert all training data from huggingface into alpaca instruct or sharegpt format, which should allow for eas…
☆20 · Mar 14, 2025 · Updated last year
ErikKaum / runner
Experimental wasm32-unknown-wasi runtime for Python code execution
☆40 · Nov 28, 2024 · Updated last year
immortal3 / KV-picoGPT
An unnecessarily tiny and minimal implementation of GPT-2 in NumPy.
☆11 · Feb 12, 2023 · Updated 3 years ago
DEFI-COLaF / LADaS
Layout Analysis Dataset with Segmonto (LADaS)
☆25 · May 29, 2026 · Updated last month
JiazhengZhang / AgentV-RL
☆15 · Apr 17, 2026 · Updated 3 months ago
facebookresearch / Data_Acquisition_for_ML_Benchmark
DAM Data Acquisition for ML Benchmark, as part of the DataPerf benchmark suite, https://dataperf.org/
☆26 · May 25, 2023 · Updated 3 years ago
janphilippfranken / sami
Self-Supervised Alignment with Mutual Information
☆20 · May 24, 2024 · Updated 2 years ago
vibrantlabsai / Funtuner
Supervised instruction finetuning for LLM with HF trainer and Deepspeed
☆37 · Jul 6, 2023 · Updated 3 years ago
kuzudb / dspy-kuzu-demo
Intro to using DSPy with Kuzu to enrich the data within the Nobel Laureate mentorship network
☆16 · Sep 16, 2025 · Updated 10 months ago
Nicolas-Yax / PhyloLM
Genetics for Language Models
☆18 · Jul 1, 2024 · Updated 2 years ago
arcee-ai / DAM
☆56 · Nov 6, 2024 · Updated last year
r-three / mats
☆33 · Jul 8, 2024 · Updated 2 years ago
EQ-bench / EQ-Bench
A benchmark for emotional intelligence in large language models
☆444 · Jul 26, 2024 · Updated 2 years ago
SumanthRH / tokenization
A comprehensive deep dive into the world of tokens
☆231 · Jun 24, 2024 · Updated 2 years ago
lyuchenyang / Document-level-Sentiment-Analysis-with-User-and-Product-Context
Code for COLING 2020 paper "Improving Document-level Sentiment Analysis with User and Product Context"
☆11 · Apr 13, 2022 · Updated 4 years ago
jupyter / try-jupyter
A JupyterLite deployment to try JupyterLab, Jupyter Notebook and IPython in the browser
☆13 · Jul 2, 2026 · Updated 3 weeks ago
tanaymeh / mamba-train
A single repo with all scripts and utils to train / fine-tune the Mamba model with or without FIM
☆63 · Apr 8, 2024 · Updated 2 years ago
vita-epfl / motion-style-transfer
[CoRL22] Motion Style Transfer: Modular Low-Rank Adaptation for Deep Motion Forecasting
☆22 · Dec 6, 2022 · Updated 3 years ago
imoneoi / multipack
Multipack distributed sampler for fast padding-free training of LLMs
☆207 · Aug 10, 2024 · Updated last year
jwjohns / LFM2Sloth
Modular task agnostic training pipeline using LFM2 from Liquid AI with unsloth.
☆16 · Sep 13, 2025 · Updated 10 months ago