TheDuckAI/arb

Advanced Reasoning Benchmark Dataset for LLMs

☆48

Alternatives and similar repositories for arb

Users that are interested in arb are comparing it to the libraries listed below.

EleutherAI / pile_dedupe
Pile Deduplication Code
☆18 · May 15, 2023 · Updated 3 years ago
r-three / RAD
Reference implementation for Reward-Augmented Decoding: Efficient Controlled Text Generation With a Unidirectional Reward Model
☆45 · Oct 1, 2025 · Updated 9 months ago
opencog / spacetime
Save, track and query 3D+time locations of objects in the AtomSpace
☆16 · Apr 3, 2025 · Updated last year
kaistAI / FLASK
[ICLR 2024 Spotlight] FLASK: Fine-grained Language Model Evaluation based on Alignment Skill Sets
☆218 · Dec 24, 2023 · Updated 2 years ago
runame / laplace-refinement
Posterior Refinement Improves Sample Efficiency in Bayesian Neural Networks
☆11 · Oct 21, 2022 · Updated 3 years ago
RUCBM / LeaF
☆14 · Nov 2, 2025 · Updated 8 months ago
IBM / fmwork
Tools and pipelines for automated LLM performance evaluation
☆15 · May 20, 2026 · Updated 2 months ago
salesforce / factualNLG
Code for the arXiv paper: "LLMs as Factual Reasoners: Insights from Existing Benchmarks and Beyond"
☆60 · Jun 2, 2026 · Updated last month
kyegomez / MobileVLM
Implementation of the LDP module block in PyTorch and Zeta from the paper: "MobileVLM: A Fast, Strong and Open Vision Language Assistant …
☆15 · Mar 11, 2024 · Updated 2 years ago
The-Swarm-Corporation / AgentParse
AgentParse is a high-performance parsing library designed to map various structured data formats (such as Pydantic models, JSON, YAML, an…
☆18 · Oct 13, 2025 · Updated 9 months ago
The-Swarm-Corporation / Brainwave
Brainwave is a state-of-the-art neural decoder that transforms electroencephalogram (EEG) and brain signals into multimodal outputs inclu…
☆14 · Oct 6, 2025 · Updated 9 months ago
kyegomez / SoundStream
Implementation of SoundStream from the paper: "SoundStream: An End-to-End Neural Audio Codec"
☆13 · Jan 27, 2025 · Updated last year
kyegomez / Pairformer
Implementation of the Pairformer model used in AlphaFold 3
☆14 · Jul 13, 2026 · Updated last week
kevinyaobytedance / llm_eval
LLM evaluation.
☆16 · Nov 7, 2023 · Updated 2 years ago
casmlab / NPHardEval
Repository for NPHardEval, a quantified-dynamic benchmark of LLMs
☆64 · Mar 26, 2024 · Updated 2 years ago
kyegomez / TinyGPTV
Simple Implementation of TinyGPTV in super simple Zeta lego blocks
☆16 · Nov 11, 2024 · Updated last year
kyegomez / OmniByteFormer
OmniByteFormer is a generalized Transformer model that can process any type of data by converting it into byte sequences, bypassing tradi…
☆15 · Jul 13, 2026 · Updated last week
ninyawee / pythainav
A Pythonic interface to pull Thai mutual fund NAV
☆16 · Jun 5, 2026 · Updated last month
kyegomez / Audio-xLSTMs
Implementation of "Audio xLSTMs: Learning Self-supervised audio representations with xLSTMs" in PyTorch
☆19 · Jul 13, 2026 · Updated last week
DT6A / ReBRAC
Author's implementation of ReBRAC, a minimalist improvement upon TD3+BC
☆19 · Oct 22, 2023 · Updated 2 years ago
Helsinki-NLP / OPUS-MT-testsets
Benchmarks for evaluating MT models
☆11 · Jun 26, 2024 · Updated 2 years ago
snap-research / LargeGT
Graph Transformers for Large Graphs
☆22 · Apr 26, 2024 · Updated 2 years ago
kyegomez / TeraGPT
Train a production-grade GPT in less than 400 lines of code. Better than Karpathy's version and GIGAGPT
☆17 · Jul 13, 2026 · Updated last week
BaohaoLiao / RSD
[ICML 2025] Reward-guided Speculative Decoding (RSD) for efficiency and effectiveness.
☆56 · May 2, 2025 · Updated last year
asahi417 / relbert
The official implementation of "Distilling Relation Embeddings from Pre-trained Language Models, EMNLP 2021 main conference", a high-qual…
☆48 · Dec 2, 2024 · Updated last year
ZackBradshaw / ikigAI
☆13 · Mar 28, 2024 · Updated 2 years ago
kyegomez / CogNetX
CogNetX is an advanced, multimodal neural network architecture inspired by human cognition. It integrates speech, vision, and video proce…
☆20 · Updated this week
tongzhou21 / Oasis
☆23 · Aug 7, 2023 · Updated 2 years ago
facebookresearch / BELA
Bi-encoder entity linking architecture
☆52 · Sep 10, 2024 · Updated last year
zorazrw / odex
[EMNLP'23] Execution-Based Evaluation for Open Domain Code Generation
☆49 · Dec 22, 2023 · Updated 2 years ago
kyegomez / Tiktokx
Tiktok is an advanced multimedia recommender system that fuses the generative modality-aware collaborative self-augmentation and contrast…
☆14 · Aug 18, 2023 · Updated 2 years ago
kyegomez / HSSS
Implementation of a Hierarchical Mamba as described in the paper: "Hierarchical State Space Models for Continuous Sequence-to-Sequence Mo…
☆16 · Nov 11, 2024 · Updated last year
kyegomez / autogpt-tot
Simple Autogpt with tree of thoughts
☆14 · May 25, 2023 · Updated 3 years ago
zjunlp / DocED
[ACL 2021] MLBiNet: A Cross-Sentence Collective Event Detection Network
☆35 · Jan 10, 2022 · Updated 4 years ago
kyegomez / EAOT
The open source implementation of "Connecting Large Language Models with Evolutionary Algorithms Yields Powerful Prompt Optimizers"
☆19 · Mar 11, 2024 · Updated 2 years ago
kyegomez / forest-of-thoughts
A forest of autonomous agents.
☆20 · Jan 27, 2025 · Updated last year
Zce1112zslx / IKE
☆41 · Nov 30, 2023 · Updated 2 years ago
akoksal / LongForm
Reverse Instructions to generate instruction tuning data with corpus examples
☆215 · Mar 5, 2024 · Updated 2 years ago
codetlingua / codetlingua
☆18 · Apr 15, 2024 · Updated 2 years ago