marquisdepolis / LOOP-Evals

Logical Operations On Puzzles: Simple Iterative Reasoning Tests for LLMs, starting with word grids

☆16

Related projects

Alternatives and complementary repositories for LOOP-Evals

EleutherAI / features-across-time
Understanding how features learned by neural networks evolve throughout training
☆31 Updated 3 weeks ago
kumar-shridhar / Screws
SCREWS: A Modular Framework for Reasoning with Revisions
☆26 Updated last year
weaviate / biggraph-wikidata-search-with-weaviate
Search through Facebook Research's PyTorch BigGraph Wikidata-dataset with the Weaviate vector search engine
☆31 Updated 2 years ago
ExtensityAI / benchmark
Evaluation of neuro-symbolic engines
☆33 Updated 3 months ago
EleutherAI / mdl
Minimum Description Length probing for neural network representations
☆16 Updated last week
krypticmouse / matryoshka-representation-learning
PyTorch implementation for MRL
☆18 Updated 9 months ago
tomerwolgithub / question-decomposition-to-sql
Weakly Supervised Text-to-SQL Parsing through Question Decomposition
☆22 Updated last year
Knowledgator / utca
Versatile framework designed to streamline the integration of your models, as well as those sourced from Hugging Face, into complex progr…
☆23 Updated 3 months ago
EleutherAI / rnngineering
Engineering the state of RNN language models (Mamba, RWKV, etc.)
☆32 Updated 5 months ago
allenai / EmbeddingRecycling
Embedding Recycling for Language models
☆38 Updated last year
alon-albalak / FLAD
Few-shot Learning with Auxiliary Data
☆26 Updated 11 months ago
google-research-datasets / swim-ir
SWIM-IR is a Synthetic Wikipedia-based Multilingual Information Retrieval training set with 28 million query-passage pairs spanning 33 la…
☆44 Updated last year
oughtinc / primer
Factored Cognition Primer: How to write compositional language model programs
☆48 Updated last year
DerwenAI / textgraphs
TextGraphs + LLMs + graph ML for entity extraction, linking, ranking, and constructing a lemma graph
☆20 Updated 8 months ago
para-lost / ReBase
ReBase: Training Task Experts through Retrieval Based Distillation
☆27 Updated 4 months ago
benpry / chain-of-thought-metaphor
This repo contains code for the paper "Psychologically-informed chain-of-thought prompts for metaphor understanding in large language mod…
☆14 Updated last year
gonglinyuan / metro_t0
Code repo for "Model-Generated Pretraining Signals Improves Zero-Shot Generalization of Text-to-Text Transformers" (ACL 2023)
☆22 Updated last year
plastic-labs / dspy-opentom
Exploration using DSPy to optimize modules to maximize performance on the OpenToM dataset
☆13 Updated 8 months ago
Zyphra / zcookbook
Training hybrid models for dummies.
☆15 Updated 3 weeks ago
bilal-chughtai / rep-theory-mech-interp
☆26 Updated last year
TristanThrush / i-am-a-strange-dataset
Repository for "I am a Strange Dataset: Metalinguistic Tests for Language Models"
☆39 Updated 10 months ago
allenai / SciRIFF
Dataset and evaluation suite enabling LLM instruction-following for scientific literature understanding.
☆28 Updated 2 weeks ago
allenai / smashed
SMASHED is a toolkit designed to apply transformations to samples in datasets, such as fields extraction, tokenization, prompting, batchi…
☆31 Updated 5 months ago
IBM / model-recycling
Ranking of fine-tuned HF models as base models.
☆35 Updated last year
YuchenJin / llm.c
LLM training in simple, raw C/CUDA
☆12 Updated last month
jmerullo / lm_vector_arithmetic
☆28 Updated last year
MurtyShikhar / LanguagePatching
Code for our EMNLP '22 paper "Fixing Model Bugs with Natural Language Patches"
☆19 Updated last year
ltgoslo / gpt-bert
Official implementation of "GPT or BERT: why not both?"
☆36 Updated last week
allenai / sso
Repository for Skill Set Optimization
☆12 Updated 3 months ago
sher222 / LeReT
Learning to Retrieve by Trying - Source code for Grounding by Trying: LLMs with Reinforcement Learning-Enhanced Retrieval
☆24 Updated 3 weeks ago