marquisdepolis / LOOP-Evals
Logical Operations On Puzzles: Simple Iterative Reasoning Tests for LLMs, starting with wordgrids
☆16 · Updated 3 months ago
Related projects
Alternatives and complementary repositories for LOOP-Evals
- Understanding how features learned by neural networks evolve throughout training ☆31 · Updated 3 weeks ago
- SCREWS: A Modular Framework for Reasoning with Revisions ☆26 · Updated last year
- Search through Facebook Research's PyTorch BigGraph Wikidata-dataset with the Weaviate vector search engine ☆31 · Updated 2 years ago
- Evaluation of neuro-symbolic engines ☆33 · Updated 3 months ago
- Minimum Description Length probing for neural network representations ☆16 · Updated last week
- PyTorch implementation for MRL ☆18 · Updated 9 months ago
- Weakly Supervised Text-to-SQL Parsing through Question Decomposition ☆22 · Updated last year
- Versatile framework designed to streamline the integration of your models, as well as those sourced from Hugging Face, into complex progr… ☆23 · Updated 3 months ago
- Engineering the state of RNN language models (Mamba, RWKV, etc.) ☆32 · Updated 5 months ago
- Embedding Recycling for Language models ☆38 · Updated last year
- Few-shot Learning with Auxiliary Data ☆26 · Updated 11 months ago
- SWIM-IR is a Synthetic Wikipedia-based Multilingual Information Retrieval training set with 28 million query-passage pairs spanning 33 la… ☆44 · Updated last year
- Factored Cognition Primer: How to write compositional language model programs ☆48 · Updated last year
- TextGraphs + LLMs + graph ML for entity extraction, linking, ranking, and constructing a lemma graph ☆20 · Updated 8 months ago
- ReBase: Training Task Experts through Retrieval Based Distillation ☆27 · Updated 4 months ago
- This repo contains code for the paper "Psychologically-informed chain-of-thought prompts for metaphor understanding in large language mod… ☆14 · Updated last year
- Code repo for "Model-Generated Pretraining Signals Improves Zero-Shot Generalization of Text-to-Text Transformers" (ACL 2023) ☆22 · Updated last year
- Exploration using DSPy to optimize modules to maximize performance on the OpenToM dataset ☆13 · Updated 8 months ago
- Training hybrid models for dummies. ☆15 · Updated 3 weeks ago
- Repository for "I am a Strange Dataset: Metalinguistic Tests for Language Models" ☆39 · Updated 10 months ago
- Dataset and evaluation suite enabling LLM instruction-following for scientific literature understanding. ☆28 · Updated 2 weeks ago
- SMASHED is a toolkit designed to apply transformations to samples in datasets, such as fields extraction, tokenization, prompting, batchi… ☆31 · Updated 5 months ago
- Ranking of fine-tuned HF models as base models. ☆35 · Updated last year
- LLM training in simple, raw C/CUDA ☆12 · Updated last month
- Code for our EMNLP '22 paper "Fixing Model Bugs with Natural Language Patches" ☆19 · Updated last year
- Official implementation of "GPT or BERT: why not both?" ☆36 · Updated last week
- Repository for Skill Set Optimization ☆12 · Updated 3 months ago
- Learning to Retrieve by Trying - Source code for Grounding by Trying: LLMs with Reinforcement Learning-Enhanced Retrieval ☆24 · Updated 3 weeks ago