strangeloopcanon / LOOP-Evals
Logical Operations On Puzzles: simple iterative reasoning tests for LLMs, starting with word grids
☆17Updated 4 months ago
Alternatives and similar repositories for LOOP-Evals
Users who are interested in LOOP-Evals are comparing it to the repositories listed below
- Code and files for the paper "Are Emergent Abilities in Large Language Models just In-Context Learning?"☆33Updated 5 months ago
- SCREWS: A Modular Framework for Reasoning with Revisions☆27Updated last year
- EMNLP 2024 "Re-reading improves reasoning in large language models". Simply repeating the question to get bidirectional understanding for…☆26Updated 6 months ago
- Minimum Description Length probing for neural network representations☆18Updated 5 months ago
- This repo contains code for the paper "Psychologically-informed chain-of-thought prompts for metaphor understanding in large language mod…☆14Updated 2 years ago
- Code repo for "Model-Generated Pretraining Signals Improves Zero-Shot Generalization of Text-to-Text Transformers" (ACL 2023)☆22Updated last year
- PyTorch implementation for MRL☆18Updated last year
- Search through Facebook Research's PyTorch BigGraph Wikidata-dataset with the Weaviate vector search engine☆31Updated 3 years ago
- Embedding Recycling for Language models☆38Updated last year
- Assign color hues to a collection of text fragments based on embeddings☆20Updated last year
- A library for squeakily cleaning and filtering language datasets.☆47Updated last year
- Script for processing OpenAI's PRM800K process supervision dataset into an Alpaca-style instruction-response format☆27Updated last year
- ☆44Updated 7 months ago
- Engineering the state of RNN language models (Mamba, RWKV, etc.)☆32Updated last year
- SWIM-IR is a Synthetic Wikipedia-based Multilingual Information Retrieval training set with 28 million query-passage pairs spanning 33 la…☆48Updated last year
- ☆14Updated last year
- Documentation effort for the BookCorpus dataset☆34Updated 4 years ago
- One stop shop for all things carp☆59Updated 2 years ago
- ☆35Updated 2 years ago
- ☆31Updated last year
- ☆18Updated last year
- Dataset and evaluation suite enabling LLM instruction-following for scientific literature understanding.☆40Updated 3 months ago
- A dataset of alignment research and code to reproduce it☆77Updated 2 years ago
- This repository contains the ToolSelect dataset which was used to fine-tune Llama-2 70B for tool selection.☆20Updated last year
- Finding semantically meaningful and accurate prompts.☆46Updated last year
- A framework for pitting LLMs against each other in an evolving library of games ⚔☆32Updated 2 months ago
- ☆61Updated last year
- Understanding how features learned by neural networks evolve throughout training☆35Updated 8 months ago
- Exploration using DSPy to optimize modules to maximize performance on the OpenToM dataset☆16Updated last year
- ☆24Updated 9 months ago