kilian-group / phantom-wiki
Python package for generating datasets to evaluate reasoning and retrieval of large language models
☆17Updated this week
Alternatives and similar repositories for phantom-wiki:
Users who are interested in phantom-wiki are comparing it to the libraries listed below
- Aioli: A unified optimization framework for language model data mixing☆23Updated 3 months ago
- Anchored Preference Optimization and Contrastive Revisions: Addressing Underspecification in Alignment☆55Updated 7 months ago
- Repository for Skill Set Optimization☆12Updated 8 months ago
- Code for "Accelerating Training with Neuron Interaction and Nowcasting Networks" [to appear at ICLR 2025]☆18Updated last month
- Official Repo for InSTA: Towards Internet-Scale Training For Agents☆25Updated this week
- Implementation of Dualformer☆15Updated last month
- Efficient Scaling laws and collaborative pretraining.☆16Updated 2 months ago
- ☆17Updated 6 months ago
- ReBase: Training Task Experts through Retrieval Based Distillation☆29Updated 2 months ago
- A testbed for agents and environments that can automatically improve models through data generation.☆23Updated last month
- Code for the paper: CodeTree: Agent-guided Tree Search for Code Generation with Large Language Models☆18Updated 3 weeks ago
- Engineering the state of RNN language models (Mamba, RWKV, etc.)☆32Updated 10 months ago
- ☆27Updated last week
- Exploration using DSPy to optimize modules to maximize performance on the OpenToM dataset☆16Updated last year
- This is the official repository for "Safer-Instruct: Aligning Language Models with Automated Preference Data"☆17Updated last year
- PyTorch implementation for MRL☆18Updated last year
- PyTorch Implementation of the paper "MM1: Methods, Analysis & Insights from Multimodal LLM Pre-training"☆23Updated last week
- Script for processing OpenAI's PRM800K process supervision dataset into an Alpaca-style instruction-response format☆27Updated last year
- ☆41Updated 2 weeks ago
- ☆33Updated 10 months ago
- A Data Source for Reasoning Embodied Agents☆19Updated last year
- NeurIPS 2024 tutorial on LLM Inference☆41Updated 4 months ago
- ☆24Updated 7 months ago
- Reference implementation for Reward-Augmented Decoding: Efficient Controlled Text Generation With a Unidirectional Reward Model☆43Updated last year
- Embroid: Unsupervised Prediction Smoothing Can Improve Few-Shot Classification☆11Updated last year
- NeurIPS 2023 - Cappy: Outperforming and Boosting Large Multi-Task LMs with a Small Scorer☆42Updated last year
- Exploring limitations of LLM-as-a-judge☆16Updated 8 months ago
- Code and Dataset for Learning to Solve Complex Tasks by Talking to Agents☆24Updated 2 years ago
- Minimum Description Length probing for neural network representations☆19Updated 2 months ago
- ☆27Updated 3 weeks ago