shreyashankar / spade-experiments
Experiments to assess SPADE on different LLM pipelines.
☆17 Updated last year
Alternatives and similar repositories for spade-experiments
Users who are interested in spade-experiments are comparing it to the libraries listed below.
- Code for the paper: CodeTree: Agent-guided Tree Search for Code Generation with Large Language Models ☆30 Updated 9 months ago
- NeurIPS 2023 - Cappy: Outperforming and Boosting Large Multi-Task LMs with a Small Scorer ☆45 Updated last year
- Mixing Language Models with Self-Verification and Meta-Verification ☆111 Updated last year
- ☆55 Updated last year
- Code repository for the paper - "AdANNS: A Framework for Adaptive Semantic Search" ☆66 Updated 2 years ago
- ReLM is a Regular Expression engine for Language Models ☆107 Updated 2 years ago
- Aioli: A unified optimization framework for language model data mixing ☆31 Updated 11 months ago
- Implementation of the paper: "AssistantBench: Can Web Agents Solve Realistic and Time-Consuming Tasks?" ☆66 Updated last year
- ☆20 Updated 2 weeks ago
- Finding semantically meaningful and accurate prompts. ☆48 Updated 2 years ago
- Understanding the correlation between different LLM benchmarks ☆29 Updated last year
- AskIt: Unified programming interface for programming with LLMs (GPT-3.5, GPT-4, Gemini, Claude, Cohere, Llama 2) ☆79 Updated 11 months ago
- ☆28 Updated 8 months ago
- A library for simplifying training with multi gpu setups in the HuggingFace / PyTorch ecosystem. ☆16 Updated 3 weeks ago
- [ICML 2023] "Outline, Then Details: Syntactically Guided Coarse-To-Fine Code Generation", Wenqing Zheng, S P Sharan, Ajay Kumar Jaiswal, … ☆43 Updated 2 years ago
- A repository for research on medium sized language models. ☆77 Updated last year
- Advanced Reasoning Benchmark Dataset for LLMs ☆47 Updated 2 years ago
- Small, simple agent task environments for training and evaluation ☆19 Updated last year
- Reference implementation for Reward-Augmented Decoding: Efficient Controlled Text Generation With a Unidirectional Reward Model ☆45 Updated 3 months ago
- AI Evaluation Platform ☆47 Updated 7 months ago
- ☆26 Updated 2 years ago
- A framework for pitting LLMs against each other in an evolving library of games ⚔ ☆34 Updated 8 months ago
- Measuring and Controlling Persona Drift in Language Model Dialogs ☆20 Updated last year
- A public implementation of the ReLoRA pretraining method, built on Lightning-AI's Pytorch Lightning suite. ☆35 Updated last year
- ☆48 Updated 5 months ago
- [FORGE 2025] Graph-based method for end-to-end code completion with context awareness on repository ☆71 Updated last year
- ☆80 Updated 9 months ago
- ☆49 Updated 8 months ago
- ☆49 Updated last year
- CodeUltraFeedback: aligning large language models to coding preferences (TOSEM 2025) ☆73 Updated last year