mkuchnik / relm
ReLM is a Regular Expression engine for Language Models
☆104 Updated last year
Alternatives and similar repositories for relm:
Users who are interested in relm are comparing it to the libraries listed below:
- Mixing Language Models with Self-Verification and Meta-Verification ☆104 Updated 4 months ago
- Comprehensive analysis of difference in performance of QLora, Lora, and Full Finetunes. ☆82 Updated last year
- Evaluating LLMs with CommonGen-Lite ☆90 Updated last year
- Just a bunch of benchmark logs for different LLMs ☆119 Updated 9 months ago
- Functional Benchmarks and the Reasoning Gap ☆85 Updated 7 months ago
- experiments with inference on llama ☆104 Updated 11 months ago
- ModuleFormer is a MoE-based architecture that includes two different types of experts: stick-breaking attention heads and feedforward exp… ☆220 Updated last year
- Chat Markup Language conversation library ☆55 Updated last year
- Retrieval Augmented Generation Generalized Evaluation Dataset ☆53 Updated 5 months ago
- 📝 Reference-Free automatic summarization evaluation with potential hallucination detection ☆100 Updated last year
- ☆73 Updated last year
- Doing simple retrieval from LLM models at various context lengths to measure accuracy ☆99 Updated last year
- Code repository for the c-BTM paper ☆106 Updated last year
- Experiments on speculative sampling with Llama models ☆125 Updated last year
- Fast & more realistic evaluation of chat language models. Includes leaderboard. ☆186 Updated last year
- ☆24 Updated last year
- A library for benchmarking the Long Term Memory and Continual learning capabilities of LLM based agents. With all the tests and code you… ☆69 Updated 4 months ago
- Multi-Domain Expert Learning ☆67 Updated last year
- The code for the paper ROUTERBENCH: A Benchmark for Multi-LLM Routing System ☆118 Updated 10 months ago
- A set of utilities for running few-shot prompting experiments on large-language models ☆120 Updated last year
- Official repo for the paper PHUDGE: Phi-3 as Scalable Judge. Evaluate your LLMs with or without custom rubric, reference answer, absolute… ☆49 Updated 9 months ago
- AskIt: Unified programming interface for programming with LLMs (GPT-3.5, GPT-4, Gemini, Claude, Cohere, Llama 2) ☆78 Updated 3 months ago
- Zeus LLM Trainer is a rewrite of Stanford Alpaca aiming to be the trainer for all Large Language Models ☆69 Updated last year
- Advanced Reasoning Benchmark Dataset for LLMs ☆45 Updated last year
- A framework for evaluating function calls made by LLMs ☆37 Updated 9 months ago
- Code repo for "Agent Instructs Large Language Models to be General Zero-Shot Reasoners" ☆107 Updated 7 months ago
- [ICLR 2024] Skeleton-of-Thought: Prompting LLMs for Efficient Parallel Generation ☆167 Updated last year
- Small and Efficient Mathematical Reasoning LLMs ☆71 Updated last year
- Evaluating LLMs with fewer examples ☆151 Updated last year
- Truth Forest: Toward Multi-Scale Truthfulness in Large Language Models through Intervention without Tuning ☆46 Updated last year