lunary-ai / llm-benchmarks
LLM benchmarks
☆13 Updated 8 months ago
Related projects
Alternatives and complementary repositories for llm-benchmarks
- Track the progress of LLM context utilisation ☆53 Updated 4 months ago
- Implementation of SelfExtend from the paper "LLM Maybe LongLM: Self-Extend LLM Context Window Without Tuning" in PyTorch and Zeta ☆13 Updated last week
- Public Inflection Benchmarks ☆69 Updated 8 months ago
- Reasoning by Communicating with Agents ☆21 Updated last month
- ☆57 Updated 11 months ago
- ☆22 Updated last year
- Understanding the correlation between different LLM benchmarks ☆29 Updated 10 months ago
- Score LLM pretraining data with classifiers ☆54 Updated last year
- Zeus LLM Trainer is a rewrite of Stanford Alpaca aiming to be the trainer for all Large Language Models ☆69 Updated last year
- Advanced Reasoning Benchmark Dataset for LLMs ☆45 Updated last year
- [ICML 2023] "Outline, Then Details: Syntactically Guided Coarse-To-Fine Code Generation", Wenqing Zheng, S P Sharan, Ajay Kumar Jaiswal, … ☆37 Updated last year
- Chat Markup Language conversation library ☆54 Updated 10 months ago
- Script for processing OpenAI's PRM800K process supervision dataset into an Alpaca-style instruction-response format ☆27 Updated last year
- Using multiple LLMs for ensemble forecasting ☆16 Updated 10 months ago
- A repository of projects and datasets under active development by Alignment Lab AI ☆22 Updated 10 months ago
- ☆40 Updated last month
- ☆21 Updated 11 months ago
- OVALChat is a customizable web app aimed at conducting user studies with chatbots ☆27 Updated 10 months ago
- Learning to Retrieve by Trying - Source code for Grounding by Trying: LLMs with Reinforcement Learning-Enhanced Retrieval ☆24 Updated 3 weeks ago
- Implementation of Spectral State Space Models ☆17 Updated 8 months ago
- NeurIPS 2023 - Cappy: Outperforming and Boosting Large Multi-Task LMs with a Small Scorer ☆37 Updated 7 months ago
- LLMs as Collaboratively Edited Knowledge Bases ☆43 Updated 9 months ago
- Q-Probe: A Lightweight Approach to Reward Maximization for Language Models ☆37 Updated 5 months ago
- A library for simplifying fine-tuning with multi-GPU setups in the Hugging Face ecosystem. ☆15 Updated 3 weeks ago
- ☆41 Updated 2 weeks ago
- ☆43 Updated last year
- ☆36 Updated 3 months ago
- In-Context Alignment: Chat with Vanilla Language Models Before Fine-Tuning ☆33 Updated last year
- Demonstration that fine-tuning a RoPE model on sequences longer than those used in pre-training adapts the model's context limit ☆63 Updated last year