InflectionAI / Inflection-Benchmarks
Public Inflection Benchmarks
☆69 Updated 6 months ago
Related projects:
- Code repository for the c-BTM paper ☆105 Updated 11 months ago
- ☆77 Updated 3 weeks ago
- Experiments for efforts to train a new and improved t5 ☆76 Updated 5 months ago
- The GitHub repo for Goal Driven Discovery of Distributional Differences via Language Descriptions ☆68 Updated last year
- ☆29 Updated 2 weeks ago
- ☆55 Updated 9 months ago
- Functional Benchmarks and the Reasoning Gap ☆74 Updated last month
- Evaluating LLMs with CommonGen-Lite ☆83 Updated 5 months ago
- 🌍 Repository for "AppWorld: A Controllable World of Apps and People for Benchmarking Interactive Coding Agent", ACL'24 Best Resource Pap… ☆81 Updated last month
- Codebase accompanying the Summary of a Haystack paper. ☆65 Updated 2 months ago
- Just a bunch of benchmark logs for different LLMs ☆112 Updated last month
- ☆38 Updated 5 months ago
- ☆40 Updated 4 months ago
- Anchored Preference Optimization and Contrastive Revisions: Addressing Underspecification in Alignment ☆39 Updated 2 weeks ago
- Positional Skip-wise Training for Efficient Context Window Extension of LLMs to Extreme Length (ICLR 2024) ☆195 Updated 3 months ago
- Multi-Domain Expert Learning ☆67 Updated 7 months ago
- A repository for research on medium sized language models. ☆71 Updated 3 months ago
- Small and Efficient Mathematical Reasoning LLMs ☆69 Updated 7 months ago
- Small, simple agent task environments for training and evaluation ☆13 Updated last week
- Language models scale reliably with over-training and on downstream tasks ☆91 Updated 5 months ago
- Code for Zero-Shot Tokenizer Transfer ☆109 Updated 2 months ago
- ☆48 Updated 11 months ago
- Scripts for generating synthetic finetuning data for reducing sycophancy. ☆105 Updated last year
- Code for the arXiv preprint "The Unreasonable Effectiveness of Easy Training Data" ☆44 Updated 8 months ago
- Can Language Models Solve Olympiad Programming? ☆92 Updated last month
- Mixing Language Models with Self-Verification and Meta-Verification ☆96 Updated 10 months ago
- ☆50 Updated last month
- Data preparation code for Amber 7B LLM ☆76 Updated 4 months ago
- "Improving Mathematical Reasoning with Process Supervision" by OpenAI ☆55 Updated last week
- Experiments on speculative sampling with Llama models ☆114 Updated last year