tinybirdco / llm-benchmark

We assessed the ability of popular LLMs to generate accurate and efficient SQL from natural-language prompts. Using a 200-million-record dataset from the GH Archive uploaded to Tinybird, we asked the LLMs to generate SQL based on 50 prompts.
44 · Updated 2 weeks ago

Alternatives and similar repositories for llm-benchmark

Users who are interested in llm-benchmark are comparing it to the libraries listed below.
