ntunlp / OpenSource-LLMs-better-than-OpenAI

A list of all reported open-source LLMs that achieve a higher score than proprietary, paid OpenAI models (ChatGPT, GPT-4).

☆70

Alternatives and similar repositories for OpenSource-LLMs-better-than-OpenAI

Users who are interested in OpenSource-LLMs-better-than-OpenAI are comparing it to the repositories listed below.

LLM360 / Analysis360
Open Implementations of LLM Analyses
☆105 Updated 9 months ago
Re-Align / just-eval
A simple GPT-based evaluation tool for multi-aspect, interpretable assessment of LLMs.
☆85 Updated last year
Anni-Zou / Meta-CoT
Meta-CoT: Generalizable Chain-of-Thought Prompting in Mixed-task Scenarios with Large Language Models
☆97 Updated last year
GasolSun36 / Iter-CoT
[NAACL 2024] Enhancing Chain-of-Thoughts Prompting with Iterative Bootstrapping in Large Language Models
☆85 Updated last year
GAIR-NLP / Entropy-ABF
Official implementation for 'Extending LLMs’ Context Window with 100 Samples'
☆79 Updated last year
kyegomez / Algorithm-Of-Thoughts
My implementation of "Algorithm of Thoughts: Enhancing Exploration of Ideas in Large Language Models"
☆98 Updated last year
GAIR-NLP / scaleeval
Scalable Meta-Evaluation of LLMs as Evaluators
☆42 Updated last year
xlang-ai / batch-prompting
[EMNLP 2023 Industry Track] A simple prompting approach that enables the LLMs to run inference in batches.
☆74 Updated last year
18907305772 / FuseAI
FuseAI Project
☆87 Updated 5 months ago
THU-KEG / ChatLog
⏳ ChatLog: Recording and Analysing ChatGPT Across Time
☆100 Updated last year
shizhediao / Post-Training-Data-Flywheel
We aim to provide the best references to search, select, and synthesize high-quality and large-quantity data for post-training your LLMs.
☆57 Updated 9 months ago
zorazrw / awesome-tool-llm
☆234 Updated 11 months ago
HowieHwong / MetaTool
[ICLR 2024] MetaTool Benchmark for Large Language Models: Deciding Whether to Use Tools and Which to Use
☆89 Updated last year
lifan-yuan / CRAFT
Code for ICLR 2024 paper "CRAFT: Customizing LLMs by Creating and Retrieving from Specialized Toolsets"
☆57 Updated last year
sambanova / toolbench
ToolBench, an evaluation suite for LLM tool manipulation capabilities.
☆154 Updated last year
Open-Source-O1 / o1_Reasoning_Patterns_Study
☆102 Updated 7 months ago
sail-sg / sailcraft
🚢 Data Toolkit for Sailor Language Models
☆94 Updated 4 months ago
ryokamoi / llm-self-correction-papers
List of papers on Self-Correction of LLMs.
☆73 Updated 6 months ago
GAIR-NLP / AIME-Preview
☆71 Updated 4 months ago
GAIR-NLP / ReAlign
Reformatted Alignment
☆113 Updated 9 months ago
dwzhu-pku / LongEmbed
LongEmbed: Extending Embedding Models for Long Context Retrieval (EMNLP 2024)
☆138 Updated 8 months ago
clinicalml / co-llm
Co-LLM: Learning to Decode Collaboratively with Multiple Language Models
☆116 Updated last year
rxlqn / awesome-llm-self-reflection
augmented LLM with self reflection
☆129 Updated last year
declare-lab / flacuna
Flacuna was developed by fine-tuning Vicuna on Flan-mini, a comprehensive instruction collection encompassing various tasks. Vicuna is al…
☆111 Updated last year
reasoning-machines / prompt-lib
A set of utilities for running few-shot prompting experiments on large-language models
☆122 Updated last year
FranxYao / GPT-Bargaining
Code for arXiv 2023: Improving Language Model Negotiation with Self-Play and In-Context Learning from AI Feedback
☆207 Updated 2 years ago
ZongqianLi / 500xCompressor
[ACL 2025 Main] Repository for the paper: 500xCompressor: Generalized Prompt Compression for Large Language Models
☆40 Updated last month
abhika-m / FAVA
☆72 Updated last year
uclaml / Rephrase-and-Respond
Official repo of Rephrase-and-Respond: data, code, and evaluation
☆103 Updated 11 months ago
WooooDyy / Self-Polish
Codes for the EMNLP 2023 Findings paper "Self-Polish: Enhance Reasoning in Large Language Models via Problem Refining" by Zhiheng Xi, Sen…
☆30 Updated 2 years ago