VILA-Lab / ATLAS
A principled instruction benchmark on formulating effective queries and prompts for large language models (LLMs). Our paper: https://arxiv.org/abs/2312.16171
☆957 Updated 10 months ago
Alternatives and similar repositories for ATLAS:
Users who are interested in ATLAS are comparing it to the repositories listed below.
- Doing simple retrieval from LLMs at various context lengths to measure accuracy ☆1,829 Updated 8 months ago
- Benchmarking long-form factuality in large language models. Original code for our paper "Long-form factuality in large language models". ☆595 Updated last week
- FacTool: Factuality Detection in Generative AI ☆865 Updated 8 months ago
- ☆331 Updated 10 months ago
- Automated Evaluation of RAG Systems ☆579 Updated 3 weeks ago
- ☆436 Updated 6 months ago
- Automatically evaluate your LLMs in Google Colab ☆615 Updated 11 months ago
- ☆1,250 Updated 11 months ago
- [EMNLP'23, ACL'24] To speed up LLMs' inference and enhance LLMs' perception of key information, compress the prompt and KV-Cache, which ach… ☆5,042 Updated last month
- [ACL2023] We introduce LLM-Blender, an innovative ensembling framework to attain consistently superior performance by leveraging the dive… ☆935 Updated 6 months ago
- Using Tree-of-Thought Prompting to boost ChatGPT's reasoning ☆756 Updated last year
- [NeurIPS'23 Spotlight] "Mind2Web: Towards a Generalist Agent for the Web" ☆816 Updated 3 weeks ago
- Code for Quiet-STaR ☆730 Updated 8 months ago
- Streamlines and simplifies prompt design for both developers and non-technical users with a low-code approach. ☆1,051 Updated last month
- The papers are organized according to our survey: Evaluating Large Language Models: A Comprehensive Survey. ☆754 Updated 11 months ago
- LLMs can generate feedback on their work, use it to improve the output, and repeat this process iteratively. ☆679 Updated 6 months ago
- A lightweight, low-dependency, unified API to use all common reranking and cross-encoder models. ☆1,391 Updated 3 weeks ago
- A reading list on LLM-based Synthetic Data Generation 🔥 ☆1,246 Updated 2 months ago
- ☆867 Updated 5 months ago
- Code for 'LLM2Vec: Large Language Models Are Secretly Powerful Text Encoders' ☆1,493 Updated 3 months ago
- A unified evaluation framework for large language models ☆2,597 Updated this week
- HyDE: Precise Zero-Shot Dense Retrieval without Relevance Labels ☆530 Updated 4 months ago
- Open-source tool to visualise your RAG 🔮 ☆1,122 Updated 3 months ago
- This repository contains code and tooling for the Abacus.AI LLM Context Expansion project. Also included are evaluation scripts and bench… ☆586 Updated last year
- Efficient Retrieval Augmentation and Generation Framework ☆1,523 Updated 3 months ago
- Code for "Chameleon: Plug-and-Play Compositional Reasoning with Large Language Models". ☆1,129 Updated last year
- ☆945 Updated 8 months ago
- Evaluate your LLM's response with Prometheus and GPT4 💯 ☆911 Updated last month
- A joint community effort to create one central leaderboard for LLMs. ☆295 Updated 8 months ago
- A library for advanced large language model reasoning ☆2,099 Updated 2 weeks ago