LLM360 / Analysis360
Open Implementations of LLM Analyses
☆102 Updated 5 months ago
Alternatives and similar repositories for Analysis360:
Users who are interested in Analysis360 are comparing it to the libraries listed below.
- Data preparation code for Amber 7B LLM ☆86 Updated 10 months ago
- Pre-training code for CrystalCoder 7B LLM ☆55 Updated 10 months ago
- Codebase accompanying the Summary of a Haystack paper. ☆75 Updated 5 months ago
- Data preparation code for CrystalCoder 7B LLM ☆44 Updated 10 months ago
- ☆214 Updated 6 months ago
- LongEmbed: Extending Embedding Models for Long Context Retrieval (EMNLP 2024) ☆130 Updated 4 months ago
- Official implementation for 'Extending LLMs’ Context Window with 100 Samples' ☆75 Updated last year
- Small and Efficient Mathematical Reasoning LLMs ☆71 Updated last year
- Code and data for CoachLM, an automatic instruction revision approach for LLM instruction tuning. ☆60 Updated 11 months ago
- Code repo for "Agent Instructs Large Language Models to be General Zero-Shot Reasoners" ☆103 Updated 6 months ago
- Framework and toolkits for building and evaluating collaborative agents that can work together with humans. ☆68 Updated 3 weeks ago
- 🚢 Data Toolkit for Sailor Language Models ☆87 Updated 2 weeks ago
- ☆74 Updated last year
- Evaluating LLMs with fewer examples ☆147 Updated 11 months ago
- RepoQA: Evaluating Long-Context Code Understanding ☆105 Updated 4 months ago
- ☆119 Updated 5 months ago
- Lightweight demos for finetuning LLMs. Powered by 🤗 transformers and open-source datasets. ☆73 Updated 4 months ago
- Scalable Meta-Evaluation of LLMs as Evaluators ☆42 Updated last year
- Spherical Merge PyTorch/HF format Language Models with minimal feature loss. ☆117 Updated last year
- ☆74 Updated last year
- Code accompanying "How I learned to start worrying about prompt formatting". ☆102 Updated 5 months ago
- The code for the paper ROUTERBENCH: A Benchmark for Multi-LLM Routing System ☆108 Updated 9 months ago
- Official repo for the paper PHUDGE: Phi-3 as Scalable Judge. Evaluate your LLMs with or without custom rubric, reference answer, absolute… ☆49 Updated 8 months ago
- Code for "Democratizing Reasoning Ability: Tailored Learning from Large Language Model", EMNLP 2023 ☆33 Updated last year
- Benchmark baseline for retrieval qa applications ☆103 Updated 10 months ago
- ToolBench, an evaluation suite for LLM tool manipulation capabilities. ☆149 Updated last year
- FuseAI Project ☆83 Updated last month
- The official repo for "LLoCo: Learning Long Contexts Offline" ☆114 Updated 8 months ago
- Astraios: Parameter-Efficient Instruction Tuning Code Language Models ☆57 Updated 11 months ago