carlini / yet-another-applied-llm-benchmark
A benchmark to evaluate language models on questions I've previously asked them to solve.
☆871 Updated this week
Related projects:
- Curate better data for LLMs ☆934 Updated 6 months ago
- Distilabel is a framework for synthetic data and AI feedback for engineers who need fast, reliable and scalable pipelines based on verifi… ☆1,396 Updated this week
- LightEval is a lightweight LLM evaluation suite that Hugging Face has been using internally with the recently released LLM data processin… ☆659 Updated this week
- Fine-tune mistral-7B on 3090s, a100s, h100s ☆701 Updated 11 months ago
- Automatically evaluate your LLMs in Google Colab ☆511 Updated 4 months ago
- ReFT: Representation Finetuning for Language Models ☆1,076 Updated 2 weeks ago
- Evaluate your LLM's response with Prometheus and GPT4 💯 ☆745 Updated last week
- System 2 Reasoning Link Collection ☆597 Updated this week
- Official implementation of "Samba: Simple Hybrid State Space Models for Efficient Unlimited Context Language Modeling" ☆774 Updated 3 weeks ago
- A lightweight, low-dependency, unified API to use all common reranking and cross-encoder models. ☆790 Updated last week
- Deep learning for dummies. All the practical details and useful utilities that go into working with real models. ☆662 Updated last month
- TextGrad: Automatic ''Differentiation'' via Text -- using large language models to backpropagate textual gradients. ☆1,542 Updated last week
- Doing simple retrieval from LLM models at various context lengths to measure accuracy ☆1,451 Updated last month
- ☆449 Updated 5 months ago
- Automated Design of Agentic Systems ☆846 Updated 3 weeks ago
- Freeing data processing from scripting madness by providing a set of platform-agnostic customizable pipeline processing blocks. ☆1,935 Updated last week
- ☆640 Updated this week
- DataDreamer: Prompt. Generate Synthetic Data. Train & Align Models. 🤖💤 ☆792 Updated last month
- Guide for fine-tuning Llama/Mistral/CodeLlama models and more ☆521 Updated 3 weeks ago
- Generate textbook-quality synthetic LLM pretraining data ☆479 Updated 11 months ago
- ☆1,161 Updated 3 weeks ago
- [ICML 2024] LLMCompiler: An LLM Compiler for Parallel Function Calling ☆1,378 Updated 2 months ago
- Extend existing LLMs way beyond the original training length with constant memory usage, without retraining ☆657 Updated 5 months ago
- Minimalistic large language model 3D-parallelism training ☆1,111 Updated this week
- Agentless🐱: an agentless approach to automatically solve software development problems ☆663 Updated 3 weeks ago
- Optimizing inference proxy for LLMs ☆406 Updated this week
- ☆442 Updated 3 weeks ago
- Best practices for distilling large language models. ☆370 Updated 7 months ago
- The code used to train and run inference with the ColPali architecture. ☆502 Updated this week