bjoernpl / GermanBenchmark
A repository containing the code for translating popular LLM benchmarks to German.
☆22 · Updated last year
Related projects:
- A framework for few-shot evaluation of autoregressive language models. ☆13 · Updated 7 months ago
- Code repository for the c-BTM paper ☆105 · Updated 11 months ago
- Code for Zero-Shot Tokenizer Transfer ☆109 · Updated 2 months ago
- Scaling is a distributed training library and installable dependency designed to scale up neural networks, with a dedicated module for tr… ☆38 · Updated 3 weeks ago
- Manage scalable open LLM inference endpoints in Slurm clusters ☆217 · Updated 2 months ago
- [Data + code] ExpertQA: Expert-Curated Questions and Attributed Answers ☆118 · Updated 6 months ago
- BABILong is a benchmark for LLM evaluation using the needle-in-a-haystack approach. ☆139 · Updated last month
- Prune transformer layers ☆60 · Updated 3 months ago
- Multipack distributed sampler for fast padding-free training of LLMs ☆170 · Updated last month
- Minimum Bayes Risk Decoding for Hugging Face Transformers ☆51 · Updated 3 months ago
- A simple unified framework for evaluating LLMs ☆121 · Updated this week
- Experiments with generating open-source language model assistants ☆97 · Updated last year
- Experiments for efforts to train a new and improved T5 ☆76 · Updated 5 months ago
- Benchmarking LLMs with Challenging Tasks from Real Users ☆182 · Updated last month
- Functional Benchmarks and the Reasoning Gap ☆74 · Updated last month
- The Official Repository for "Bring Your Own Data! Self-Supervised Evaluation for Large Language Models" ☆109 · Updated 11 months ago
- Language models scale reliably with over-training and on downstream tasks ☆91 · Updated 5 months ago
- Small and Efficient Mathematical Reasoning LLMs ☆69 · Updated 7 months ago
- The GitHub repo for Goal Driven Discovery of Distributional Differences via Language Descriptions ☆68 · Updated last year
- Automated Identification of Redundant Layer Blocks for Pruning in Large Language Models ☆177 · Updated 4 months ago
- Cold Compress is a hackable, lightweight, and open-source toolkit for creating and benchmarking cache compression methods built on top of… ☆73 · Updated last month
- Repo for training MLMs, CLMs, or T5-type models on the OLM pretraining data, but it should work with any Hugging Face text dataset. ☆91 · Updated last year
- Minimal PyTorch implementation of BM25 (with sparse tensors) ☆82 · Updated 6 months ago
- A collection of datasets for language model pretraining including scripts for downloading, preprocessing, and sampling. ☆51 · Updated last month
- The official repo for "LLoCo: Learning Long Contexts Offline" ☆104 · Updated 3 months ago