Jellyfish042 / uncheatable_eval

Evaluating LLMs with Dynamic Data

☆99

Alternatives and similar repositories for uncheatable_eval

Users who are interested in uncheatable_eval are comparing it to the libraries listed below.

dwzhu-pku / PoSE
Positional Skip-wise Training for Efficient Context Window Extension of LLMs to Extreme Length (ICLR 2024)
☆205 Updated last year
RWKV / RWKV-infctx-trainer
RWKV infctx trainer, for training arbitrary context sizes, to 10k and beyond!
☆147 Updated last year
imoneoi / multipack
Multipack distributed sampler for fast padding-free training of LLMs
☆202 Updated last year
GAIR-NLP / Entropy-ABF
Official implementation for 'Extending LLMs’ Context Window with 100 Samples'
☆81 Updated last year
Glaciohound / LM-Infinite
Implementation of NAACL 2024 Outstanding Paper "LM-Infinite: Simple On-the-Fly Length Generalization for Large Language Models"
☆152 Updated 8 months ago
wuhy68 / Parameter-Efficient-MoE
Parameter-Efficient Sparsity Crafting From Dense to Mixture-of-Experts for Instruction Tuning on General Tasks (EMNLP'24)
☆147 Updated last year
FasterDecoding / REST
REST: Retrieval-Based Speculative Decoding, NAACL 2024
☆212 Updated 2 months ago
sanyalsunny111 / LLM-Inheritune
This is the official repository for Inheritune.
☆115 Updated 9 months ago
SmerkyG / RWKV_Explained
RWKV, in easy-to-read code
☆72 Updated 8 months ago
Digitous / LLM-SLERP-Merge
Spherical merge of PyTorch/HF format language models with minimal feature loss.
☆141 Updated 2 years ago
thu-ml / low-bit-optimizers
Low-bit optimizers for PyTorch
☆133 Updated 2 years ago
itsnamgyu / block-transformer
Block Transformer: Global-to-Local Language Modeling for Fast Inference (NeurIPS 2024)
☆162 Updated 7 months ago
llm-random / llm-random
☆205 Updated last week
schwartz-lab-NLP / TOVA
Token Omission Via Attention
☆127 Updated last year
SmerkyG / gptcore
Fast modular code to create and train cutting edge LLMs
☆68 Updated last year
jaymody / speculative-sampling
Simple implementation of Speculative Sampling in NumPy for GPT-2.
☆98 Updated 2 years ago
whyNLP / LCKV
Layer-Condensed KV cache w/ 10 times larger batch size, fewer params and less computation. Dramatic speed up with better task performance…
☆157 Updated 7 months ago
HazyResearch / based
Code for exploring Based models from "Simple linear attention language models balance the recall-throughput tradeoff"
☆243 Updated 6 months ago
dust-tt / llama-ssp
Experiments on speculative sampling with Llama models
☆127 Updated 2 years ago
siyan-zhao / prepacking
The source code of our work "Prepacking: A Simple Method for Fast Prefilling and Increased Throughput in Large Language Models" [AISTATS …
☆60 Updated last year
Zyphra / Zyda_processing
☆39 Updated last year
liyucheng09 / llm-compressive
Longitudinal Evaluation of LLMs via Data Compression
☆33 Updated last year
BlinkDL / modded-nanogpt-rwkv
RWKV-7: Surpassing GPT
☆101 Updated last year
GeneZC / MiniMA
Code for paper titled "Towards the Law of Capacity Gap in Distilling Language Models"
☆102 Updated last year
princeton-pli / MeCo
Code for ICML 25 paper "Metadata Conditioning Accelerates Language Model Pre-training (MeCo)"
☆48 Updated 5 months ago
LLM360 / amber-train
Pre-training code for Amber 7B LLM
☆169 Updated last year
Joluck / RWKV-PEFT
☆157 Updated 2 weeks ago
kyegomez / phi-1
Plug-and-play implementation of "Textbooks Are All You Need", ready for training, inference, and dataset generation
☆74 Updated 2 years ago
kernelmachine / cbtm
Code repository for the c-BTM paper
☆108 Updated 2 years ago
mlfoundations / scaling
Language models scale reliably with over-training and on downstream tasks
☆100 Updated last year